romagnaoggi.it
robots.txt

Robots Exclusion Standard data for romagnaoggi.it

Resource Scan

Scan Details

Site Domain romagnaoggi.it
Base Domain romagnaoggi.it
Scan Status Ok
Last Scan 2024-11-12T04:05:54+00:00
Next Scan 2024-11-19T04:05:54+00:00

Last Scan

Scanned 2024-11-12T04:05:54+00:00
URL https://romagnaoggi.it/robots.txt
Redirect https://www.romagnaoggi.it/robots.txt
Redirect Domain www.romagnaoggi.it
Redirect Base romagnaoggi.it
Domain IPs 109.168.105.232, 109.168.105.233, 195.231.59.141, 46.254.34.118, 52.144.69.196, 52.144.69.197, 95.110.224.15
Redirect IPs 109.168.105.232, 109.168.105.233, 195.231.59.141, 46.254.34.118, 52.144.69.196, 52.144.69.197, 95.110.224.15
Response IP 46.254.34.118
Found Yes
Hash 6be1b1bc00e7f945833a66afbb8171c0b57fccc50a1a77f79321d67e45f36fec
SimHash ff49815147b1
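
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The scan does not name the algorithm, so SHA-256 is an assumption here. Under that assumption, a minimal sketch of detecting a changed robots.txt between two scans could look like this (the function names are hypothetical):

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """True when the freshly fetched body no longer matches the stored digest."""
    return body_digest(body) != previous_digest


# Illustrative body only; this is not the real file content.
sample = b"User-agent: *\nAllow: /\n"
digest = body_digest(sample)
print(len(digest))                  # 64 hex digits, like the Hash field
print(has_changed(digest, sample))  # False: same bytes, same digest
```

An exact digest only says whether the bytes differ. The shorter SimHash field presumably serves a different purpose (estimating how similar two versions are), which this sketch does not attempt.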

Groups

*
googlebot

Rule Path
Allow /
Disallow /~vda/
Disallow /~shared/do/
Disallow /~shared/cgi-bin/
Disallow /~do/
Disallow /~cgi-bin/
Disallow /do/
Disallow /cgi-bin/
Disallow /~test/
Disallow /~api/
Disallow /~ajax/
Disallow /~otp/
Disallow /~pixel/
Disallow /~empty/
Disallow /captcha/
Disallow /form/
Disallow /signup/
Disallow /commento/
Disallow /user/login/
Disallow /user/logout/
Disallow /user/sso/
Disallow /user/oauth/
Disallow /user/activate/
Disallow /user/reset/
Disallow /user/delete/
Disallow /user/unsubscribe/
Disallow /user/contents/
Disallow /user/news/
Disallow /user/relation/
Disallow /user/subscription/
Disallow /user/edit/
Disallow /user/self/
Disallow /~shared/styles/
Disallow /styles/
Disallow /~shared/scripts/
Disallow /scripts/
Disallow /medias/
Disallow /uploads/
Disallow /sf/
Allow /~shared/do/api/google/
Allow /~shared/do/api/google-newsstand/
Allow /~shared/do/api/amazon/
Allow /~shared/do/api/facebook/
Allow /~shared/do/api/samsung/
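
The group above mixes a blanket `Allow /` with narrower Disallow rules and then re-allows a few paths under `/~shared/do/api/`. Under RFC 9309 the most specific (longest) matching rule wins, and Allow wins a tie. The sketch below illustrates that resolution on an abbreviated subset of the listed rules. It ignores the `*` and `$` wildcards, which none of the listed paths use, and `is_allowed` is a hypothetical helper, not the scanner's own matcher.

```python
# Abbreviated subset of the rules listed for the `*` / `googlebot` group.
RULES = [
    ("allow", "/"),
    ("disallow", "/~shared/do/"),
    ("disallow", "/user/login/"),
    ("disallow", "/medias/"),
    ("allow", "/~shared/do/api/google/"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching prefix wins; on equal length, allow beats disallow."""
    best_len = -1
    allowed = True  # a path that matches no rule may be crawled
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


for p in ("/cronaca/", "/user/login/", "/~shared/do/x",
          "/~shared/do/api/google/feed.xml"):
    print(p, is_allowed(p))
```

A parser that stops at the first matching rule would behave differently on this group, since `Allow /` is listed first and matches every path.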

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

awariorssbot
awariosmartbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

news-please

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

peer39_crawler
peer39_crawler/1.0

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /
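
Each named group above carries a single `Disallow /`, so a crawler that identifies itself by one of those names is excluded from the whole site, while any other crawler falls back to the `*` group. The sketch below shows that group selection with Python's standard `urllib.robotparser`. The robots.txt text is a hand-written, heavily abbreviated reconstruction (the scan does not show the raw file), and `ExampleBot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction; the real file has many more groups and rules.
ROBOTS_TXT = """\
User-agent: *
User-agent: googlebot
Disallow: /user/login/

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.romagnaoggi.it"

# User-agent matching is case-insensitive, so "GPTBot" selects the gptbot group.
print(parser.can_fetch("GPTBot", BASE + "/"))                 # False
# An unlisted crawler falls back to the `*` group.
print(parser.can_fetch("ExampleBot", BASE + "/"))             # True
print(parser.can_fetch("ExampleBot", BASE + "/user/login/"))  # False
```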

Other Records

Field Value
crawl-delay 3
sitemap https://www.romagnaoggi.it/sitemaps/sitemap.xml
sitemap https://www.romagnaoggi.it/sitemaps/sitemap_news.xml
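
The scan reports `crawl-delay` and `sitemap` as records outside the user-agent groups. A minimal sketch of collecting such records from raw robots.txt text follows. The sample text is reconstructed from the values above, and `other_records` is a hypothetical helper, not part of any library.

```python
def other_records(robots_txt: str) -> dict:
    """Collect crawl-delay and sitemap records, wherever they appear."""
    found = {"crawl-delay": None, "sitemap": []}
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs keep their "https:" part.
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "crawl-delay":
            found["crawl-delay"] = float(value)
        elif field == "sitemap":
            found["sitemap"].append(value)
    return found


# Reconstructed from the records listed above; not the verbatim file.
SAMPLE = """\
Crawl-delay: 3
Sitemap: https://www.romagnaoggi.it/sitemaps/sitemap.xml
Sitemap: https://www.romagnaoggi.it/sitemaps/sitemap_news.xml
"""

records = other_records(SAMPLE)
print(records["crawl-delay"])   # 3.0
print(len(records["sitemap"]))  # 2
```

A polite crawler could sleep for `records["crawl-delay"]` seconds between requests. Note that `crawl-delay` is a non-standard extension that is not defined in RFC 9309, and some crawlers ignore it.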

Comments

  • COPYRIGHT NOTICE. The contents of this website are available only for personal, non-commercial use. Use of any kind of device, tool, or process designed to data mine or scrape the content using automated means is prohibited without prior written permission from Citynews SpA. Prohibited uses include but are not limited to:
  • (1) text and data mining activities under Art. 4 of the EU Directive on Copyright in the Digital Single Market;
  • (2) the development of any software, machine learning, artificial intelligence (AI), and/or large language models (LLMs);
  • (3) creating or providing archived or cached data sets containing our content to others; and/or
  • (4) any commercial purposes.
  • Contact https://citynews.it for licensing.

Warnings

  • `host` is not a known field.