rainews.it
robots.txt

Robots Exclusion Standard data for rainews.it

Resource Scan

Scan Details

Site Domain rainews.it
Base Domain rainews.it
Scan Status Ok
Last Scan 2024-09-21T09:25:18+00:00
Next Scan 2024-09-28T09:25:18+00:00

Last Scan

Scanned 2024-09-21T09:25:18+00:00
URL https://rainews.it/robots.txt
Redirect https://www.rainews.it/robots.txt
Redirect Domain www.rainews.it
Redirect Base rainews.it
Domain IPs 212.162.68.91
Redirect IPs 23.36.49.139
Response IP 23.54.57.168
Found Yes
Hash 418d234bac42f8d5d852b52e18c6945177fd6fdf7dab4224aa6c7969dff3f41a
SimHash 7055995d2eb5

Groups

*

Rule Path
Allow /
Disallow /app/
Disallow /webview/
Disallow /programmi/
Disallow /collezioni/
Disallow /atomatic/
Disallow /StatisticheProxy/
Disallow /live/
Disallow /iframe/
Disallow /foto/
Allow /tgr/*/foto/
Disallow /*index.html?
Disallow /*chi-siamo.html?
Disallow /*live_frame.html
Disallow /tag?
Disallow /*?print$
Disallow */related-tag-iframe.html
Disallow /*?refresh_ce=$
Disallow /*?nxtep
Disallow /*.json$
Disallow /tgr/?set
Disallow /tgr/amp/
Disallow /dl/RaiTV/
Disallow /dl/rainews/redazioni/
Disallow /dl/rainews/ricerca.html
Disallow /dl/rainews/articoli/
Disallow /dl/rainews/media/
Disallow /dl/rainews/speciali/
Disallow /dl/rainews/live/
Disallow /dl/rainews/elezioni2021/
Disallow /dl/rainews/elezioni2020/
Disallow /dl/rainews/includes/
Disallow /dl/rainews/TGR/
Disallow /dl/rainews/json/
Disallow /dl/rai24/
Allow /dl/rai24/assets/images/*.jpg$
Allow /dl/rai24/assets/images/*.png$
Disallow /dl/Report/
Disallow /dl/grr/
Disallow /dl/bilancio
Disallow /dl/portaleRadio
Disallow /dl/portali/site/
Disallow /dl/js/
Disallow /dl/objects/
Disallow /dl/raisport/
Disallow /dl/tg3/
Disallow /dl/video/
Disallow /dl/portale/
Disallow /dl/docs/
Disallow /dl/doc/
Disallow /dl/analytics/
Disallow /dl/siti/
Disallow /dl/test/
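
The rules above mix plain prefixes, `*` wildcards, and `$` end anchors, so it can be unclear which rule wins for a given URL. The following is a minimal sketch of the longest-match evaluation described in RFC 9309, assuming a crawler that follows that specification. The function names `pattern_to_regex` and `is_allowed` are hypothetical, and `RULES` is only an excerpt of the group listed above.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be fetched under `rules`.

    The longest matching pattern wins; on a tie, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for verb, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = verb.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

# Excerpt of the '*' group from the report (not the full rule list).
RULES = [
    ("Allow", "/"),
    ("Disallow", "/foto/"),
    ("Allow", "/tgr/*/foto/"),
    ("Disallow", "/*?print$"),
    ("Disallow", "/*.json$"),
    ("Disallow", "/dl/rai24/"),
    ("Allow", "/dl/rai24/assets/images/*.jpg$"),
]

if __name__ == "__main__":
    for sample in ["/foto/gallery.html",
                   "/tgr/lazio/foto/gallery.html",
                   "/dl/rai24/assets/images/logo.jpg",
                   "/dl/rai24/page.html"]:
        print(sample, is_allowed(sample, RULES))
```

The sample paths are invented for illustration. Pattern length is used here as the measure of specificity, which is how RFC 9309 describes "most specific match"; individual crawlers may differ in detail.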

twitterbot
facebookbot

Rule Path
Allow /*?nxtep

amazonbot
anthropic-ai
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
facebookbot
google-extended
gptbot
magpie-crawler
newsnow
news-please
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
scrapy
turnitinbot

Rule Path
Disallow /
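
Note that `facebookbot` appears in two groups in this report. As a rough sketch of how a crawler might choose its rules, assuming RFC 9309 behaviour (case-insensitive product-token matching, rules from every matching group combined, and `*` used only when no named group matches), group selection could look like the code below. `GROUPS` is an abbreviated, hand-copied excerpt of the report and `rules_for` is a hypothetical helper name.

```python
# Each entry: (list of user-agent tokens, list of (verb, path) rules).
# Abbreviated from the report; the real groups contain more entries.
GROUPS = [
    (["*"], [("Allow", "/"), ("Disallow", "/foto/")]),
    (["twitterbot", "facebookbot"], [("Allow", "/*?nxtep")]),
    (["gptbot", "claudebot", "ccbot", "facebookbot"], [("Disallow", "/")]),
]

def rules_for(product_token: str, groups=GROUPS) -> list[tuple[str, str]]:
    """Collect the rules that apply to one crawler.

    Rules from all groups naming the crawler are combined in file order.
    The '*' group is a fallback, used only if no group names the crawler.
    """
    token = product_token.lower()
    matched: list[tuple[str, str]] = []
    fallback: list[tuple[str, str]] = []
    for agents, rules in groups:
        names = [agent.lower() for agent in agents]
        if token in names:
            matched.extend(rules)
        elif "*" in names:
            fallback.extend(rules)
    return matched if matched else fallback

if __name__ == "__main__":
    for bot in ["FacebookBot", "GPTBot", "twitterbot", "Googlebot"]:
        print(bot, rules_for(bot))
```

Under this reading, `facebookbot` ends up with both `Allow /*?nxtep` and `Disallow /`, while a crawler that is not named anywhere (for example `Googlebot`) is governed by the `*` group alone.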

Other Records

Field Value
sitemap https://www.rainews.it/sitemap.xml
sitemap https://www.rainews.it/dl/rainews/sitemap/fast/sitemap.fast.xml
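
Sitemap records are not tied to any user-agent group, so they can be collected with a single pass over the file. The snippet below is an illustrative sketch only: `extract_sitemaps` is a hypothetical helper, and `SAMPLE` is a short text reconstructed from this report rather than the raw file itself.

```python
def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return the URLs of all 'Sitemap:' records, in file order."""
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments, then split "field: value" on the first colon.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls

# Reconstructed excerpt; the layout of the real file is an assumption.
SAMPLE = """\
# SITEMAPS
Sitemap: https://www.rainews.it/sitemap.xml
Sitemap: https://www.rainews.it/dl/rainews/sitemap/fast/sitemap.fast.xml

User-agent: *
Disallow: /app/
"""

if __name__ == "__main__":
    for url in extract_sitemaps(SAMPLE):
        print(url)
```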

Comments

  • Rai RadioTelevisione Italiana content is made available for your personal, non-commercial use subject to our Terms of Service.
  • Use of any device, tool, or process designed to data mine or scrape the content using automated means is prohibited without prior written permission from Rai RadioTelevisione Italiana. Prohibited uses include but are not limited to:
  • (1) text and data mining activities under Art. 4 of the EU Directive on Copyright in the Digital Single Market;
  • (2) the development of any software, machine learning, artificial intelligence (AI), and/or large language models (LLMs);
  • (3) creating or providing archived or cached data sets containing our content to others; and/or
  • (4) any commercial purposes.
  • Contact https://www.rai.it/centroassistenza
  • FOLDERS UNDER DL
  • SITEMAPS
  • OTHER USER AGENTS
  • AI BOTS