noticias.r7.com
robots.txt

Robots Exclusion Standard data for noticias.r7.com

Resource Scan

Scan Details

Site Domain noticias.r7.com
Base Domain r7.com
Scan Status Ok
Last Scan 2024-11-05T18:59:24+00:00
Next Scan 2024-11-12T18:59:24+00:00

Last Scan

Scanned 2024-11-05T18:59:24+00:00
URL https://noticias.r7.com/robots.txt
Domain IPs 184.87.193.83, 184.87.193.92, 2600:1413:b000:13::b857:c184, 2600:1413:b000:13::b857:c196
Response IP 23.45.207.166
Found Yes
Hash 09bdf2d2e083bc4f814b9924a278ea8291d954d4e2a3e639b6290b89e88fff96
SimHash f9147844ffd3

Groups

proximic

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

twitterbot

Rule Path
Allow /

semrushbot-sa

Rule Path
Disallow /

starkbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

voluumdsp-content-bot

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

*

Rule Path
Allow /
Disallow /guidebook/*
Disallow /guidebook-fazenda/*
Disallow /validacao/*
Disallow /validacao-de-templates/*
Disallow /embeds/
Disallow /index2.html
Disallow /supprod2022/*
Disallow /suporte/*
Disallow /media/*
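
The `*` group combines a blanket Allow with narrower Disallow paths, so the outcome for a given URL depends on how a crawler resolves overlapping rules. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific pattern wins, and Allow wins a tie) and treats `*` as a wildcard. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and individual crawlers may resolve these rules differently.

```python
import re

# Rules of the "*" group, copied from the listing above.
WILDCARD_GROUP = [
    ("allow", "/"),
    ("disallow", "/guidebook/*"),
    ("disallow", "/guidebook-fazenda/*"),
    ("disallow", "/validacao/*"),
    ("disallow", "/validacao-de-templates/*"),
    ("disallow", "/embeds/"),
    ("disallow", "/index2.html"),
    ("disallow", "/supprod2022/*"),
    ("disallow", "/suporte/*"),
    ("disallow", "/media/*"),
]


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")  # a trailing "$" pins the end of the path
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=WILDCARD_GROUP):
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed


if __name__ == "__main__":
    # Example paths are made up for illustration.
    for path in ["/brasil/some-article", "/guidebook/page", "/embeds", "/index2.html"]:
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Under these assumptions, `/embeds` (no trailing slash) stays allowed because the Disallow pattern is `/embeds/`, while anything below `/guidebook/` is blocked because that pattern is longer than the blanket `Allow /`.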

Other Records

Field Value
sitemap https://noticias.r7.com/arc/outboundfeeds/sitemap-index/
sitemap https://noticias.r7.com/arc/outboundfeeds/sitemap-news-index/
sitemap https://noticias.r7.com/arc/outboundfeeds/sitemap-section-index/
sitemap https://noticias.r7.com/arc/outboundfeeds/sitemap-index-byday/
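
The agent-level groups and sitemap records above can also be checked with Python's standard `urllib.robotparser`. The sketch below feeds it a partial reconstruction of the file: only a few groups are included, and the exact casing, ordering, and spacing of the real robots.txt are assumptions, since the scan shows parsed values and not the raw text. Note that `urllib.robotparser` applies the first matching rule and does not expand `*` inside paths, so it is used here only for whole-site checks per agent and for listing sitemaps (`site_maps()` needs Python 3.8 or later).

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the scanned file (an assumption, not the raw text).
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: twitterbot
Allow: /

User-agent: *
Allow: /

Sitemap: https://noticias.r7.com/arc/outboundfeeds/sitemap-index/
Sitemap: https://noticias.r7.com/arc/outboundfeeds/sitemap-news-index/
Sitemap: https://noticias.r7.com/arc/outboundfeeds/sitemap-section-index/
Sitemap: https://noticias.r7.com/arc/outboundfeeds/sitemap-index-byday/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse in memory, no network request

# Hypothetical article URL used only to exercise the agent-level rules.
url = "https://noticias.r7.com/brasil/some-article"

if __name__ == "__main__":
    for agent in ["GPTBot", "Google-Extended", "Twitterbot"]:
        print(agent, parser.can_fetch(agent, url))
    for sitemap in parser.site_maps() or []:
        print(sitemap)
```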