g1.globo.com
robots.txt
Robots Exclusion Standard data for g1.globo.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | g1.globo.com |
Base Domain | globo.com |
Scan Status | Ok |
Last Scan | 2024-06-04T18:16:37+00:00 |
Next Scan | 2024-06-18T18:16:37+00:00 |
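The Last Scan and Next Scan timestamps above are exactly 14 days apart. A minimal sketch of that arithmetic follows, assuming a fixed two-week rescan interval. The interval is inferred from these two values only, not stated by the scan, and `next_scan` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Assumption: the scanner revisits every 14 days (inferred from the two
# timestamps in the table, not documented by the scan itself).
RESCAN_INTERVAL = timedelta(days=14)


def next_scan(last_scan_iso: str) -> str:
    """Return the ISO 8601 timestamp one rescan interval after last_scan_iso."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + RESCAN_INTERVAL).isoformat()


print(next_scan("2024-06-04T18:16:37+00:00"))
```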
Last Scan
Field | Value |
---|---|
Scanned | 2024-06-04T18:16:37+00:00 |
URL | https://g1.globo.com/robots.txt |
Domain IPs | 34.149.229.210 |
Response IP | 34.149.229.210 |
Found | Yes |
Hash | cd65bafc4305f27d35cb182cf2314785ba009bb1a6873757b1a19dbed253a6f8 |
SimHash | 6c558d5419b3 |
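The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. A minimal sketch of fingerprinting a fetched robots.txt body under that assumption follows. `body_digest` is a hypothetical helper, and the exact bytes the scanner hashes (raw response or normalized text) are unknown, so a locally computed digest may not reproduce the value in the table.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner uses SHA-256 over the response bytes; the
    source only shows a 64-hex-digit value and does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Comparing digests from two scans is a cheap way to detect any change.
old = body_digest(b"User-agent: *\nDisallow: /beta/\n")
new = body_digest(b"User-agent: *\nDisallow: /beta/\nDisallow: /zeta/\n")
print("changed" if old != new else "unchanged")
```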
Groups
*
Rule | Path |
---|---|
Disallow | /jornalismo/g1/ |
Disallow | /_ssi/ |
Disallow | /teste-*.html$ |
Disallow | /beta/ |
Disallow | /componentes/ |
Disallow | /busca/* |
Disallow | /globo-news/jornal-globo-news/videos/v/globo-news-ao-vivo/61910/ |
Disallow | /globonews/playlist/globonews-ao-vivo.ghtml |
Disallow | *globo-cdn-src/* |
Disallow | /zeta/ |
Disallow | /content-aggregator/ |
Disallow | /jogos-app/ |
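Several of the rules above use the `*` wildcard and the `$` end-of-path anchor. A minimal sketch of how a crawler might evaluate them follows, assuming the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end of the path, and every rule is otherwise matched from the start of the path. `pattern_to_regex` and `is_disallowed` are hypothetical helper names, and since this group has no Allow rules, rule precedence is not modeled.

```python
import re

# Disallow paths from the "*" group, copied from the table above.
DISALLOW = [
    "/jornalismo/g1/",
    "/_ssi/",
    "/teste-*.html$",
    "/beta/",
    "/componentes/",
    "/busca/*",
    "/globo-news/jornal-globo-news/videos/v/globo-news-ao-vivo/61910/",
    "/globonews/playlist/globonews-ao-vivo.ghtml",
    "*globo-cdn-src/*",
    "/zeta/",
    "/content-aggregator/",
    "/jogos-app/",
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


for candidate in ["/beta/index.html", "/teste-layout.html", "/economia/"]:
    print(candidate, is_disallowed(candidate))
```

Note that `*globo-cdn-src/*` begins with a wildcard, so under this interpretation it matches that segment anywhere in the path, not only at the root.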
Other Records
Field | Value |
---|---|
sitemap | https://g1.globo.com/sitemap/g1/sitemap.xml |
sitemap | https://g1.globo.com/sitemap/Apuração/g1/sitemap.xml |
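The second sitemap URL contains non-ASCII characters in its path (`Apuração`). A minimal sketch of percent-encoding such a path before requesting it follows, assuming the server expects UTF-8 percent-encoding, which is the usual convention but is not confirmed by the scan. `encode_url_path` is a hypothetical helper name.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url_path(url: str) -> str:
    """Percent-encode non-ASCII characters in the path component of a URL.

    "/" and "%" are kept as-is so separators and any existing escapes
    are not encoded a second time.
    """
    parts = urlsplit(url)
    encoded_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=encoded_path))


print(encode_url_path("https://g1.globo.com/sitemap/Apuração/g1/sitemap.xml"))
```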