Robots Exclusion Standard data for g1.com.br
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | g1.com.br |
Base Domain | g1.com.br |
Scan Status | Ok |
Last Scan | 2024-05-22T16:13:29+00:00 |
Next Scan | 2024-05-29T16:13:29+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-05-22T16:13:29+00:00 |
URL | https://g1.com.br/robots.txt |
Redirect | https://g1.globo.com/robots.txt |
Redirect Domain | g1.globo.com |
Redirect Base | globo.com |
Domain IPs | 186.192.81.143 |
Redirect IPs | 34.149.229.210 |
Response IP | 34.149.229.210 |
Found | Yes |
Hash | cd65bafc4305f27d35cb182cf2314785ba009bb1a6873757b1a19dbed253a6f8 |
SimHash | 6c558d5419b3 |
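The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The scan does not state which algorithm it uses or which bytes it hashes, so the following is only a sketch of how such a fingerprint could be used to detect changes between scans, assuming SHA-256 over the raw response body. The function names `content_fingerprint` and `has_changed` are hypothetical.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored hex digest against the digest of a freshly fetched body."""
    return content_fingerprint(body) != previous_hash.lower()


# Illustrative body only; this is not the actual file served by g1.globo.com.
sample = b"User-agent: *\nDisallow: /beta/\n"
stored = content_fingerprint(sample)
print(has_changed(stored, sample))
print(has_changed(stored, sample + b"Disallow: /zeta/\n"))
```

An exact digest changes on any byte difference, including whitespace, which is presumably why the scan also records a SimHash for approximate comparison.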
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /jornalismo/g1/ |
Disallow | /_ssi/ |
Disallow | /teste-*.html$ |
Disallow | /beta/ |
Disallow | /componentes/ |
Disallow | /busca/* |
Disallow | /globo-news/jornal-globo-news/videos/v/globo-news-ao-vivo/61910/ |
Disallow | /globonews/playlist/globonews-ao-vivo.ghtml |
Disallow | *globo-cdn-src/* |
Disallow | /zeta/ |
Disallow | /content-aggregator/ |
Disallow | /jogos-app/ |
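Several of the Disallow paths in the table above use the `*` wildcard and the `$` end-of-path anchor. As a rough illustration of how a crawler might evaluate them, the sketch below translates each pattern into a regular expression, following the matching semantics described in RFC 9309 (`*` matches any sequence of characters, a trailing `$` anchors the end of the path, and matching starts at the beginning of the path). The helper names are hypothetical, and because this group has no Allow rules, longest-match precedence is not implemented.

```python
import re

# Disallow patterns copied from the table above.
DISALLOW = [
    "/jornalismo/g1/",
    "/_ssi/",
    "/teste-*.html$",
    "/beta/",
    "/componentes/",
    "/busca/*",
    "/globo-news/jornal-globo-news/videos/v/globo-news-ao-vivo/61910/",
    "/globonews/playlist/globonews-ao-vivo.ghtml",
    "*globo-cdn-src/*",
    "/zeta/",
    "/content-aggregator/",
    "/jogos-app/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule.match(path) for rule in RULES)


for candidate in ("/beta/index.html", "/teste-abc.html", "/economia/noticia.ghtml"):
    print(candidate, is_disallowed(candidate))
```

Note that `/teste-*.html$` only blocks paths that end in `.html`, so the same path with a query string appended would not match under these semantics.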
Other Records
Field | Value |
---|---|
sitemap | https://g1.globo.com/sitemap/g1/sitemap.xml |
sitemap | https://g1.globo.com/sitemap/Apuração/g1/sitemap.xml |
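The second sitemap URL contains non-ASCII characters in its path ("Apuração"). HTTP clients generally need such a path percent-encoded as UTF-8 before requesting it. The sketch below shows one way to do that with the Python standard library; the function name `encode_sitemap_url` is hypothetical, and Unicode NFC normalization is an assumption about how the server expects the path to be composed.

```python
import unicodedata
from urllib.parse import quote, urlsplit, urlunsplit


def encode_sitemap_url(url: str) -> str:
    """Percent-encode non-ASCII characters in the path of a URL."""
    parts = urlsplit(unicodedata.normalize("NFC", url))
    # Keep "/" as a separator and "%" so existing escapes are not double-encoded.
    path = quote(parts.path, safe="/%")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(encode_sitemap_url("https://g1.globo.com/sitemap/Apura\u00e7\u00e3o/g1/sitemap.xml"))
```

A URL whose path is already plain ASCII, such as the first sitemap entry, passes through unchanged.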