globo.com
robots.txt
Robots Exclusion Standard data for globo.com
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | globo.com |
Base Domain | globo.com |
Scan Status | Ok |
Last Scan | 2024-11-12T22:17:02+00:00 |
Next Scan | 2024-11-19T22:17:02+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-12T22:17:02+00:00 |
URL | https://globo.com/robots.txt |
Redirect | https://www.globo.com/robots.txt |
Redirect Domain | www.globo.com |
Redirect Base | globo.com |
Domain IPs | 186.192.83.12 |
Redirect IPs | 35.231.58.70 |
Response IP | 35.231.58.70 |
Found | Yes |
Hash | 51949b387de79e71a275ac9cb5d92bf54f79b40b5a73aba58b7dcbc80305902c |
SimHash | a10dd9408542 |
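The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The scanner's actual algorithm is not stated here, so the following is only a sketch, assuming SHA-256 over the raw response body. `fingerprint` is a hypothetical helper name, and the sample body is illustrative rather than the real file.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body.

    Assumption: SHA-256 over the raw bytes. The scanner that produced
    the Hash field may use a different algorithm or normalise the body first.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative body only, not the actual globo.com file.
sample = b"User-agent: *\nDisallow: /busca/\n"
digest = fingerprint(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

A digest like this changes completely when a single byte changes, so it is useful for detecting that a file differs between scans but not for measuring how much it differs.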
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /busca/ |
Disallow | /beta/ |
Disallow | /historico-home/ |
Disallow | *globo-cdn-src/* |
Disallow | /alt-a/ |
Disallow | /alt-b/ |
Disallow | /alt-c/ |
Disallow | /alt-d/ |
Disallow | /recomendado/ |
Disallow | /explore/ |
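One of the rules above, `*globo-cdn-src/*`, uses `*` wildcards, which plain prefix matching does not handle. The sketch below shows one way a crawler might test a path against this group, assuming the common interpretation in which `*` matches any run of characters and a trailing `$` anchors the end of the path. `pattern_to_regex` and `is_disallowed` are hypothetical helper names. Because the group has no Allow rules, longest-match precedence is not implemented.

```python
import re

# Disallow paths copied from the rule table above.
DISALLOW = [
    "/busca/", "/beta/", "/historico-home/", "*globo-cdn-src/*",
    "/alt-a/", "/alt-b/", "/alt-c/", "/alt-d/",
    "/recomendado/", "/explore/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches from the start of `path`."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


for candidate in ["/", "/busca/?q=futebol", "/static/globo-cdn-src/app.js", "/beta"]:
    print(candidate, is_disallowed(candidate))
```

Note that `/beta` without the trailing slash is not matched by `/beta/`, since the rule is a prefix of the path and not the other way around.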
Other Records
Field | Value |
---|---|
sitemap | http://www.globo.com/sitemap-image.xml |
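The sitemap record sits outside any user-agent group. As a minimal sketch, Python's standard `urllib.robotparser` can read it back from robots.txt text. The lines below are reconstructed from the tables in this report, not copied from the live file, and only one Disallow rule is included for brevity. `sitemaps_from` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from this report; the real file may differ in order and casing.
LINES = [
    "User-agent: *",
    "Disallow: /busca/",
    "Sitemap: http://www.globo.com/sitemap-image.xml",
]


def sitemaps_from(lines):
    """Parse robots.txt lines and return the declared sitemap URLs."""
    parser = RobotFileParser()
    parser.parse(lines)
    # site_maps() (Python 3.8+) returns None when no sitemap is declared.
    return parser.site_maps() or []


print(sitemaps_from(LINES))
```

`urllib.robotparser` matches Disallow paths by prefix only, so it is a reasonable fit for reading the sitemap record but would not interpret the wildcard rule in the group above.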