globo.com
robots.txt

Robots Exclusion Standard data for globo.com

Resource Scan

Scan Details

Site Domain globo.com
Base Domain globo.com
Scan Status Ok
Last Scan 2024-04-30T17:37:33+00:00
Next Scan 2024-05-07T17:37:33+00:00

Last Scan

Scanned 2024-04-30T17:37:33+00:00
URL https://globo.com/robots.txt
Redirect https://www.globo.com/robots.txt
Redirect Domain www.globo.com
Redirect Base globo.com
Domain IPs 186.192.83.12
Redirect IPs 34.107.153.189
Response IP 34.107.153.189
Found Yes
Hash 51949b387de79e71a275ac9cb5d92bf54f79b40b5a73aba58b7dcbc80305902c
SimHash a10dd9408542
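
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a content fingerprint of this kind could be computed as follows. The function name `fingerprint` and the sample body are hypothetical, and the digest of the sample will not match the value in the report.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of a fetched robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real file contents.
sample = b"User-agent: *\nDisallow: /busca/\n"
digest = fingerprint(sample)
```

Comparing the digest from one scan with the digest from the next is a cheap way to tell whether the file changed between scans.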

Groups

*

Rule Path
Disallow /busca/
Disallow /beta/
Disallow /historico-home/
Disallow *globo-cdn-src/*
Disallow /alt-a/
Disallow /alt-b/
Disallow /alt-c/
Disallow /alt-d/
Disallow /recomendado/
Disallow /explore/

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

Other Records

Field Value
sitemap http://www.globo.com/sitemap-image.xml
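
To show how the groups and the sitemap record above are applied by a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstruction from the tables in this report, not the verbatim file, so its ordering and spacing are assumptions. The helper name `allowed` and the crawler names used in the checks are illustrative. The wildcard rule `*globo-cdn-src/*` is left out because the standard-library parser matches rules by path prefix and, as far as I know, does not interpret `*` inside a path.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report (assumed layout).
ROBOTS_TXT = """\
User-agent: *
Disallow: /busca/
Disallow: /beta/
Disallow: /historico-home/
Disallow: /alt-a/
Disallow: /alt-b/
Disallow: /alt-c/
Disallow: /alt-d/
Disallow: /recomendado/
Disallow: /explore/

User-agent: ccbot
Disallow: /

User-agent: gptbot
Disallow: /

User-agent: google-extended
Disallow: /

Sitemap: http://www.globo.com/sitemap-image.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Check whether `agent` may fetch `path` on www.globo.com."""
    return parser.can_fetch(agent, "https://www.globo.com" + path)


# The three named crawlers are blocked from the whole site.
blocked_everywhere = [allowed(a, "/") for a in ("CCBot", "GPTBot", "Google-Extended")]

# Any other crawler falls back to the "*" group.
home_ok = allowed("Googlebot", "/")
search_ok = allowed("Googlebot", "/busca/teste")

# Sitemap records are exposed separately (Python 3.8+).
sitemaps = parser.site_maps()
```

User-agent matching in this parser is case-insensitive, so `GPTBot` matches the `gptbot` group. A crawler that supports wildcard paths would additionally honor the `*globo-cdn-src/*` rule, which this sketch does not model.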

Comments

  • robots.txt