ge.globo
robots.txt

Robots Exclusion Standard data for ge.globo

Resource Scan

Scan Details

Site Domain ge.globo
Base Domain ge.globo
Scan Status Ok
Last Scan 2024-05-17T10:41:55+00:00
Next Scan 2024-05-24T10:41:55+00:00

Last Scan

Scanned 2024-05-17T10:41:55+00:00
URL https://ge.globo/robots.txt
Redirect https://ge.globo.com/robots.txt
Redirect Domain ge.globo.com
Redirect Base globo.com
Domain IPs 186.192.81.25
Redirect IPs 35.227.102.207
Response IP 35.227.102.207
Found Yes
Hash 684effb9e6f1b81de550858738bd368544c1cdfc897ce4876079581b8c548dc0
SimHash 2004d90ac5f1

Groups

*

Rule Path
Disallow /publieditorial
Disallow /eu-atleta/zcalendario/calendario.html
Disallow /servico
Disallow /dynamo
Disallow /beta
Disallow *globo-cdn-src/*

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

Other Records

Field Value
sitemap https://ge.globo.com/sitemap/ge/sitemap.xml
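
The groups and sitemap record above can be checked programmatically. The following is a minimal sketch, assuming the scanned file corresponds to the robots.txt text reconstructed below from the listed rules (the original file's exact layout, casing, and comments are not shown in this scan). It uses Python's standard `urllib.robotparser`; the `allowed` helper, the `ExampleBot` agent name, and the `/futebol/` path are hypothetical and serve only as illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in the scan; the real file's
# formatting may differ (assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /publieditorial
Disallow: /eu-atleta/zcalendario/calendario.html
Disallow: /servico
Disallow: /dynamo
Disallow: /beta
Disallow: *globo-cdn-src/*

User-agent: ccbot
Disallow: /

User-agent: gptbot
Disallow: /

User-agent: google-extended
Disallow: /

Sitemap: https://ge.globo.com/sitemap/ge/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` on the redirect domain?"""
    return parser.can_fetch(agent, "https://ge.globo.com" + path)


if __name__ == "__main__":
    # Agents with their own group are blocked from the whole site.
    for agent in ("CCBot", "GPTBot", "Google-Extended"):
        print(agent, allowed(agent, "/"))
    # Any other agent falls back to the "*" group.
    print("ExampleBot", allowed("ExampleBot", "/beta"))
    print("ExampleBot", allowed("ExampleBot", "/futebol/"))
    print(parser.site_maps())
```

Note that the standard-library parser matches rule paths as plain prefixes and does not expand wildcards, so the `*globo-cdn-src/*` rule is treated literally in this sketch. A crawler that supports wildcard matching would interpret that rule more broadly.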