cbn.globo.com
robots.txt

Robots Exclusion Standard data for cbn.globo.com

Resource Scan

Scan Details

Site Domain cbn.globo.com
Base Domain globo.com
Scan Status Ok
Last Scan 2025-09-02T13:23:21+00:00
Next Scan 2025-09-09T13:23:21+00:00

Last Scan

Scanned 2025-09-02T13:23:21+00:00
URL https://cbn.globo.com/robots.txt
Domain IPs 186.192.81.43
Response IP 186.192.81.43
Found Yes
Hash b95ee642314ff8569657afefae67d9ff65db72fcf3a616db9df1df4cc3da0196
SimHash 211518410153
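
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not name the algorithm or say exactly which bytes were hashed, so the following is only a sketch under that assumption. The helper name sha256_hex and the sample body are hypothetical and are not expected to reproduce the reported value.

```python
import hashlib

# Hash value as reported by the scan.
REPORTED_HASH = "b95ee642314ff8569657afefae67d9ff65db72fcf3a616db9df1df4cc3da0196"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, NOT the real file contents.
sample_body = b"User-agent: *\nDisallow: /busca/\n"
digest = sha256_hex(sample_body)

# A SHA-256 hex digest has the same length as the reported value (64 chars).
print(len(digest), len(REPORTED_HASH))
```

Comparing such a digest between two scans is a simple way to detect that the file changed, provided the same bytes are hashed each time.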

Groups

*

Rule Path
Disallow /busca/
Disallow /beta/
Disallow /media/audio/

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
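
The groups above can be checked with Python's standard urllib.robotparser module. The robots.txt text below is reconstructed from the scan report rather than copied from the live file, so spacing, ordering, and capitalization are assumptions. The agent name ExampleBot and the helper allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in the report (an approximation
# of the real file, not a verbatim copy).
ROBOTS_TXT = """\
User-agent: *
Disallow: /busca/
Disallow: /beta/
Disallow: /media/audio/

User-agent: ccbot
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on cbn.globo.com."""
    return parser.can_fetch(agent, "https://cbn.globo.com" + path)


# A generic crawler falls under the "*" group.
print(allowed("ExampleBot", "/"))          # allowed: no rule matches
print(allowed("ExampleBot", "/busca/"))    # blocked by Disallow /busca/

# The named AI crawlers are blocked from the whole site.
print(allowed("GPTBot", "/"))
print(allowed("CCBot", "/"))
print(allowed("Google-Extended", "/"))
```

Note that urllib.robotparser matches agent names case-insensitively and by substring, which can differ in edge cases from how a particular crawler interprets the same file.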

Other Records

Field Value
sitemap https://cbn.globo.com/sitemap/cbn/news.xml
sitemap https://cbn.globo.com/sitemap/topic/cbn/sitemap.xml
sitemap https://cbn.globo.com/sitemap/cbn/sitemap.xml
sitemap https://cbn.globo.com/sitemap/home/cbn/sitemap.xml
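
The sitemap records above can be read back with the same standard-library parser. This sketch assumes the records appear in the file as ordinary "Sitemap:" lines and relies on RobotFileParser.site_maps(), which is available in Python 3.8 and later.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the report, written as robots.txt lines
# (the exact capitalization in the real file is an assumption).
SITEMAP_LINES = [
    "Sitemap: https://cbn.globo.com/sitemap/cbn/news.xml",
    "Sitemap: https://cbn.globo.com/sitemap/topic/cbn/sitemap.xml",
    "Sitemap: https://cbn.globo.com/sitemap/cbn/sitemap.xml",
    "Sitemap: https://cbn.globo.com/sitemap/home/cbn/sitemap.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns None when no Sitemap records were found.
sitemaps = sitemap_parser.site_maps() or []

for url in sitemaps:
    print(url)
```

Sitemap records apply to the whole file rather than to a single user-agent group, which is why the report lists them under Other Records.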

Comments

  • robots.txt