worldcrunch.com
robots.txt

Robots Exclusion Standard data for worldcrunch.com

Resource Scan

Scan Details

Site Domain worldcrunch.com
Base Domain worldcrunch.com
Scan Status Ok
Last Scan 2024-06-25T10:15:18+00:00
Next Scan 2024-07-02T10:15:18+00:00

Last Scan

Scanned 2024-06-25T10:15:18+00:00
URL https://worldcrunch.com/robots.txt
Domain IPs 104.26.8.104, 104.26.9.104, 172.67.74.156, 2606:4700:20::681a:868, 2606:4700:20::681a:968, 2606:4700:20::ac43:4a9c
Response IP 104.26.9.104
Found Yes
Hash 5e8dd16f41528ae92d2d3c887a4313ff2e791c491f305525c3148776adbe6ee5
SimHash 205d3840c053

Groups

*

Rule Path
Disallow /core/*
Disallow /r/*
Disallow /mnt/*
Disallow /res/*

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /
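
The groups above can be read as a small rule table. The following is a minimal sketch, not the scanner's own logic, of how a crawler might evaluate them. It assumes the usual Robots Exclusion Protocol conventions: a crawler uses the group that names it and otherwise falls back to `*`, and `*` inside a path matches any run of characters. The names `GROUPS`, `pattern_to_regex`, and `is_allowed`, and the example paths in the usage note, are hypothetical and chosen for illustration only.

```python
import re

# Groups as listed in the scan above: user-agent token -> Disallow patterns.
GROUPS = {
    "*": ["/core/*", "/r/*", "/mnt/*", "/res/*"],
    "ccbot": ["/"],
    "gptbot": ["/"],
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if no Disallow rule in the applicable group covers path.

    Assumption: user_agent is the bare product token (e.g. "GPTBot"),
    not a full User-Agent header string.
    """
    # A group that names the agent replaces the '*' group entirely.
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    return not any(pattern_to_regex(rule).match(path) for rule in rules)
```

Under these assumptions, `is_allowed("GPTBot", "/")` and `is_allowed("CCBot", "/news")` are both false, because each of those crawlers has its own group disallowing the whole site. A crawler without its own group, such as a hypothetical `ExampleBot`, is blocked only under `/core/`, `/r/`, `/mnt/`, and `/res/`, so a path like `/resources` stays allowed while `/res/img.png` does not.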

Other Records

Field Value
sitemap https://worldcrunch.com/sitemap.xml
sitemap https://worldcrunch.com/sitemap_news.xml
sitemap https://worldcrunch.com/sitemap_video.xml
sitemap https://worldcrunch.com/sitemap_sections.xml
sitemap https://worldcrunch.com/sitemap_tags.xml