guidetoislam.com
robots.txt

Robots Exclusion Standard data for guidetoislam.com

Resource Scan

Scan Details

Site Domain guidetoislam.com
Base Domain guidetoislam.com
Scan Status Ok
Last Scan 2025-11-22T00:43:15+00:00
Next Scan 2025-12-22T00:43:15+00:00

Last Scan

Scanned 2025-11-22T00:43:15+00:00
URL https://guidetoislam.com/robots.txt
Domain IPs 104.26.6.43, 104.26.7.43, 172.67.73.171, 2606:4700:20::681a:62b, 2606:4700:20::681a:72b, 2606:4700:20::ac43:49ab
Response IP 104.26.7.43
Found Yes
Hash 3c8b148952538ab189f66f16b3314fe29fbdbe4a5e966df7bd4d40eff3b97621
SimHash 4d45d7537713
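
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so the following is only a sketch under that assumption, showing how such a fingerprint could be used to detect changes between scans. The function names `fingerprint` and `has_changed` are hypothetical.

```python
import hashlib

def fingerprint(body: bytes) -> str:
    # Assumption: the scanner hashes the raw response body with SHA-256.
    # The report itself does not state which algorithm it uses.
    return hashlib.sha256(body).hexdigest()

def has_changed(body: bytes, previous_hash: str) -> bool:
    # Compare a freshly fetched body against the hash recorded by an earlier scan.
    return fingerprint(body) != previous_hash.lower()
```

An exact hash changes on any byte difference, while a similarity hash such as the SimHash field is typically used to judge how close two versions are.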

Groups

*

Rule Path
Disallow /backend
Disallow /index.php/
Disallow /*?type=
Disallow /*?
Allow /
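
The rules of the `*` group can be checked against sample paths with a small matcher. This is a minimal sketch that assumes the longest-match evaluation described in RFC 9309 (the most specific matching rule wins, Allow wins a tie, `*` is a wildcard, a trailing `$` anchors the end). Individual crawlers may evaluate rules differently, and the names `RULES`, `pattern_to_regex` and `is_allowed` are hypothetical.

```python
import re

# Rules of the "*" group, copied from the listing above.
RULES = [
    ("disallow", "/backend"),
    ("disallow", "/index.php/"),
    ("disallow", "/*?type="),
    ("disallow", "/*?"),
    ("allow", "/"),
]

def pattern_to_regex(pattern):
    # "*" matches any run of characters; a trailing "$" anchors the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules=RULES):
    # Longest matching pattern wins; on equal length, Allow wins.
    best_len, best_kind = -1, "allow"
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, best_kind = length, kind
    return best_kind == "allow"
```

Under these assumptions a plain path such as `/articles` is allowed, while `/backend/login`, `/index.php/foo` and any URL carrying a query string, such as `/page?id=3`, are disallowed.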

googlebot

Rule Path
Allow /

mediapartners-google

Rule Path
Allow /

adsbot-google

Rule Path
Allow /

googlebot-image

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

msnbot

Rule Path
Allow /

yandex

Rule Path
Allow /

baiduspider

Rule Path
Allow /

youdaobot

Rule Path
Allow /

sogou web spider

Rule Path
Allow /

sogou inst spider

Rule Path
Allow /

sogou spider2

Rule Path
Allow /

pangusospider

Rule Path
Allow /

yisouspider

Rule Path
Allow /

easouspider

Rule Path
Allow /
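
Which group applies depends on the crawler's user-agent token. The sketch below assumes the selection behaviour described in RFC 9309: a crawler uses the group whose name matches its token, compared case-insensitively, and falls back to `*` only when no named group matches. Real crawlers may match tokens more loosely (for example by prefix), and the names `GROUPS` and `select_group` are hypothetical.

```python
# Rules of the "*" group, copied from the listing above.
DEFAULT_RULES = [
    ("disallow", "/backend"),
    ("disallow", "/index.php/"),
    ("disallow", "/*?type="),
    ("disallow", "/*?"),
    ("allow", "/"),
]

# Every named group in the scan carries the single rule "Allow /".
NAMED_AGENTS = [
    "googlebot", "mediapartners-google", "adsbot-google", "googlebot-image",
    "googlebot-mobile", "msnbot", "yandex", "baiduspider", "youdaobot",
    "sogou web spider", "sogou inst spider", "sogou spider2",
    "pangusospider", "yisouspider", "easouspider",
]

GROUPS = {"*": DEFAULT_RULES}
for agent in NAMED_AGENTS:
    GROUPS[agent] = [("allow", "/")]

def select_group(user_agent_token, groups=GROUPS):
    # Exact, case-insensitive match on the token; otherwise use the "*" group.
    return groups.get(user_agent_token.strip().lower(), groups["*"])
```

Under this reading, a crawler with its own group, such as googlebot, is governed only by `Allow /`, so the Disallow lines of the `*` group do not apply to it.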

Other Records

Field Value
sitemap https://guidetoislam.com/sitemap.xml