Robots Exclusion Standard data for faqs.org

Resource Scan

Scan Details

Site Domain faqs.org
Base Domain faqs.org
Scan Status Ok
Last Scan 2024-09-21T15:21:40+00:00
Next Scan 2024-09-28T15:21:40+00:00

Last Scan

Scanned 2024-09-21T15:21:40+00:00
URL http://faqs.org/robots.txt
Redirect http://www.faqs.org/robots.txt
Redirect Domain www.faqs.org
Redirect Base faqs.org
Domain IPs 199.231.164.68
Redirect IPs 199.231.164.68
Response IP 199.231.164.68
Found Yes
Hash 3a038e331078f7435589b7a90289516f83a78bad5a12be35ef53dc765f5a6b61
SimHash 4b5190904651

Groups

*

Rule Path
Disallow terms.html
Disallow /abstracts/mtc.class.php
Disallow /knowledge
Disallow /dictionary/js/function-wiki.js

gptbot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

businessdbbot

Rule Path
Disallow /

magpie-crawle

Rule Path
Disallow /

bender

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

msnbot

Rule Path
Disallow /patents/imgfull/

mediapartners-google

Rule Path
Allow /knowledge
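
The groups above can be checked programmatically. The following is a minimal sketch, assuming the rule tables are rewritten as ordinary robots.txt text (the `ROBOTS_TXT` string is reconstructed from the tables on this page and is not the file's exact bytes) and evaluated with Python's standard `urllib.robotparser`. The example URLs and the `SomeOtherBot` agent name are hypothetical, and other parsers may resolve these rules differently.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the rule tables above; an approximation of the
# scanned file, not a byte-for-byte copy.
ROBOTS_TXT = """\
User-agent: *
Disallow: terms.html
Disallow: /abstracts/mtc.class.php
Disallow: /knowledge
Disallow: /dictionary/js/function-wiki.js

User-agent: gptbot
Disallow: /

User-agent: sitebot
Disallow: /

User-agent: businessdbbot
Disallow: /

User-agent: magpie-crawle
Disallow: /

User-agent: bender
Disallow: /

User-agent: jikespider
Disallow: /

User-agent: msnbot
Disallow: /patents/imgfull/

User-agent: mediapartners-google
Allow: /knowledge
"""

# Parse the text directly instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical (agent, URL) pairs chosen to exercise each kind of group.
checks = [
    ("GPTBot", "http://www.faqs.org/faqs/"),
    ("msnbot", "http://www.faqs.org/patents/imgfull/example.png"),
    ("msnbot", "http://www.faqs.org/knowledge/"),
    ("Mediapartners-Google", "http://www.faqs.org/knowledge/example"),
    ("SomeOtherBot", "http://www.faqs.org/knowledge/example"),
]

for agent, url in checks:
    verdict = "allowed" if parser.can_fetch(agent, url) else "blocked"
    print(f"{agent:22} {url} -> {verdict}")
```

With this parser, a crawler that has its own named group (such as msnbot) is judged only by that group, so the `*` rules do not apply to it; crawlers without a named group fall back to `*`.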