nfhs.org
robots.txt

Robots Exclusion Standard data for nfhs.org

Resource Scan

Scan Details

Site Domain nfhs.org
Base Domain nfhs.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-08-27T22:18:28+00:00
Next Scan 2024-11-25T22:18:28+00:00
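
The Last Scan and Next Scan timestamps above are ISO 8601 strings with a UTC offset. As a small sketch, the gap between them can be computed with Python's standard library. The constant and function names here are illustrative and are not part of the scanner.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details listing.
LAST_SCAN = "2024-08-27T22:18:28+00:00"
NEXT_SCAN = "2024-11-25T22:18:28+00:00"


def scan_interval(last: str, upcoming: str) -> timedelta:
    """Return the time between two ISO 8601 timestamps."""
    return datetime.fromisoformat(upcoming) - datetime.fromisoformat(last)


if __name__ == "__main__":
    # Print the number of whole days between the two scans.
    print(scan_interval(LAST_SCAN, NEXT_SCAN).days)
```

For the two values listed, the interval works out to 90 days.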

Last Successful Scan

Scanned 2023-10-10T22:09:44+00:00
URL https://nfhs.org/robots.txt
Domain IPs 34.224.49.37
Response IP 34.224.49.37
Found Yes
Hash 7ea3120c780fb04fb6cea16d6b84fec8be31b244d1770aca00775319a8bd7224
SimHash e0955c057544

Groups

*

Rule Path
Disallow /

googlebot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

adsbot-google

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

googlebot-image

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

bingbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

duckduckbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

facebot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

pinterestbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

gptbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10

twitterbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 10
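
To illustrate how the groups above would likely be interpreted, here is a minimal sketch using Python's urllib.robotparser. The robots.txt text is a reconstruction from the listing, not the original file: the exact ordering, casing, and grouping of the real file are assumptions, and the page path and the "examplebot" agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agents the report lists with an empty Disallow and crawl-delay 10.
NAMED_AGENTS = [
    "googlebot", "adsbot-google", "googlebot-image", "bingbot",
    "duckduckbot", "facebot", "pinterestbot", "gptbot", "twitterbot",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the groups in the report.

    Assumption: one group per agent, with the wildcard group first.
    """
    blocks = ["User-agent: *\nDisallow: /"]
    for agent in NAMED_AGENTS:
        blocks.append(f"User-agent: {agent}\nDisallow:\nCrawl-delay: 10")
    return "\n\n".join(blocks) + "\n"


def load_parser() -> RobotFileParser:
    """Parse the reconstructed rules without fetching anything."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = load_parser()
    url = "https://nfhs.org/some/page"  # hypothetical path
    for agent in ("googlebot", "gptbot", "examplebot"):
        print(agent, parser.can_fetch(agent, url), parser.crawl_delay(agent))
```

Under this reconstruction, an empty Disallow places no restriction on the named agents, so they may fetch any path with a crawl delay of 10, while any other agent falls back to the * group and is blocked from the whole site.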

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines: