horseracingnation.com
robots.txt

Robots Exclusion Standard data for horseracingnation.com

Resource Scan

Scan Details

Site Domain horseracingnation.com
Base Domain horseracingnation.com
Scan Status Ok
Last Scan 2024-09-20T05:57:58+00:00
Next Scan 2024-09-27T05:57:58+00:00

Last Scan

Scanned 2024-09-20T05:57:58+00:00
URL https://horseracingnation.com/robots.txt
Redirect https://www.horseracingnation.com:443/robots.txt
Redirect Domain www.horseracingnation.com
Redirect Base horseracingnation.com
Domain IPs 3.229.37.255, 34.197.70.114, 34.228.147.62, 52.206.73.46
Redirect IPs 3.229.37.255, 34.197.70.114, 34.228.147.62, 52.206.73.46
Response IP 34.228.147.62
Found Yes
Hash bdaeea083f7c09e6e699c11df340ba48c7aa43dff86a27660ec754ac00da363f
SimHash 30905dc1a464
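
A content hash like the one above is typically used to tell whether a robots.txt file changed between scans. The sketch below shows one way that could work. It assumes SHA-256 over the raw response body, which is consistent with the 64-hex-digit length of the Hash value but is not stated by the scan; the function names and sample bodies are hypothetical.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt body.

    Assumption: SHA-256. The scan report does not name its algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest against a freshly fetched body."""
    return fingerprint(body) != previous_digest


# Hypothetical bodies, for illustration only.
old_body = b"User-agent: *\nDisallow: /edit\n"
new_body = b"User-agent: *\nDisallow: /edit\nDisallow: /terms\n"

stored = fingerprint(old_body)
print(len(stored))                     # digest length in hex characters
print(has_changed(stored, old_body))   # same body
print(has_changed(stored, new_body))   # edited body
```

An exact hash flags any byte-level change, including whitespace. A similarity hash such as the SimHash field is generally meant to tolerate small edits, though how this scan computes it is not shown.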

Groups

*

Rule Path
Disallow /edit
Disallow /forgotpassword.aspx
Disallow /Login.aspx
Disallow /login.aspx
Disallow /signup.aspx
Disallow /session
Disallow /terms
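
The wildcard group above can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the robots.txt text is reconstructed from the rule table rather than copied from the live file, and `ExampleBot` and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Wildcard group reconstructed from the rule table above (an approximation
# of the live file, not a copy of it).
WILDCARD_GROUP = """\
User-agent: *
Disallow: /edit
Disallow: /forgotpassword.aspx
Disallow: /Login.aspx
Disallow: /login.aspx
Disallow: /signup.aspx
Disallow: /session
Disallow: /terms
"""

BASE = "https://www.horseracingnation.com"

wildcard_parser = RobotFileParser()
wildcard_parser.parse(WILDCARD_GROUP.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """True if `agent` may fetch `path` under the wildcard group."""
    return wildcard_parser.can_fetch(agent, BASE + path)


for path in ["/", "/edit", "/edit/profile", "/Login.aspx", "/LOGIN.ASPX"]:
    print(path, allowed(path))
```

In this parser, rules match by case-sensitive path prefix, so `/edit/profile` is blocked along with `/edit`, while `/LOGIN.ASPX` is not covered by either login rule. That case sensitivity is a plausible reason the file lists both `/Login.aspx` and `/login.aspx`.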

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /
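
The per-agent groups above all carry the same single rule, `Disallow /`, which blocks the whole site for that crawler. As a rough sketch, the groups can be rebuilt into robots.txt text and checked with Python's `urllib.robotparser`. The agent names are taken from the scan, which shows them lowercased, so the casing in the live file may differ; the helper names and the sample user-agent strings are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agents that the scan lists with a blanket "Disallow /".
BLOCKED_AGENTS = [
    "amazonbot", "applebot", "facebookbot", "google-extended",
    "gptbot", "chatgpt-user", "anthropic-ai", "claude-web",
    "bytespider", "cohere-ai", "ccbot", "the knowledge ai",
    "omgilibot", "omgili", "perplexitybot", "youbot",
]

# Paths the wildcard group disallows for every other crawler.
WILDCARD_PATHS = [
    "/edit", "/forgotpassword.aspx", "/Login.aspx", "/login.aspx",
    "/signup.aspx", "/session", "/terms",
]


def render_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the scanned groups."""
    wildcard = "User-agent: *\n" + "\n".join(
        f"Disallow: {path}" for path in WILDCARD_PATHS
    )
    blocked = [f"User-agent: {agent}\nDisallow: /" for agent in BLOCKED_AGENTS]
    return "\n\n".join([wildcard, *blocked]) + "\n"


site_parser = RobotFileParser()
site_parser.parse(render_robots_txt().splitlines())


def is_allowed(user_agent: str, path: str) -> bool:
    """True if `user_agent` may fetch `path` under the rebuilt rules."""
    url = "https://www.horseracingnation.com" + path
    return site_parser.can_fetch(user_agent, url)


print(is_allowed("GPTBot/1.0", "/"))          # agent with its own group
print(is_allowed("Googlebot/2.1", "/"))       # falls back to the * group
print(is_allowed("Googlebot/2.1", "/edit"))   # * group disallows /edit
```

A crawler with its own group is governed by that group alone, so the named agents are shut out entirely, while any other crawler falls back to the `*` group and is only kept away from the listed account and session paths. Compliance is voluntary: robots.txt states a policy and does not enforce it.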

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • FAANG
  • Google Bard AI
  • Microsoft / OpenAI
  • Others
  • Common Crawl