staging2.horseracingnation.com
robots.txt

Robots Exclusion Standard data for staging2.horseracingnation.com

Resource Scan

Scan Details

Site Domain staging2.horseracingnation.com
Base Domain horseracingnation.com
Scan Status Ok
Last Scan 2024-05-27T12:38:11+00:00
Next Scan 2024-06-03T12:38:11+00:00

Last Scan

Scanned 2024-05-27T12:38:11+00:00
URL https://staging2.horseracingnation.com/robots.txt
Redirect https://www.staging2.horseracingnation.com:443/robots.txt
Redirect Domain www.staging2.horseracingnation.com
Redirect Base horseracingnation.com
Domain IPs 107.21.250.68, 3.222.5.150, 3.223.175.72, 44.207.39.248
Redirect IPs 107.21.250.68, 3.222.5.150, 3.223.175.72, 44.207.39.248
Response IP 107.21.250.68
Found Yes
Hash bdaeea083f7c09e6e699c11df340ba48c7aa43dff86a27660ec754ac00da363f
SimHash 30905dc1a464

Groups

*

Rule Path
Disallow /edit
Disallow /forgotpassword.aspx
Disallow /Login.aspx
Disallow /login.aspx
Disallow /signup.aspx
Disallow /session
Disallow /terms

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /
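
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser, assuming a robots.txt reconstructed from a subset of the listed groups. The scan shows agent names in lowercase, so the casing and grouping in the live file may differ. The agent name "examplebot" and the path "/races" are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of part of the file from the groups listed above.
# The live robots.txt may use different casing or group several agents together.
ROBOTS_TXT = """\
User-agent: *
Disallow: /edit
Disallow: /forgotpassword.aspx
Disallow: /Login.aspx
Disallow: /login.aspx
Disallow: /signup.aspx
Disallow: /session
Disallow: /terms

User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /
"""

BASE = "https://staging2.horseracingnation.com"

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Return True if the parsed rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "examplebot" and "/races" are hypothetical, chosen for illustration.
    checks = [
        ("examplebot", "/"),
        ("examplebot", "/login.aspx"),
        ("examplebot", "/edit/profile"),
        ("gptbot", "/"),
        ("ccbot", "/races"),
    ]
    for agent, path in checks:
        print(f"{agent:12} {path:16} allowed={is_allowed(agent, path)}")
```

An agent with its own group (such as gptbot) is governed by that group alone, so it is blocked from every path. Any other agent falls back to the * group, where each Disallow value is matched as a path prefix. Path matching is case-sensitive under the Robots Exclusion Protocol, which may be why both /Login.aspx and /login.aspx are listed.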

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • FAANG
  • Google Bard AI
  • Microsoft / OpenAI
  • Others
  • Common Crawl