inspircd.org
robots.txt

Robots Exclusion Standard data for inspircd.org

Resource Scan

Scan Details

Site Domain inspircd.org
Base Domain inspircd.org
Scan Status Ok
Last Scan 2025-10-24T08:53:42+00:00
Next Scan 2025-11-23T08:53:42+00:00

Last Scan

Scanned 2025-10-24T08:53:42+00:00
URL https://inspircd.org/robots.txt
Redirect https://www.inspircd.org/robots.txt
Redirect Domain www.inspircd.org
Redirect Base inspircd.org
Domain IPs 104.21.33.178, 172.67.147.173, 2606:4700:3034::6815:21b2, 2606:4700:3034::ac43:93ad
Redirect IPs 185.199.108.153, 185.199.109.153, 185.199.110.153, 185.199.111.153, 2606:50c0:8000::153, 2606:50c0:8001::153, 2606:50c0:8002::153, 2606:50c0:8003::153
Response IP 185.199.108.153
Found Yes
Hash 83e1aedd666d8740f024d75c87a32695e9d08b19aa9640a001109454152f3218
SimHash 71b9694180b4
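
The Hash field above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the scan does not name the algorithm. As a minimal sketch under that assumption, a digest for a downloaded robots.txt body could be computed as follows. The function name `robots_digest` is hypothetical, and the scanner may normalise the body before hashing, so the result is not guaranteed to match the recorded value.

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scan's Hash field is a plain SHA-256 of the raw
    response bytes. The scan report does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


if __name__ == "__main__":
    # Illustrative input only, not the real file contents.
    sample = b"User-agent: *\nDisallow: /assets\nDisallow: /wiki\n"
    print(robots_digest(sample))
```

Comparing digests between scans is a cheap way to tell whether the file changed at all; the shorter SimHash value is presumably intended for near-duplicate detection instead.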

Groups

*

Rule Path
Disallow /assets
Disallow /wiki

amazonbot
anthropic-ai
applebot-extended
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
diffbot
facebookbot
friendlycrawler
google-extended
googleother
googleother-image
googleother-video
gptbot
imagesiftbot
img2dataset
omgili
omgilibot
perplexitybot
youbot

Rule Path
Disallow /

Other Records

Field Value
sitemap /sitemap.xml
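
The groups and the sitemap record above can be turned back into robots.txt text and checked with Python's standard `urllib.robotparser`. This is a sketch, not the site's actual file: the agent names are the lowercased forms shown in the scan, the exact layout and casing of the original are assumptions, and `examplecrawler` is a made-up user agent used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Agents from the second group, as listed (lowercased) in the scan.
AI_AGENTS = [
    "amazonbot", "anthropic-ai", "applebot-extended", "bytespider", "ccbot",
    "chatgpt-user", "claudebot", "claude-web", "cohere-ai", "diffbot",
    "facebookbot", "friendlycrawler", "google-extended", "googleother",
    "googleother-image", "googleother-video", "gptbot", "imagesiftbot",
    "img2dataset", "omgili", "omgilibot", "perplexitybot", "youbot",
]


def build_robots_lines() -> list[str]:
    """Rebuild robots.txt lines from the scanned groups (layout assumed)."""
    lines = ["User-agent: *", "Disallow: /assets", "Disallow: /wiki", ""]
    lines += [f"User-agent: {agent}" for agent in AI_AGENTS]
    lines += ["Disallow: /", "", "Sitemap: /sitemap.xml"]
    return lines


def make_parser() -> RobotFileParser:
    """Parse the reconstructed rules without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_lines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    base = "https://www.inspircd.org"
    for agent, path in [
        ("examplecrawler", "/docs/"),      # generic agent, unlisted path
        ("examplecrawler", "/wiki/page"),  # generic agent, disallowed prefix
        ("gptbot", "/docs/"),              # listed agent, whole site blocked
    ]:
        print(agent, path, parser.can_fetch(agent, base + path))
```

Under these rules a generic crawler is kept out of /assets and /wiki only, while every agent named in the second group is denied the entire site.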

Comments

  • www.robotstxt.org
  • From https://github.com/ai-robots-txt/ai.robots.txt v1.5