/.well-known/

inspircd.org
robots.txt

Robots Exclusion Standard data for inspircd.org

Resource Scan

Scan Details

Site Domain	inspircd.org
Base Domain	inspircd.org
Scan Status	Ok
Last Scan	2025-10-24T08:53:42+00:00
Next Scan	2025-11-23T08:53:42+00:00

Last Scan

Scanned	2025-10-24T08:53:42+00:00
URL	https://inspircd.org/robots.txt
Redirect	https://www.inspircd.org/robots.txt
Redirect Domain	www.inspircd.org
Redirect Base	inspircd.org
Domain IPs	104.21.33.178, 172.67.147.173, 2606:4700:3034::6815:21b2, 2606:4700:3034::ac43:93ad
Redirect IPs	185.199.108.153, 185.199.109.153, 185.199.110.153, 185.199.111.153, 2606:50c0:8000::153, 2606:50c0:8001::153, 2606:50c0:8002::153, 2606:50c0:8003::153
Response IP	185.199.108.153
Found	Yes
Hash	83e1aedd666d8740f024d75c87a32695e9d08b19aa9640a001109454152f3218
SimHash	71b9694180b4
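The Hash field above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The report does not state which algorithm the scanner uses, so the following is only a sketch under that assumption; the function names are hypothetical and not part of the scanning service.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str) -> bool:
    """Compare a freshly fetched body against a hash recorded in a report."""
    return content_hash(body) == recorded.strip().lower()
```

If the assumption holds, a changed digest between two scans would indicate that the file content changed byte for byte, whereas the much shorter SimHash field is presumably meant for detecting near-duplicates.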

Groups

User-agents	*
Disallow	/assets
Disallow	/wiki

User-agents	amazonbot, anthropic-ai, applebot-extended, bytespider, ccbot, chatgpt-user, claudebot, claude-web, cohere-ai, diffbot, facebookbot, friendlycrawler, google-extended, googleother, googleother-image, googleother-video, gptbot, imagesiftbot, img2dataset, omgili, omgilibot, perplexitybot, youbot
Disallow	/
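The groups above can be checked with Python's standard urllib.robotparser module. The robots.txt text below is a reconstruction from the parsed groups in this report, not the file's actual contents, and it lists only three of the blocked crawlers for brevity; the "SomeBrowser" agent and the URLs in the usage lines are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report (assumption: the real file may differ
# in ordering, casing, and comments; only three blocked agents are shown).
ROBOTS_TXT = """\
User-agent: *
Disallow: /assets
Disallow: /wiki

User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /

Sitemap: /sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A generic agent falls under the "*" group: only /assets and /wiki are blocked.
generic_docs = parser.can_fetch("SomeBrowser", "https://www.inspircd.org/docs/")
generic_wiki = parser.can_fetch("SomeBrowser", "https://www.inspircd.org/wiki/Page")

# A listed crawler falls under the second group: everything is blocked.
gptbot_docs = parser.can_fetch("gptbot", "https://www.inspircd.org/docs/")
```

Under this reconstruction, generic_docs is True while generic_wiki and gptbot_docs are False. The module matches rules by simple path prefix, so /assets also covers any path that merely begins with those characters.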

Other Records

Field	Value
sitemap	/sitemap.xml

Comments

www.robotstxt.org
From https://github.com/ai-robots-txt/ai.robots.txt v1.5