noize.guru
robots.txt

Robots Exclusion Standard data for noize.guru

Resource Scan

Scan Details

Site Domain noize.guru
Base Domain noize.guru
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2025-07-19T13:38:00+00:00
Next Scan 2025-10-17T13:38:00+00:00

Last Successful Scan

Scanned 2025-02-27T02:13:51+00:00
URL https://noize.guru/robots.txt
Domain IPs 172.66.40.94, 172.66.43.162, 2606:4700:3108::ac42:285e, 2606:4700:3108::ac42:2ba2
Response IP 172.66.43.162
Found Yes
Hash 573122f5e07d61cf686efe70f49c522c50c29fab9fa64e8756c7e5ad27cb8f7e
SimHash 742f5b15c584

Groups

ai2bot
ai2bot-dolma
adsbot-google
amazonbot
anthropic-ai
applebot-extended
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
diffbot
facebookbot
facebookexternalhit
friendlycrawler
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
meta-externalagent
meta-externalfetcher
oai-searchbot
omgili
omgilibot
perplexitybot
petalbot
scrapy
timpibot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Disallow /

awariorssbot
awariosmartbot
dataforseobot
magpie-crawler
meltwater
peer39_crawler
peer39_crawler/1.0
piplbot
scoop.it
seekr

Rule Path
Disallow /

wellknownbot

Rule Path
Disallow /

*

Rule Path
Disallow /api/
Disallow /auth/
Disallow /oauth/
Disallow /check_your_email
Disallow /wait_for_approval
Disallow /account_disabled
Disallow /signup
Disallow /.well-known/
Disallow /fileserver/
Disallow /users/
Disallow /emoji/
Disallow /admin
Disallow /user
Disallow /settings/
Disallow /about/suspended

Other Records

Field Value
crawl-delay 500
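
To see how a crawler might interpret the groups and rules listed above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is an abbreviated reconstruction from this report, not the fetched file. Placing crawl-delay inside the "*" group is an assumption (the report lists it only under Other Records), and "ExampleBot" is a hypothetical user agent standing in for any crawler not named in a group.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the rules in this report, NOT the fetched file.
# Assumption: crawl-delay belongs to the "*" group; the report does not say.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: ccbot
Disallow: /

User-agent: wellknownbot
Disallow: /

User-agent: *
Crawl-delay: 500
Disallow: /api/
Disallow: /users/
Disallow: /admin
Disallow: /user
Disallow: /about/suspended
"""

BASE = "https://noize.guru"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


# "ExampleBot" is hypothetical; it matches no named group and falls to "*".
for agent, path in [
    ("GPTBot", "/"),                     # named group, Disallow /
    ("ExampleBot", "/about"),            # no rule matches this path
    ("ExampleBot", "/about/suspended"),  # matches Disallow /about/suspended
    ("ExampleBot", "/api/v1/instance"),  # matches Disallow /api/
    ("ExampleBot", "/username"),         # prefix match on Disallow /user
]:
    print(agent, path, allowed(agent, path))

print("crawl-delay for ExampleBot:", parser.crawl_delay("ExampleBot"))
```

Rules are matched as path prefixes, so under this parser "Disallow /user" also covers everything under /users/, and a path such as /username is blocked as well. Other parsers may differ in details such as agent matching and wildcard support.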

Comments

  • GoToSocial robots.txt -- to edit, see internal/web/robots.go
  • More info @ https://developers.google.com/search/docs/crawling-indexing/robots/intro
  • AI scrapers and the like.
  • https://github.com/ai-robots-txt/ai.robots.txt/
  • Marketing/SEO "intelligence" data scrapers
  • Well-known.dev crawler. Indexes stuff under /.well-known.
  • https://well-known.dev/about/
  • Rules for everything else.
  • API endpoints.
  • Auth/Sign in endpoints.
  • Well-known endpoints.
  • Fileserver/media.
  • Fedi S2S API endpoints.
  • Settings panels.
  • Domain blocklist.