social.sysmike.net
robots.txt

Robots Exclusion Standard data for social.sysmike.net

Resource Scan

Scan Details

Site Domain social.sysmike.net
Base Domain sysmike.net
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2025-04-29T03:50:17+00:00
Next Scan 2025-07-28T03:50:17+00:00

Last Successful Scan

Scanned 2024-09-09T03:48:48+00:00
URL https://social.sysmike.net/robots.txt
Domain IPs 116.203.48.235, 2a01:4f8:c0c:fbf1::1
Response IP 116.203.48.235
Found Yes
Hash 77960c53a2354670a115f88036837a4b7df1ca0b7efb0ecdee3c2ebb8487452b
SimHash 542c5b52a104
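
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The report does not state the algorithm or exactly which bytes were hashed, so the following is only a sketch that assumes SHA-256 over the raw response body. The helper names `sha256_hex` and `matches_recorded_hash` are hypothetical.

```python
import hashlib

# Value of the "Hash" field from the last successful scan.
RECORDED_HASH = "77960c53a2354670a115f88036837a4b7df1ca0b7efb0ecdee3c2ebb8487452b"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a fetched robots.txt body against the recorded hash.

    Assumption: the scanner hashed the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == recorded.lower()
```

If the assumption holds, passing the bytes of a freshly fetched robots.txt to `matches_recorded_hash` would indicate whether the file has changed since the 2024-09-09 scan. A mismatch could also simply mean the scanner normalizes the body before hashing.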

Groups

adsbot-google
amazonbot
anthropic-ai
applebot
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
facebookbot
friendlycrawler
google-extended
googleother
gptbot
imagesiftbot
magpie-crawler
meltwater
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
piplbot
seekr
youbot

Rule Path
Disallow /

wellknownbot

Rule Path
Disallow /

*

Rule Path
Disallow /api/
Disallow /auth/
Disallow /oauth/
Disallow /check_your_email
Disallow /wait_for_approval
Disallow /account_disabled
Disallow /signup
Disallow /.well-known/
Disallow /fileserver/
Disallow /users/
Disallow /emoji/
Disallow /admin
Disallow /user
Disallow /settings/
Disallow /about/suspended

Other Records

Field Value
crawl-delay 500
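
The groups and rules above can be checked with Python's standard `urllib.robotparser`. The robots.txt text below is an abbreviated reconstruction from this report, not the verbatim file: it lists only a few of the scanned user agents and paths, and it assumes the crawl-delay record belongs to the `*` group. The agent name `examplebot` and the helper `allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the scanned rules (an assumption, not the
# verbatim file): only some user agents and paths from the report are
# included, and crawl-delay is placed in the "*" group.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /

User-agent: wellknownbot
Disallow: /

User-agent: *
Crawl-delay: 500
Disallow: /api/
Disallow: /auth/
Disallow: /.well-known/
Disallow: /users/
Disallow: /settings/
Disallow: /about/suspended
"""

BASE = "https://social.sysmike.net"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if the parsed rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # A listed AI crawler is blocked from the whole site.
    print("GPTBot /:", allowed("GPTBot", "/"))
    # An unlisted agent falls through to the "*" group.
    print("examplebot /api/v1/instance:", allowed("examplebot", "/api/v1/instance"))
    print("examplebot /about:", allowed("examplebot", "/about"))
    print("examplebot crawl delay:", parser.crawl_delay("examplebot"))
```

Under these assumptions, the listed crawlers are refused every path, while any other agent is refused only the listed prefixes and is asked to wait 500 seconds between requests. `urllib.robotparser` uses simple prefix matching, so other parsers may treat edge cases differently.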

Comments

  • GoToSocial robots.txt -- to edit, see internal/web/robots.go
  • More info @ https://developers.google.com/search/docs/crawling-indexing/robots/intro
  • AI scrapers and the like.
  • https://github.com/ai-robots-txt/ai.robots.txt/
  • Well-known.dev crawler. Indexes stuff under /.well-known.
  • https://well-known.dev/about/
  • Rules for everything else.
  • API endpoints.
  • Auth/Sign in endpoints.
  • Well-known endpoints.
  • Fileserver/media.
  • Fedi S2S API endpoints.
  • Settings panels.
  • Domain blocklist.