sizeof.cat
robots.txt

Robots Exclusion Standard data for sizeof.cat

Resource Scan

Scan Details

Site Domain sizeof.cat
Base Domain sizeof.cat
Scan Status Ok
Last Scan 2025-08-24T00:06:42+00:00
Next Scan 2025-09-23T00:06:42+00:00

Last Scan

Scanned 2025-08-24T00:06:42+00:00
URL https://sizeof.cat/robots.txt
Domain IPs 46.226.105.97
Response IP 46.226.105.97
Found Yes
Hash 5afd71009ea28abb093a9a369da061f78f5fbac5caad155ce88218f1b678ba65
SimHash 54155bd1c690
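
The Hash value above is 64 hexadecimal digits, which is consistent with a SHA-256 digest of the response body, although the report does not name the algorithm. The sketch below assumes SHA-256 and shows how a scanner might compare a freshly fetched robots.txt against a stored hash. The function names `content_fingerprint` and `has_changed` are hypothetical and not part of any scanner API.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the report's "Hash" field is SHA-256; the source does not say.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Report whether the body differs from the one behind previous_hash."""
    return content_fingerprint(body) != previous_hash.lower()
```

An exact hash only detects that something changed. The separate SimHash field suggests the scanner also tracks near-duplicate content, which this sketch does not attempt to reproduce.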

Groups

*

Rule Path
Disallow /git/
Disallow /tags/
Disallow /super-secret-data/
Disallow /passwords.txt

ia_archiver
chatgpt
gptbot-user
facebookbot
adsbot-google
amazonbot
anthropic-ai
applebot
applebot-extended
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
facebookbot
friendlycrawler
google-extended
googleother
gptbot
img2dataset
imagesiftbot
magpie-crawler
meltwater
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
piplbot
scoop.it
seekr
youbot

Rule Path
Disallow /
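
To see how a crawler would interpret the two groups above, the following sketch rebuilds an approximate robots.txt from the report and checks it with Python's standard `urllib.robotparser`. The rebuilt file is a reconstruction: the real file may differ in capitalization, ordering, or comments. The helper names and the `ExampleBrowser` agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agent names copied from the blocked group in the scan report.
BLOCKED_AGENTS = [
    "ia_archiver", "chatgpt", "gptbot-user", "facebookbot", "adsbot-google",
    "amazonbot", "anthropic-ai", "applebot", "applebot-extended",
    "awariorssbot", "awariosmartbot", "bytespider", "ccbot", "chatgpt-user",
    "claudebot", "claude-web", "cohere-ai", "dataforseobot", "diffbot",
    "facebookbot", "friendlycrawler", "google-extended", "googleother",
    "gptbot", "img2dataset", "imagesiftbot", "magpie-crawler", "meltwater",
    "omgili", "omgilibot", "peer39_crawler", "peer39_crawler/1.0",
    "perplexitybot", "piplbot", "scoop.it", "seekr", "youbot",
]

# Paths disallowed for every other crawler (the "*" group).
DEFAULT_DISALLOWS = ["/git/", "/tags/", "/super-secret-data/", "/passwords.txt"]


def build_robots_txt() -> str:
    """Rebuild an approximation of the scanned file from the report's groups."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in DEFAULT_DISALLOWS]
    lines.append("")  # blank line separates the two groups
    lines += [f"User-agent: {agent}" for agent in BLOCKED_AGENTS]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def make_parser() -> RobotFileParser:
    """Parse the reconstructed rules without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    for agent, url in [
        ("GPTBot", "https://sizeof.cat/"),
        ("ExampleBrowser", "https://sizeof.cat/"),
        ("ExampleBrowser", "https://sizeof.cat/git/"),
    ]:
        print(agent, url, parser.can_fetch(agent, url))
```

Under these rules a listed agent such as `gptbot` is denied the whole site, while an unlisted agent falls back to the `*` group and is denied only the four listed paths.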

Other Records

Field Value
sitemap http://sizeof.cat/sitemap.xml
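
The sitemap record is independent of the user-agent groups. As a minimal sketch, assuming Python 3.8 or later, the same standard-library parser can extract it. The helper name `sitemap_urls` is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_lines):
    """Return the Sitemap URLs declared in robots.txt lines (empty if none)."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # site_maps() returns None when no Sitemap record is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemap_urls(["Sitemap: http://sizeof.cat/sitemap.xml"]))
```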