Robots Exclusion Standard data for my-diary.org

Resource Scan

Scan Details

Site Domain my-diary.org
Base Domain my-diary.org
Scan Status Ok
Last Scan 2024-06-07T14:26:49+00:00
Next Scan 2024-06-14T14:26:49+00:00

Last Scan

Scanned 2024-06-07T14:26:49+00:00
URL https://my-diary.org/robots.txt
Redirect https://www.my-diary.org/robots.txt
Redirect Domain www.my-diary.org
Redirect Base my-diary.org
Domain IPs 104.21.11.219, 172.67.167.75, 2606:4700:3036::ac43:a74b, 2606:4700:3037::6815:bdb
Redirect IPs 104.21.11.219, 172.67.167.75, 2606:4700:3036::ac43:a74b, 2606:4700:3037::6815:bdb
Response IP 104.21.11.219
Found Yes
Hash 08aa7d090ff37ad9d7b53487262eac529e7f24be796b6af3deba9d11d933f270
SimHash 71354171f785

Groups

googlebot
adsbot-google

Rule Path
Disallow /super-secret/

ahrefsbot
alphaseobot-sa
amazonbot
anthropic-ai
applebot
ccbot
chatgpt-user
claude-web
claudebot
cohere-ai
criteobot
dataforseobot
dotbot
facebookbot
gptbot
ia_archiver
istellabot
mauibot
megaindex
mj12bot
omgili
omgilibot
perplexitybot
scrapy
seekportbot
semrushbot
serpstatbot
spbot
twitterbot
vagabondo
youbot
metadatascraper

Rule Path
Disallow /read/
Disallow /edit/

proximic

Rule Path
Disallow /edit/
Disallow /read/

Other Records

Field Value
crawl-delay 5

grapeshot

Rule Path
Disallow /edit/

Other Records

Field Value
crawl-delay 5

*

Rule Path
Disallow /edit/
Disallow /reset/
Disallow /resend/
Disallow /manage/
Disallow /login/

Other Records

Field Value
sitemap https://www.my-diary.org/sitemap.xml
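
Example: checking the groups with a parser

The groups above can be tried against a parser to see which rules a given crawler would actually receive. The sketch below is illustrative only. The robots.txt text is reconstructed from the groups listed in this scan, so its ordering and formatting are assumptions rather than the site's real file, and only two of the listed AI/SEO user agents (gptbot, ccbot) are included to keep it short. It uses Python's standard urllib.robotparser; the helper name build_parser and the agent name "SomeBot" are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's groups; NOT a copy of the live file.
# Only gptbot and ccbot are taken from the long list of blocked crawlers.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: adsbot-google
Disallow: /super-secret/

User-agent: gptbot
User-agent: ccbot
Disallow: /read/
Disallow: /edit/

User-agent: proximic
Disallow: /edit/
Disallow: /read/
Crawl-delay: 5

User-agent: grapeshot
Disallow: /edit/
Crawl-delay: 5

User-agent: *
Disallow: /edit/
Disallow: /reset/
Disallow: /resend/
Disallow: /manage/
Disallow: /login/

Sitemap: https://www.my-diary.org/sitemap.xml
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://www.my-diary.org"

# A crawler with its own group follows only that group, not the "*" group.
print(parser.can_fetch("googlebot", base + "/super-secret/"))
print(parser.can_fetch("googlebot", base + "/edit/"))

# gptbot shares the group that blocks /read/ and /edit/.
print(parser.can_fetch("gptbot", base + "/read/"))

# An unlisted crawler ("SomeBot" is a made-up name) falls back to "*".
print(parser.can_fetch("SomeBot", base + "/login/"))
print(parser.can_fetch("SomeBot", base + "/read/"))

# Non-standard records reported under "Other Records".
print(parser.crawl_delay("proximic"))
print(parser.site_maps())
```

Note that a crawler matching a named group is governed by that group alone, so under this reconstruction googlebot is not subject to the /edit/ rule from the "*" group. Real crawlers may interpret the file differently from Python's parser, so treat the results as an approximation.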