Robots Exclusion Standard data for my-diary.org

Resource Scan

Scan Details

Site Domain my-diary.org
Base Domain my-diary.org
Scan Status Ok
Last Scan 2024-11-16T01:03:34+00:00
Next Scan 2024-11-23T01:03:34+00:00

Last Scan

Scanned 2024-11-16T01:03:34+00:00
URL https://my-diary.org/robots.txt
Redirect https://www.my-diary.org/robots.txt
Redirect Domain www.my-diary.org
Redirect Base my-diary.org
Domain IPs 104.21.11.219, 172.67.167.75, 2606:4700:3036::ac43:a74b, 2606:4700:3037::6815:bdb
Redirect IPs 104.21.11.219, 172.67.167.75, 2606:4700:3036::ac43:a74b, 2606:4700:3037::6815:bdb
Response IP 172.67.167.75
Found Yes
Hash e558413f8dcdfb9d29fbe9ae6e81fa8b8bf0a87bb3fe5e9f7a125d51cadad7f4
SimHash 61254371f780
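
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. As a minimal sketch under that assumption, a digest of the response body could be compared between scans to detect changes. The function name `body_digest` and the sample bodies are hypothetical, and the SimHash field is not reproduced here because its scheme is not described.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the report's "Hash" field is SHA-256 over the raw bytes.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies from two scans; only the digests are compared.
previous = body_digest(b"User-agent: *\nDisallow: /login/\n")
current = body_digest(b"User-agent: *\nDisallow: /login/\nDisallow: /manage/\n")

changed = previous != current
print("robots.txt changed between scans:", changed)
```

Comparing digests only shows that the bytes differ; it says nothing about which rules changed.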

Groups

googlebot
adsbot-google
mediapartners-google

Rule Path
Disallow /super-secret/

ahrefsbot
alphaseobot-sa
amazonbot
applebot
ccbot
chatgpt-user
claudebot
cohere-ai
criteobot
dataforseobot
dotbot
facebookbot
gptbot
ia_archiver
istellabot
mauibot
megaindex
mj12bot
omgili
omgilibot
perplexitybot
scrapy
seekportbot
semrushbot
serpstatbot
spbot
vagabondo
youbot
metadatascraper

Rule Path
Disallow /read/
Disallow /edit/

proximic

Rule Path
Disallow /edit/
Disallow /read/

Other Records

Field Value
crawl-delay 5

grapeshot

Rule Path
Disallow /edit/

Other Records

Field Value
crawl-delay 5

*

Rule Path
Disallow /editx/
Disallow /reset/
Disallow /resend/
Disallow /manage/
Disallow /login/

Other Records

Field Value
sitemap https://www.my-diary.org/sitemap.xml
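
The groups listed above can be checked against concrete URLs with Python's standard `urllib.robotparser`. The sketch below is an illustration, not the site's actual file: the `ROBOTS_TXT` text is reconstructed from the report (the original line order, casing, and comments are unknown), the second group is shortened to three of its listed agents, and the helper `allowed` and the agent name `examplebot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's groups; an assumption, not the fetched file.
ROBOTS_TXT = """\
User-agent: googlebot
User-agent: adsbot-google
User-agent: mediapartners-google
Disallow: /super-secret/

User-agent: gptbot
User-agent: ccbot
User-agent: claudebot
Disallow: /read/
Disallow: /edit/

User-agent: proximic
Disallow: /edit/
Disallow: /read/
Crawl-delay: 5

User-agent: grapeshot
Disallow: /edit/
Crawl-delay: 5

User-agent: *
Disallow: /editx/
Disallow: /reset/
Disallow: /resend/
Disallow: /manage/
Disallow: /login/

Sitemap: https://www.my-diary.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, "https://www.my-diary.org" + path)


print(allowed("GPTBot", "/read/123"))            # False: matched by its own group
print(allowed("Googlebot", "/super-secret/x"))   # False
print(allowed("examplebot", "/login/"))          # False: falls back to the * group
print(allowed("examplebot", "/edit/1"))          # True: /editx/ does not match /edit/
print(parser.crawl_delay("proximic"))            # 5
```

An agent with its own group is evaluated against that group only, so in this sketch `allowed("GPTBot", "/login/")` returns True even though `/login/` is disallowed for `*`. Individual crawlers may interpret the file differently from this parser.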