plainsite.org
robots.txt

Robots Exclusion Standard data for plainsite.org

Resource Scan

Scan Details

Site Domain plainsite.org
Base Domain plainsite.org
Scan Status Ok
Last Scan 2024-09-28T22:01:39+00:00
Next Scan 2024-10-05T22:01:39+00:00

Last Scan

Scanned 2024-09-28T22:01:39+00:00
URL https://plainsite.org/robots.txt
Domain IPs 94.100.20.132
Response IP 94.100.20.132
Found Yes
Hash 371714baf7417e243ac7da5e2e9f4b9e3c854b945ce104f08d5ee785b3618969
SimHash 991652024e51
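
A minimal sketch of how a fetched copy of the file could be compared against the Hash field. The scan does not name the hash algorithm; the 64-hex-digit length is consistent with SHA-256, and that is an assumption here. The helper names `sha256_hex` and `matches_scan` are hypothetical.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw file bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_scan(body: bytes, recorded_hash: str) -> bool:
    """Compare a fetched body against a recorded hash.

    Assumes the recorded value is a SHA-256 digest of the unmodified
    response body, which the scan page does not confirm.
    """
    return sha256_hex(body) == recorded_hash.strip().lower()
```

If the digests differ, the file may have changed since the scan, or the scanner may hash a normalized form of the file rather than the raw bytes.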

Groups

nutch

Rule Path
Disallow /

daum

Rule Path
Disallow /

linguee

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

baidu

Rule Path
Disallow /

yandex

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

slurp

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

bjorndata

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

yeti

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

googleother

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /
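
The groups above can be checked programmatically. The sketch below rebuilds a robots.txt body from the listed user agents, each with a single `Disallow: /` rule, and queries it with Python's standard `urllib.robotparser`. The rebuilt text is an approximation of the scanned file, not a copy of it, and `build_robots_txt` and `is_allowed` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# User agents taken from the Groups listing; each has one rule: Disallow /
BLOCKED_AGENTS = [
    "nutch", "daum", "linguee", "grapeshot", "baidu", "yandex", "mj12bot",
    "ahrefsbot", "slurp", "dotbot", "semrushbot", "semrushbot-sa", "blexbot",
    "sogou", "seekport crawler", "mauibot", "bjorndata", "amazonbot",
    "bytespider", "petalbot", "gptbot", "panscient.com", "yeti",
    "turnitinbot", "googleother", "claudebot", "dataforseobot",
    "python-requests", "oai-searchbot",
]


def build_robots_txt(agents):
    """Emit one group per agent, separated by blank lines."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


def is_allowed(user_agent, url, robots_txt):
    """Return True if the parsed rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(BLOCKED_AGENTS)
page = "https://plainsite.org/"
gptbot_allowed = is_allowed("gptbot", page, robots_txt)
# "examplebot" is a made-up agent that appears in no group
other_allowed = is_allowed("examplebot", page, robots_txt)
```

Because the listing shows no wildcard (`*`) group, an agent that matches none of the named groups is not restricted by these rules, at least as far as the scan data shows.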