learndesk.us
robots.txt

Robots Exclusion Standard data for learndesk.us

Resource Scan

Scan Details

Site Domain learndesk.us
Base Domain learndesk.us
Scan Status Ok
Last Scan 2024-10-19T17:03:36+00:00
Next Scan 2024-11-18T17:03:36+00:00

Last Scan

Scanned 2024-10-19T17:03:36+00:00
URL https://learndesk.us/robots.txt
Domain IPs 2001:4860:4802:32::15, 2001:4860:4802:34::15, 2001:4860:4802:36::15, 2001:4860:4802:38::15, 216.239.32.21, 216.239.34.21, 216.239.36.21, 216.239.38.21
Response IP 216.239.32.21
Found Yes
Hash 0d92b73ef7bc4c30697129a3c981c2bae2078b6cd79e5022ba619a789e0f3621
SimHash 0b9f2d87df95

Groups

googlebot
bingbot
slurp
msnbot
mediapartners-google*
googlebot-image
yahoo-mmcrawler
ia_archiver
naverbot
yeti
yandexbot
yandexdirect
yandexdirectdyn
yandexmedia
yandeximages
yadirectfetcher
yandexpagechecker

Rule Path
Disallow /p/
Disallow */write_review/*
Disallow *writeareview$
Disallow *news-updates$
Disallow *news-updates/amp$
Disallow *where-to-buy$
Disallow */questions$
Disallow */questions/amp$
Disallow */substitutes$
Disallow */substitutes/amp$
Disallow */reports$
Disallow *chunked*
Disallow *expert-content*
Disallow /redirect*
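
The Disallow paths above use the two common robots.txt wildcards: `*` (any run of characters) and a trailing `$` (end of the URL path). As a rough illustration of how such patterns are usually interpreted, here is a minimal Python sketch. The helper names `rule_to_regex` and `is_disallowed` and the sample paths are hypothetical, not part of the scan data, and the sketch ignores Allow rules and longest-match precedence because this group lists only Disallow rules.

```python
import re

# Disallow paths copied from the group listing above.
RULES = [
    "/p/", "*/write_review/*", "*writeareview$", "*news-updates$",
    "*news-updates/amp$", "*where-to-buy$", "*/questions$",
    "*/questions/amp$", "*/substitutes$", "*/substitutes/amp$",
    "*/reports$", "*chunked*", "*expert-content*", "/redirect*",
]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Without '$' the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces, then rejoin them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(path: str, rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

if __name__ == "__main__":
    for sample in ["/p/123", "/courses/python/questions", "/courses/python"]:
        print(sample, is_disallowed(sample, RULES))
```

Under this reading, `*/questions$` blocks a path that ends in `/questions` but not a deeper path such as `/courses/python/questions/42`, while `/p/` blocks everything under that prefix.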

wget

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

oozbot/setoozbot/oozbot/setoozbot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

yacybot

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

stress-agent

Rule Path
Disallow /

ultraseek

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /
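
The single-agent groups above each carry one rule, `Disallow /`, which blocks the whole site for that agent. A minimal sketch of checking this with Python's standard `urllib.robotparser` follows. The `FRAGMENT` text is reconstructed from two of the groups listed here and is an assumption, not the verbatim file; `blocked` and `examplebot` are hypothetical names. Note that the standard-library parser does prefix matching only, so it is not suitable for the wildcard rules in the first group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the listing (assumption: not the verbatim robots.txt).
FRAGMENT = """\
User-agent: wget
Disallow: /

User-agent: ahrefsbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(FRAGMENT.splitlines())

def blocked(agent: str, url: str = "https://learndesk.us/") -> bool:
    """Return True if the parsed rules forbid `agent` from fetching `url`."""
    return not parser.can_fetch(agent, url)

if __name__ == "__main__":
    for agent in ["wget", "AhrefsBot", "examplebot"]:
        print(agent, blocked(agent))
```

Agent matching in this parser is case-insensitive, so `AhrefsBot` falls under the `ahrefsbot` group; an agent with no matching group and no `*` group is allowed by default.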