expatistan.com
robots.txt

Robots Exclusion Standard data for expatistan.com

Resource Scan

Scan Details

Site Domain expatistan.com
Base Domain expatistan.com
Scan Status Ok
Last Scan 2024-11-15T23:35:35+00:00
Next Scan 2024-11-22T23:35:35+00:00

Last Scan

Scanned 2024-11-15T23:35:35+00:00
URL https://expatistan.com/robots.txt
Domain IPs 173.230.146.53
Response IP 173.230.146.53
Found Yes
Hash e4bdf857292c4b232c0b5852913218b7319380f2d05f97c18466e50b57e1002f
SimHash 101e5b454563

Groups

*

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget
Disallow /check/humanity
Disallow /es/check/humanity
Disallow /pt/check/humanity
Disallow /cost-of-living/press-room
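
As an illustration of how the wildcard group above is evaluated, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is reconstructed from the rules listed in this group only, and the agent name "examplebot" and the helper `allowed` are hypothetical. The standard-library parser matches rules by simple path prefix, which may differ from how a given crawler interprets them.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in this report (assumption: no other
# group applies to the hypothetical agent used below).
ROBOTS_TXT = """\
User-agent: *
Disallow: /ajax
Disallow: /share
Disallow: /api
Disallow: /widget
Disallow: /check/humanity
Disallow: /es/check/humanity
Disallow: /pt/check/humanity
Disallow: /cost-of-living/press-room
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="examplebot"):
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, "https://expatistan.com" + path)


for path in ("/cost-of-living/prague", "/ajax/x", "/check/humanity"):
    print(path, allowed(path))
```

Because matching is by prefix, a rule such as /api also covers any longer path that starts with the same characters.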

yandex

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 120

msnbot

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 4

bingbot

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 2
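
The crawl-delay records can be read programmatically. The sketch below assumes Python 3.6 or later, where urllib.robotparser exposes `crawl_delay`, and rebuilds only the msnbot, bingbot, and wildcard groups from this report. The agent name "examplebot" is hypothetical. Crawl-delay is a non-standard field, so whether a crawler honors it is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the msnbot, bingbot, and "*" groups in this report.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /ajax
Disallow: /share
Disallow: /api
Disallow: /widget
Crawl-delay: 4

User-agent: bingbot
Disallow: /ajax
Disallow: /share
Disallow: /api
Disallow: /widget
Crawl-delay: 2

User-agent: *
Disallow: /ajax
Disallow: /share
Disallow: /api
Disallow: /widget
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay in seconds for the matching group,
# or None when that group has no Crawl-delay record.
delays = {
    agent: parser.crawl_delay(agent)
    for agent in ("msnbot", "bingbot", "examplebot")
}
print(delays)
```

An agent that falls through to the wildcard group gets None here, because that group carries no crawl-delay record.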

sogou

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 40

sogou spider

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 40

sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm

Product Comment
sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm 07)
Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 40

petalbot

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 20

proximic

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 20

claudebot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

zoombot

Rule Path
Disallow /

criteobot/0.1

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

amazonbot

Product Comment
amazonbot Amazon's user agent, on 26.04.2024 they were way too aggressive with their crawling
Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

slurp

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 20

mail.ru_bot

Rule Path
Disallow /ajax
Disallow /share
Disallow /api
Disallow /widget

Other Records

Field Value
crawl-delay 40

Other Records

Field Value
sitemap https://www.expatistan.com/sitemapindex.xml
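
The sitemap record can be extracted the same way. This is a minimal sketch assuming Python 3.8 or later, where urllib.robotparser provides `site_maps`. The input is a reduced robots.txt containing only one group and the sitemap line from this report.

```python
from urllib.robotparser import RobotFileParser

# Reduced robots.txt: one group plus the sitemap record from this report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ajax

Sitemap: https://www.expatistan.com/sitemapindex.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap record is present,
# so fall back to an empty list.
sitemaps = parser.site_maps() or []
print(sitemaps)
```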

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • Disallow: /ajax
  • Disallow: /share
  • Disallow: /api
  • Disallow: /widget
  • Not used by Yandex since 2018 # Crawl-delay: 40