Robots Exclusion Standard data for jhb.se

Resource Scan

Scan Details

Site Domain jhb.se
Base Domain jhb.se
Scan Status Ok
Last Scan 2024-11-15T13:59:00+00:00
Next Scan 2024-12-15T13:59:00+00:00

Last Scan

Scanned 2024-11-15T13:59:00+00:00
URL https://jhb.se/robots.txt
Domain IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Response IP 151.101.194.132
Found Yes
Hash 2e220b48e2ef0f98dbc83a1ed02f0f1c7f323157b55ecf300faaabb4b9cab335
SimHash b6566f16cdfa

Groups

*

Rule Path
Disallow /litium
Disallow /kassa

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

moget
ichiro

Rule Path
Disallow /

naverbot
yeti

Rule Path
Disallow /

baiduspider
baiduspider-video
baiduspider-image

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

mauibot (crawler.feedback+wc@gmail.com)

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

bingbot

Rule Path
Disallow /api/
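
The groups above can be checked against concrete URLs with Python's standard-library parser. The following is a minimal sketch, not the site's actual file: ROBOTS_TXT is a hand-reconstructed subset of the groups listed in this scan, the helper names build_parser and allowed are hypothetical, and "Googlebot" and the path /produkter are illustrative stand-ins for an unlisted crawler and an unrestricted page.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed by hand from the groups in this scan (subset only).
# The exact text of https://jhb.se/robots.txt is not reproduced here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /litium
Disallow: /kassa

User-agent: mj12bot
Disallow: /

User-agent: baiduspider
User-agent: baiduspider-video
User-agent: baiduspider-image
Disallow: /

User-agent: bingbot
Disallow: /api/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on jhb.se under the parsed rules."""
    return parser.can_fetch(agent, "https://jhb.se" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    checks = [
        ("Googlebot", "/kassa"),        # falls back to the * group
        ("Googlebot", "/produkter"),    # hypothetical unrestricted path
        ("MJ12bot", "/"),               # blocked site-wide
        ("Baiduspider", "/"),           # blocked site-wide
        ("bingbot", "/api/products"),   # bingbot has its own group
        ("bingbot", "/kassa"),          # the * group does not apply to bingbot
    ]
    for agent, path in checks:
        print(f"{agent:12} {path:15} allowed={allowed(parser, agent, path)}")
```

Because a crawler follows only the group that matches it, bingbot is bound by its own /api/ rule and not by the /litium and /kassa rules in the * group. Other parsers may match user-agent tokens differently (for example the versioned token dotbot/1.0), so results for those groups are best verified with the crawler's own documentation.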

Comments

  • For all robots
  • Block access to specific groups of pages
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot
  • Block Chinese, Korean and Russian bots
  • jhb.se