pythonjobs.com
robots.txt

Robots Exclusion Standard data for pythonjobs.com

Resource Scan

Scan Details

Site Domain pythonjobs.com
Base Domain pythonjobs.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-09-15T01:03:02+00:00
Next Scan 2024-12-14T01:03:02+00:00

Last Successful Scan

Scanned 2022-11-17T07:02:30+00:00
URL https://pythonjobs.com/robots.txt
Response IP 104.17.88.204, 104.17.89.204, 104.17.86.204, 104.17.85.204, 104.17.87.204
Found Yes
Hash d67c8391cbd88f6fb01b5365663d05100a13002d26761c8da0080ab648c049c5
SimHash a79dcc076771

Groups

*

Rule Path
Disallow /jobs/*/tracker
Disallow /jobs/*/preview
Disallow /jobs/*/applicants
Disallow /jobs/*/manage
Disallow /messages/*
Disallow /applicants/new
Disallow /backfills/latest_jobs
Disallow /auth/*
Disallow /clk/*
Disallow /employers/*
Disallow /c/*
Disallow /s/*
Disallow /e/*
Disallow /g/*
Disallow /n/*
Disallow /Salaries/*
Disallow /*?*page=
Disallow /*?*lat=
Disallow /*?*long=
Disallow /*?*sort=
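
Several of the paths above use `*` as a wildcard. The sketch below shows one way such patterns could be evaluated, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and every pattern is matched from the start of the path plus query string. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and only a subset of the listed rules is included.

```python
import re
from typing import Iterable


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex matched from the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches the path (query included)."""
    return any(rule_to_regex(p).match(path) for p in patterns)


# A subset of the Disallow paths listed for the `*` group.
RULES = ["/jobs/*/tracker", "/messages/*", "/*?*page=", "/*?*sort="]

for candidate in ["/jobs/123/tracker", "/jobs/123", "/search?q=python&page=2"]:
    print(candidate, is_disallowed(candidate, RULES))
```

This sketch ignores Allow rules and longest-match precedence, which a full implementation would need to handle.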

yandeximages

Rule Path
Disallow /

yandex

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

pcore-http

Rule Path
Disallow /

bubing

Rule Path
Disallow /

companybook-crawler

Rule Path
Disallow /

wotbox/2.01

Rule Path
Disallow /

ccbot/2.0

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

ebibot

Rule Path
Disallow /

pcore-http/v0.24.5

Rule Path
Disallow /

testitest1

Rule Path
Disallow /

vegi bot

Rule Path
Disallow /

istellabot/t.1

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

ltx71 - (http://ltx71.com/)

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /
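
To illustrate how a crawler might pick its group from the list above, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from a few of the groups shown here and is an assumption, since the original file body is not reproduced in this report. `ExampleBot` is a hypothetical user agent. The wildcard rules are left out because the standard-library parser matches paths by prefix.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of part of the file, built from the groups listed above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /applicants/new
Disallow: /backfills/latest_jobs

User-agent: ccbot
Disallow: /

User-agent: mj12bot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://pythonjobs.com"

# A named group takes precedence over the `*` group for the matching agent.
print(parser.can_fetch("CCBot/2.0", BASE + "/"))
# Agents without their own group fall back to the `*` rules.
print(parser.can_fetch("ExampleBot", BASE + "/jobs/1"))
print(parser.can_fetch("ExampleBot", BASE + "/applicants/new"))
```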

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • ZR integration blocks
  • block search query params