jobs.newscientist.com
robots.txt

Robots Exclusion Standard data for jobs.newscientist.com

Resource Scan

Scan Details

Site Domain jobs.newscientist.com
Base Domain newscientist.com
Scan Status Ok
Last Scan 2024-06-16T20:32:53+00:00
Next Scan 2024-07-16T20:32:53+00:00

Last Scan

Scanned 2024-06-16T20:32:53+00:00
URL https://jobs.newscientist.com/robots.txt
Redirect https://www.newscientist.com/nsj/robots.txt
Redirect Domain www.newscientist.com
Redirect Base newscientist.com
Domain IPs 52.31.116.32
Redirect IPs 151.101.130.217, 151.101.194.217, 151.101.2.217, 151.101.66.217
Response IP 199.232.46.217
Found Yes
Hash 278e1bd9aa7bbaf6601b7a5d86859700c3d179b28bb8fd0d9c00ab045069a074
SimHash 67d752b84d84
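
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so treat that as an assumption. A minimal sketch of how such a fingerprint could be computed to detect changes between scans (the function name `content_hash` is hypothetical):

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256, chosen only because the reported hash
    is 64 hex digits long; the scanner's real method is not stated.
    """
    return hashlib.sha256(body).hexdigest()


# Two scans with identical bytes give identical digests, so a changed
# digest signals that the file was edited between scans.
previous = content_hash(b"User-agent: *\nDisallow: /nsj/document/\n")
current = content_hash(b"User-agent: *\nDisallow: /nsj/document/\n")
unchanged = previous == current
```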

Groups

*

Rule Path
Disallow /nsj/session-img/
Disallow /nsj/invalid-request/
Disallow /nsj/document/
Disallow /nsj/analytics/
Disallow /nsj/apply-profile/
Disallow */nsj/searchjobs/*
Disallow */nsj/jobsrss/*
Disallow /nsj/jobsrss/*
Disallow */nsj/jbequicksignup/*
Disallow */nsj/emailjob/*
Disallow /nsj/your-jobs*
Disallow /nsj/external-redirect-registration/*
Disallow */nsj/previewjob/*
Disallow /nsj/en-us/session-img/
Disallow /nsj/en-us/invalid-request/
Disallow /nsj/en-us/document/
Disallow /nsj/en-us/analytics/
Disallow /nsj/en-us/apply-profile/
Disallow */nsj/en-us/searchjobs/*
Disallow */nsj/en-us/jobsrss/*
Disallow /nsj/en-us/jobsrss/*
Disallow */nsj/en-us/jbequicksignup/*
Disallow */nsj/en-us/emailjob/*
Disallow /nsj/en-us/your-jobs*
Disallow /nsj/en-us/external-redirect-registration/*
Disallow */nsj/en-us/previewjob/*
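
Several of the paths above use `*` wildcards, including at the start of the pattern. The sketch below shows one way such rules could be tested against a URL path, treating `*` as "any run of characters", a trailing `$` as an end anchor, and everything else as a literal prefix. The helper names and the abridged `RULES` list are illustrative only, and real crawlers may handle patterns that do not begin with `/` differently.

```python
import re

# Abridged from the group above; not the full rule set.
RULES = [
    "/nsj/document/",
    "*/nsj/searchjobs/*",
    "*/nsj/emailjob/*",
    "/nsj/your-jobs*",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, rules: list[str]) -> bool:
    """True if any rule matches the path from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)
```

With these rules, `is_disallowed("/nsj/searchjobs/physics", RULES)` is true, while a path such as `/nsj/job/123/` is not matched by any of the listed patterns.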

alexibot
alexibot
aqua_products
b2w/0.1
backdoorbot/1.0
baiduspider
blowfish/1.0
bookmark search tool
botalot
botrighthere
builtbottough
bullseye/1.0
bunnyslippers
ccbot
cheesebot
cherrypicker
cherrypickerelite/1.0
cherrypickerse/1.0
copernic
copyrightcheck
cosmos
crescent internet toolpak http ole control v.1.0
crescent
daumoa
dittospyder
dotbot
emailcollector
emailsiphon
emailwolf
erocrawler
exabot
extractorpro
ezooms
fairad client
flaming attackbot
foobot
gaisbot
getright/4.2
gigabot
harvest/1.5
hloader
httplib
httrack 3.0
humanlinks
infonavirobot
iron33/1.0.2
jennybot
kenjin spider
keyword density/0.9
larbin
lexibot
libweb/clshttp
linkextractorpro
linkscan/8.1a unix
linkwalker
lnspiderguy
lwp-trivial/1.34
lwp-trivial
mata hari
microsoft url control - 5.01.4511
microsoft url control - 6.00.8169
microsoft url control
miixpc/4.2
miixpc
mister pix
mj12bot/2.1
moget/2.1
moget
mozilla/4.0 (compatible; bullseye; windows 95)
msiecrawler
netants
nicerspro
nothing
nutch
offline explorer
openbot
openfind data gatherer
openfind
oracle ultra search
perman
propowerbot/2.14
prowebwalker
psbot
purebot
python-urllib
queryn metasearch
radiation retriever 1.1
repomonkey bait & tackle/v1.01
repomonkey
rma
robozilla
rogerbot
screaming frog seo spider
scrubby
searchpreview
sitesnagger
slurp
slurp/si
spankbot
spanner
spbot
suzuran
szukacz/1.4
technoratibot/8.1
teleport
teleportpro
telesoft
the intraformant
thenomad
tighttwatbot
tocrawl/urldispatcher
true_robot/1.0
true_robot
turingos
turnitinbot/1.5
turnitinbot
updownerbot
url control
url_spider_pro
urly warning
vci webviewer vci webviewer win32
vci
voilabot
web image collector
webauto
webbandit/3.50
webbandit
webcapture 2.0
webcopier v.2.2
webcopier v3.2a
webcopier
webenhancer
websauger
website quester
webster pro
webstripper
webzip/4.0
webzip/4.21
webzip/5.0
webzip
wget/1.5.3
wget/1.6
wget
wget
www-collector-e
xenu's link sleuth 1.1c
xenu's
yahoo-mmcrawler
yahoo-blogs/v3.9
zeus 32297 webster pro v2.9 win32
zeus link scout
zeus

Rule Path
Disallow /

speedyspider
asterias
twiceler
scoutjet
ia_archiver
yandexantivirus/2.0
yandexbot/3.0
yandeximageresizer/2.0
yandeximages/3.0
yandexmedia/3.0
yandexpagechecker/1.0
yandexwebmaster/2.0
yandexzakladki/3.0
bingbot
msnbot
twitterbot
mediapartners-google
adsbot-google
googlebot
googlebot-mobile
googlebot-image

Rule Path
Disallow /*?*v=
Disallow /*?*LinkSource=
Disallow /*?*Sector=
Disallow /*?*intcmp=
Disallow /logon
Disallow /newalert
Disallow /remindme
Disallow /profile
Disallow /session-img
Disallow /invalid-request
Disallow /document
Disallow /searchjobs
Disallow /en-gb/logon
Disallow /en-gb/newalert
Disallow /en-gb/remindme
Disallow /en-gb/profile
Disallow /en-gb/session-img
Disallow /en-gb/invalid-request
Disallow /en-gb/document
Disallow /en-gb/searchjobs
Disallow /jobsjson/
Disallow /nsj/jobsjson/
Disallow /nsj/logon/
Disallow /nsj/newalert/
Disallow /nsj/analytics/
Disallow /nsj/apply-profile/
Disallow /nsj/document/
Disallow /nsj/emailjob/
Disallow /nsj/external-redirect-registration/
Disallow /nsj/invalid-request/
Disallow /nsj/jbequicksignup/
Disallow /nsj/jobsrss/
Disallow /nsj/previewjob/
Disallow /nsj/searchjobs/
Disallow /nsj/session-img/
Disallow /nsj/your-jobs/
Disallow /nsj//profile/
Disallow /nsj/remindme/
Disallow /nsj/searchjobs/

*

Rule Path
Disallow /
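
The file contains two `*` groups plus groups that name specific crawlers, so which rules apply depends on how a crawler picks its group. The sketch below assumes the common convention that a crawler uses the groups naming it and falls back to the combined `*` groups otherwise. The `GROUPS` data is abridged from this report, the function name is hypothetical, and the substring match on the user-agent string is a simplification. Some parsers instead keep only the first `*` group, so results can differ between implementations.

```python
# (user agents, disallowed paths), abridged from the groups in this report.
GROUPS = [
    (["*"], ["/nsj/session-img/", "/nsj/document/"]),
    (["ccbot", "wget"], ["/"]),
    (["googlebot", "bingbot"], ["/logon", "/searchjobs"]),
    (["*"], ["/"]),
]


def select_rules(user_agent: str, groups) -> list[str]:
    """Return the Disallow paths that apply to the given user agent."""
    ua = user_agent.lower()
    specific, fallback = [], []
    matched = False
    for agents, rules in groups:
        # Simplification: case-insensitive substring match on the name.
        if any(name != "*" and name in ua for name in agents):
            matched = True
            specific.extend(rules)
        elif "*" in agents:
            fallback.extend(rules)
    return specific if matched else fallback
```

Under these assumptions, a crawler identifying as Googlebot gets only the group that names it, while an unlisted crawler receives the merged `*` rules, which include `Disallow /`.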

Other Records

Field Value
sitemap https://www.newscientist.com/nsj/sitemapindex.xml

Comments

  • Robot exclusion file
  • The following pages require registration and login
  • badbots
  • goodbots
  • parameters
  • directories

Warnings

  • 1 invalid line.