my.contentpass.net
robots.txt

Robots Exclusion Standard data for my.contentpass.net

Resource Scan

Scan Details

Site Domain my.contentpass.net
Base Domain contentpass.net
Scan Status Ok
Last Scan 2024-09-23T08:17:52+00:00
Next Scan 2024-10-07T08:17:52+00:00

Last Scan

Scanned 2024-09-23T08:17:52+00:00
URL https://my.contentpass.net/robots.txt
Domain IPs 51.91.189.161
Response IP 51.91.189.161
Found Yes
Hash 577f0649e1abfe9a0b4acfe493278f23c83b5db3f850607419bb1f740f1f1450
SimHash 2196fc8147df

Groups

baiduspider

Rule Path
Disallow /

boitho.com-dc

Rule Path
Disallow /

busiverse

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /
Allow /favicon.ico

heritrix

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mirago-test-robot (http://www.miragorobot.com)

Rule Path
Disallow /

nimblecrawler

Rule Path
Disallow /

psbot

Rule Path
Disallow /

sirketce

Rule Path
Disallow /

seekbot

Rule Path
Disallow /

semanticdiscovery

Rule Path
Disallow /

sogou

Rule Path
Disallow /

soso

Rule Path
Disallow /

sosoimagespider

Rule Path
Disallow /

tineye

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

webalta

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

zermelo

Rule Path
Disallow /

*

Rule Path
Disallow
Disallow /dashboard/
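
How the groups above would apply to a request can be sketched in a few lines. This is a minimal illustration, not the scanner's or any crawler's actual logic. It assumes longest-match evaluation in the style of RFC 9309 (the most specific matching path wins, Allow wins a tie, and an empty Disallow matches nothing), exact lowercase matching of the user-agent name, and no wildcard support. Only three of the listed groups are copied in; the names GROUPS, is_allowed and examplebot and the path /pricing are hypothetical.

```python
# Rules copied from three of the groups listed above (assumption: the
# remaining groups behave like "baiduspider", i.e. a single "Disallow /").
GROUPS = {
    "baiduspider": [("disallow", "/")],
    "googlebot-image": [("disallow", "/"), ("allow", "/favicon.ico")],
    "*": [("disallow", ""), ("disallow", "/dashboard/")],
}


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if `path` may be fetched by `user_agent`.

    Simplified longest-match evaluation: plain prefix matching only,
    no '*' or '$' pattern support.
    """
    # Fall back to the catch-all group when no named group matches.
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    best_len, allowed = -1, True  # no matching rule means allowed
    for kind, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty path matches nothing
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len, allowed = len(prefix), kind == "allow"
    return allowed


if __name__ == "__main__":
    for agent, path in [
        ("baiduspider", "/"),
        ("Googlebot-Image", "/favicon.ico"),
        ("examplebot", "/dashboard/settings"),
        ("examplebot", "/pricing"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions the Allow /favicon.ico rule overrides Disallow / for googlebot-image because it is the longer match. Parsers that apply the first matching rule in file order instead would block that path, so results can differ between implementations.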

Other Records

Field Value
crawl-delay 5
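
The crawl-delay record is not part of RFC 9309, and crawlers differ in whether they honor it. A crawler that chooses to honor it could space its requests as in the sketch below, which assumes the value is a number of seconds between consecutive requests. The Throttle class and its parameters are hypothetical; the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time

CRAWL_DELAY_SECONDS = 5  # value from the crawl-delay record above


class Throttle:
    """Block so that consecutive wait() calls are at least `delay` seconds apart."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous request was released

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(CRAWL_DELAY_SECONDS)
    for path in ["/", "/favicon.ico"]:
        throttle.wait()  # the second iteration pauses for about 5 seconds
        print("would fetch", path)
```

A monotonic clock is used so that adjustments to the system time do not shorten or lengthen the pause.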

Comments

  • Disallow Baidu Bot (Japanese)
  • Disallow Boitho dc Bot (Norway)
  • Disallow Busiverse Bot (Turkey Sirketce/Busiverse )
  • Disallow CazoodleBot - from University of Illinois
  • Disallow Exabot Bot - Exalead
  • Disallow Google Image Bot
  • Disallow heritrix Bot - from Yell.Com
  • Disallow IRLbot - IRL Texas AM research bot
  • Disallow Jyxobot - Czech Webcrawler for Jyxo
  • Disallow Majestic12.co.uk
  • Disallow Mirago.com
  • Disallow NimbleCrawler (http://www.webmasterworld.com/forum93/858.htm)
  • Disallow psbot spidering of images and hub
  • Disallow Sirketce Bot (Turkey Sirketce/Busiverse )
  • Disallow Seekbot - http://www.seekport.co.uk/seekbot/
  • Disallow semanticdiscovery - from Southern Utah University (Computer Science Dept.)
  • Disallow Sogou - Chinese Search Engine
  • Disallow SoSo - Chinese Search Engine
  • Disallow SoSoImageSpider - Chinese picture Search Engine
  • Disallow TinEye - Image trawler Search Engine
  • Disallow TurnITin - "This robot collects content from the Internet for the sole purpose of helping educational institutions prevent plagiarism"
  • Disallow Twiceler - Cuill (also Barred IPs on firewall)
  • Disallow Voilabot Bot - France Telecom
  • Disallow WebAlta Bot - Russian
  • Disallow Yahoo Image Bot
  • Disallow YodaoBot - Chinese Search Engine
  • Disallow zermelo - Bot du Jour from Amazon - may need to block IP range
  • By default all bots are allowed with restriction to dashboard