contentpass.net
robots.txt

Robots Exclusion Standard data for contentpass.net

Resource Scan

Scan Details

Site Domain contentpass.net
Base Domain contentpass.net
Scan Status Ok
Last Scan 2024-06-04T06:27:21+00:00
Next Scan 2024-06-18T06:27:21+00:00

Last Scan

Scanned 2024-06-04T06:27:21+00:00
URL https://contentpass.net/robots.txt
Redirect https://www.contentpass.net/robots.txt
Redirect Domain www.contentpass.net
Redirect Base contentpass.net
Domain IPs 51.91.189.161
Redirect IPs 169.150.247.39, 2400:52e0:1e00::1079:1
Response IP 169.150.247.36
Found Yes
Hash c87833ea868bfb8c1a68bbc0cc188e5752aca24f7b0f01d71a12cda4d114bf2b
SimHash 299efc8141df

Groups

baiduspider

Rule Path
Disallow /

boitho.com-dc

Rule Path
Disallow /

busiverse

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mirago-test-robot (http://www.miragorobot.com)

Rule Path
Disallow /

nimblecrawler

Rule Path
Disallow /

psbot

Rule Path
Disallow /

sirketce

Rule Path
Disallow /

seekbot

Rule Path
Disallow /

semanticdiscovery

Rule Path
Disallow /

sogou

Rule Path
Disallow /

soso

Rule Path
Disallow /

sosoimagespider

Rule Path
Disallow /

tineye

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

webalta

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

zermelo

Rule Path
Disallow /

*

Rule Path
Disallow
Disallow /dashboard/
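
As a rough illustration of how the groups above might be evaluated, the sketch below transcribes a few of them into a Python dictionary and applies longest-match precedence, with an empty Disallow path treated as matching nothing. The names GROUPS, select_group and is_allowed are hypothetical, user-agent matching is simplified to a case-insensitive substring test, and wildcard characters in paths are not handled. Real crawlers may resolve these rules differently.

```python
# Rules transcribed from the groups listed above (only a few blocked bots shown).
# Each rule is (kind, path); an empty path mirrors the bare "Disallow" row.
GROUPS = {
    "baiduspider": [("disallow", "/")],
    "googlebot-image": [("disallow", "/")],
    "mj12bot": [("disallow", "/")],
    "*": [("disallow", ""), ("disallow", "/dashboard/")],
}


def select_group(user_agent: str) -> list:
    """Return the rules of the most specific group named in the user agent.

    Assumption: a group applies when its token occurs anywhere in the
    lower-cased user-agent string; otherwise the '*' group is used.
    """
    ua = user_agent.lower()
    matches = [token for token in GROUPS if token != "*" and token in ua]
    if matches:
        return GROUPS[max(matches, key=len)]
    return GROUPS["*"]


def is_allowed(user_agent: str, path: str) -> bool:
    """Apply longest-match precedence; with no matching rule, allow."""
    best_len, allowed = -1, True
    for kind, rule_path in select_group(user_agent):
        # An empty rule path matches nothing, so it never restricts access.
        if not rule_path or not path.startswith(rule_path):
            continue
        longer = len(rule_path) > best_len
        tie_for_allow = len(rule_path) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len, allowed = len(rule_path), kind == "allow"
    return allowed
```

Under these assumptions, a listed bot such as baiduspider is refused every path, while an unlisted crawler is refused only paths beginning with /dashboard/.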

Other Records

Field Value
crawl-delay 5
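
A crawler that chooses to honor the crawl-delay record could space its requests as in the sketch below. The value is assumed to be in seconds, which is the usual convention for this non-standard field, and the name fetch_times is hypothetical. Because the scan lists the record outside any group, which crawlers it was meant for is not clear from this data.

```python
CRAWL_DELAY_SECONDS = 5.0  # from the crawl-delay record; unit assumed to be seconds


def fetch_times(start: float, count: int,
                delay: float = CRAWL_DELAY_SECONDS) -> list[float]:
    """Return the earliest time (in seconds) at which each request may be sent.

    The first request goes out at `start`; each later one waits a full
    `delay` after the previous request.
    """
    return [start + i * delay for i in range(count)]
```

For example, fetch_times(0.0, 3) spaces three requests five seconds apart starting at time zero.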

Comments

  • Disallow Baidu Bot (Japanese)
  • Disallow Boitho dc Bot (Norway)
  • Disallow Busiverse Bot (Turkey Sirketce/Busiverse )
  • Disallow CazoodleBot - from University of Illinois
  • Disallow Exabot Bot - Exalead
  • Disallow Google Image Bot
  • Disallow heritrix Bot - from Yell.Com
  • Disallow IRLbot - IRL Texas AM research bot
  • Disallow Jyxobot - Czech Webcrawler for Jyxo
  • Disallow Majestic12.co.uk
  • Disallow Mirago.com
  • Disallow NimbleCrawler (http://www.webmasterworld.com/forum93/858.htm)
  • Disallow psbot spidering of images and hub
  • Disallow Sirketce Bot (Turkey Sirketce/Busiverse )
  • Disallow Seekbot - http://www.seekport.co.uk/seekbot/
  • Disallow semanticdiscovery - from Southern Utah University (Computer Science Dept.)
  • Disallow Sogou - Chinese Search Engine
  • Disallow SoSo - Chinese Search Engine
  • Disallow SoSoImageSpider - Chinese picture Search Engine
  • Disallow TinEye - Image trawler Search Engine
  • Disallow TurnITin - "This robot collects content from the Internet for the sole purpose of helping educational institutions prevent plagiarism"
  • Disallow Twiceler - Cuill (also Barred IPs on firewall)
  • Disallow Voilabot Bot - France Telecom
  • Disallow WebAlta Bot - Russian
  • Disallow Yahoo Image Bot
  • Disallow YodaoBot - Chinese Search Engine
  • Disallow zermelo - Bot du Jour from Amazon - may need to block IP range
  • By default all bots are allowed with restriction to dashboard