perseus.org
robots.txt

Robots Exclusion Standard data for perseus.org

Resource Scan

Scan Details

Site Domain perseus.org
Base Domain perseus.org
Scan Status Ok
Last Scan 2024-05-15T09:46:47+00:00
Next Scan 2024-06-14T09:46:47+00:00
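
The two timestamps above imply a rescan interval, which can be checked with a short calculation. This is a minimal sketch: the variable names are illustrative, and a fixed interval is only an inference from these two values, not something the report states.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details fields above.
last_scan = datetime.fromisoformat("2024-05-15T09:46:47+00:00")
next_scan = datetime.fromisoformat("2024-06-14T09:46:47+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan

print(interval.days)
```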

Last Scan

Scanned 2024-05-15T09:46:47+00:00
URL http://www.perseus.org/robots.txt
Redirect http://www.perseus.tufts.edu/robots.txt
Redirect Domain www.perseus.tufts.edu
Redirect Base tufts.edu
Redirect IPs 130.64.212.105
Response IP 130.64.212.105
Found Yes
Hash 1c54f3dd6c603ffa067f5749f5953be2eb39fa3b119299361b587a0480d72c23
SimHash 350462e1d586

Groups

*

Rule Path
Disallow /manual/
Disallow /manual-1.3/
Disallow /manual-2.0/
Disallow /manual-2.2/
Disallow /addon-modules/
Disallow /doc/
Disallow /images/
Disallow /hopper/searchresults
Disallow /hopper/morph
Disallow /cgi-bin/
Disallow /Help/
Disallow /cache/
Disallow /Articles/
Disallow /PR/
Disallow /Texts/
Disallow /index.html
Disallow /Texts.html
Disallow /art%26arch.html
Disallow /PerseusInfo.html
Disallow /startingPoints.html
Disallow /searches.html
Disallow /lexica.html
Disallow /newlatin.html
Disallow /copyright.html
Disallow /admin/

stress-agent

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

pcore-http/v0.25.0

Rule Path
Disallow /

pcore-http

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

freefind

Rule Path
Disallow /
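
One way to see how these groups would affect a crawler is to feed them to Python's standard `urllib.robotparser`. The sketch below makes several assumptions: the `ROBOTS_TXT` string is a partial reconstruction from the groups listed above (the original file text is not shown here), the `is_allowed` helper and the `ExampleBot` agent name are hypothetical, and the host is taken from the Redirect field. Python's matching rules may also differ in detail from those of any particular crawler.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the groups listed above (assumption: the real
# file may differ in casing, ordering, and comments).
ROBOTS_TXT = """\
User-agent: *
Disallow: /manual/
Disallow: /hopper/searchresults
Disallow: /hopper/morph
Disallow: /cgi-bin/
Disallow: /admin/

User-agent: baiduspider
Disallow: /

User-agent: semrushbot
Disallow: /
"""

# Host taken from the Redirect field of the scan.
BASE = "http://www.perseus.tufts.edu"

# Parse the text directly instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: True if `agent` may fetch `path` under ROBOTS_TXT."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for agent, path in [
        ("ExampleBot", "/hopper/text"),
        ("ExampleBot", "/hopper/morph"),
        ("SemrushBot", "/hopper/text"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

An agent with its own group, such as semrushbot, is matched against that group only. All other agents fall back to the `*` group, where each Disallow path is treated as a prefix.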

Comments

  • exclude help system from robots
  • exclude search results
  • exclude morph
  • exclude old version pages
  • but allow htdig to index our doc-tree
  • User-agent: htdig
  • Disallow:
  • disallow stress test
  • Baiduspider
  • Disallow Pcore-HTTP
  • Disallow Pcore-HTTP
  • User-agent: Yahoo! Slurp
  • Disallow: /
  • Disallow Bing -mmacdo02 3-20-2018
  • User-agent: Bingbot
  • Disallow: /
  • Disallow Googlebot
  • User-agent: Googlebot
  • Disallow: /