Robots Exclusion Standard data for cyber.harvard.edu

Resource Scan

Scan Details

Site Domain cyber.harvard.edu
Base Domain harvard.edu
Scan Status Ok
Last Scan 2024-06-09T22:03:25+00:00
Next Scan 2024-07-09T22:03:25+00:00

Last Scan

Scanned 2024-06-09T22:03:25+00:00
URL https://cyber.harvard.edu/robots.txt
Domain IPs 128.103.64.74
Response IP 128.103.64.74
Found Yes
Hash b22208cdfac4bd57e99250c8424ab072d4366919df34e86f2b0973f8c7f12f09
SimHash 2f0880378fdb
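
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the report does not state which algorithm was used or exactly which bytes were hashed. The following is a minimal sketch, assuming SHA-256 over the raw response body, of how a locally saved copy of the file could be compared against a reported hash. The helper names `sha256_hex` and `matches_report` are hypothetical and not part of the scanning tool.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported_hash: str) -> bool:
    """Compare a body's digest with a hash string copied from a report.

    Assumption: the report hashes the unmodified response body with
    SHA-256. If the scanner normalizes line endings or encoding first,
    the digests will differ even for an unchanged file.
    """
    return sha256_hex(body) == reported_hash.strip().lower()
```

A mismatch does not necessarily mean the file changed; it may only mean the scanner hashes something other than the raw body.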

Groups

*

Rule Path
Disallow /team
Disallow /lists
Disallow /msdoj/discuss/
Disallow /zittrain/
Disallow /cite/
Disallow /opengovernment
Disallow /blogs

*

Rule Path
Disallow /blogsupport
Disallow /brooklaw
Disallow /cyberlaw2005/wiki
Disallow /cyberone/wiki
Disallow /h2owiki
Disallow /iptheory
Disallow /jamaicavoices
Disallow /netizenship
Disallow /ocs_global
Disallow /ocs_intranet
Disallow /oni-RAs
Disallow /practical_lawyering
Disallow /publicmediaforge
Disallow /techwiki

htdig/3.1.5 (wendy@eon.law.harvard.edu)

Rule Path
Disallow
Disallow /msdoj/discuss/

Comments

  • Spiders + MediaWiki = Bad
  • allow cyber-search to get there
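
The Disallow rules listed under Groups can be checked against candidate URLs with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt body from the two `*` rule tables in this report; the original file's exact layout, comments, and ordering are not shown here, so the text is a reconstruction. The two `*` groups are merged into one, as an assumption, because Python's parser appears to honor only the first `*` group it encounters. The htdig group is left out, and the `ExampleBot` agent name and the `allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the rule tables in this report (assumption: the
# two "*" groups are merged into a single group for parsing).
ROBOTS_TXT = """\
User-agent: *
Disallow: /team
Disallow: /lists
Disallow: /msdoj/discuss/
Disallow: /zittrain/
Disallow: /cite/
Disallow: /opengovernment
Disallow: /blogs
Disallow: /blogsupport
Disallow: /brooklaw
Disallow: /cyberlaw2005/wiki
Disallow: /cyberone/wiki
Disallow: /h2owiki
Disallow: /iptheory
Disallow: /jamaicavoices
Disallow: /netizenship
Disallow: /ocs_global
Disallow: /ocs_intranet
Disallow: /oni-RAs
Disallow: /practical_lawyering
Disallow: /publicmediaforge
Disallow: /techwiki
"""

BASE = "https://cyber.harvard.edu"

# Parse the reconstructed text directly; no network request is made.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch BASE + path under these rules."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for path in ("/team", "/msdoj/discuss/1", "/msdoj/", "/about"):
        print(path, allowed(path))
```

Rules are matched as path prefixes, so `/msdoj/discuss/` blocks everything beneath it while `/msdoj/` itself stays fetchable, and `/blogs` already covers `/blogsupport`.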