law.harvard.edu
robots.txt

Robots Exclusion Standard data for law.harvard.edu

Resource Scan

Scan Details

Site Domain law.harvard.edu
Base Domain harvard.edu
Scan Status Ok
Last Scan 2024-08-29T13:07:16+00:00
Next Scan 2024-09-28T13:07:16+00:00

Last Scan

Scanned 2024-08-29T13:07:16+00:00
URL https://law.harvard.edu/robots.txt
Redirect http://law.harvard.edu/robots.txt
Domain IPs 140.247.200.140
Response IP 140.247.200.140
Found Yes
Hash 032bceabdcbc6561c235e951347ceb69cbc219486f16e4ee45ef25260fb6f6d0
SimHash 412a38e82711

Groups

*

Rule Path
Disallow /cgi-bin
Disallow /perl
Disallow /webevent.cgi
Disallow /events
Disallow /incl
Disallow /bb/
Disallow /xml/ns/
Disallow /login/
Disallow /students/dean/resources/
Disallow /srv/www/gapps
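
As a rough illustration of how a crawler might apply the group above, the following sketch feeds the listed rules to Python's standard urllib.robotparser. The RULES text is reconstructed from the Rule/Path table in standard robots.txt syntax and is not a verbatim copy of the scanned file; the helper names build_parser and is_allowed and the agent name "ExampleBot" are hypothetical. Note that this parser uses simple prefix matching, so other crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Assumption: rules rebuilt from the scan table above, not the raw file.
RULES = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /perl
Disallow: /webevent.cgi
Disallow: /events
Disallow: /incl
Disallow: /bb/
Disallow: /xml/ns/
Disallow: /login/
Disallow: /students/dean/resources/
Disallow: /srv/www/gapps
"""


def build_parser(text):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, "https://law.harvard.edu" + path)


parser = build_parser(RULES)

if __name__ == "__main__":
    for path in ["/", "/login/", "/events/calendar", "/bb", "/bb/thread"]:
        verdict = "allowed" if is_allowed(parser, path) else "disallowed"
        print(path, verdict)
```

Because matching is by prefix, a rule without a trailing slash such as /events also covers /events/calendar, while a rule with a trailing slash such as /bb/ leaves the bare path /bb fetchable under this parser.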