Robots Exclusion Standard data for cornell.edu

Resource Scan

Scan Details

Site Domain cornell.edu
Base Domain cornell.edu
Scan Status Ok
Last Scan 2024-04-22T21:26:46+00:00
Next Scan 2024-05-22T21:26:46+00:00

Last Scan

Scanned 2024-04-22T21:26:46+00:00
URL https://cornell.edu/robots.txt
Redirect https://www.cornell.edu/robots.txt
Redirect Domain www.cornell.edu
Redirect Base cornell.edu
Domain IPs 128.253.173.241, 128.253.173.242, 128.253.173.243, 128.253.173.244, 128.253.173.245, 128.253.173.246
Redirect IPs 13.107.213.59, 13.107.246.59, 2620:1ec:46::59, 2620:1ec:bdf::59
Response IP 13.107.213.59
Found Yes
Hash baada4216bf732573f67f021a8ee420f1f8b53d202d0fa546913556ea39298ee
SimHash d918db72c4b8

Groups

*

Rule Path
Disallow /_dynamic_files/
Disallow /_tasks/
Disallow /test/
Disallow /tools/
Disallow /template/
Disallow /search/
Disallow /visit/plan/
Disallow /video/kaltura/
Disallow /video/tasks/
Disallow /server-health-check/

Other Records

Field Value
crawl-delay 6
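
The "*" group above can be exercised with Python's standard urllib.robotparser. This is a minimal sketch, not the scanner's own method: the robots.txt lines are reconstructed from the rules listed in this report (the live file may be ordered or worded differently), and the agent name "ExampleBot" and the path "/about/" are hypothetical values chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in this report; an assumption,
# not a copy of the live https://www.cornell.edu/robots.txt file.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /_dynamic_files/",
    "Disallow: /_tasks/",
    "Disallow: /test/",
    "Disallow: /tools/",
    "Disallow: /template/",
    "Disallow: /search/",
    "Disallow: /visit/plan/",
    "Disallow: /video/kaltura/",
    "Disallow: /video/tasks/",
    "Disallow: /server-health-check/",
    "Crawl-delay: 6",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines; no network access


def is_allowed(path, agent="ExampleBot"):
    """Return True if the hypothetical agent may fetch the given path."""
    return parser.can_fetch(agent, "https://www.cornell.edu" + path)


if __name__ == "__main__":
    for path in ("/search/", "/video/kaltura/", "/about/"):
        print(path, is_allowed(path))
    print("crawl-delay:", parser.crawl_delay("ExampleBot"))
```

Because the rules are simple path prefixes, "/search/" and anything beneath it is refused for agents that fall under the "*" group, while a path with no matching Disallow prefix is permitted. A polite crawler would also wait the reported 6 seconds between requests.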

mozilla/5.0 (compatible; msie 10.0; windows nt 6.1; trident/6.0) sitecheck-sitecrawl by siteimprove.com

Rule Path
Disallow /cuinfo/specialconditions/
Disallow /_includes/header.cfm

mozilla/5.0 (compatible; msie 10.0; windows nt 6.1; trident/6.0) linkcheck by siteimprove.com

Rule Path
Disallow /cuinfo/specialconditions/
Disallow /_includes/header.cfm

html validator: siteimprove_w3c_validator/1.3

Rule Path
Disallow /cuinfo/specialconditions/
Disallow /_includes/header.cfm

css validator: jigsaw/2.3.0 w3c_css_validator_jfouffa/2.0

Rule Path
Disallow /cuinfo/specialconditions/
Disallow /_includes/header.cfm

Comments

  • SiteImprove should ignore these page particularly because they aren't actually used, but are still linked for historical reasons