tc.columbia.edu
robots.txt

Robots Exclusion Standard data for tc.columbia.edu

Resource Scan

Scan Details

Site Domain tc.columbia.edu
Base Domain columbia.edu
Scan Status Ok
Last Scan 2025-10-05T23:30:56+00:00
Next Scan 2025-11-04T23:30:56+00:00

Last Scan

Scanned 2025-10-05T23:30:56+00:00
URL http://tc.columbia.edu/robots.txt
Redirect https://www.tc.columbia.edu/robots.txt
Redirect Domain www.tc.columbia.edu
Redirect Base columbia.edu
Domain IPs 34.196.195.67
Redirect IPs 34.237.188.241, 52.73.63.111, 54.92.134.63
Response IP 54.92.134.63
Found Yes
Hash 25982ad93a870b73b1b0688b2afa74bf20c0460e40516a2b871ffab056cf0c0a
SimHash 89b01c8d2792
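
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The report does not name the algorithm or say which bytes are hashed, so the sketch below is only an assumption: it hashes the raw response body with SHA-256. The function name `body_digest` is hypothetical, and the SimHash field is not reproduced here.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw, unmodified body bytes.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hash value recorded in the scan above, kept for comparison.
REPORTED_HASH = "25982ad93a870b73b1b0688b2afa74bf20c0460e40516a2b871ffab056cf0c0a"
```

If a freshly fetched body gives a digest different from `REPORTED_HASH`, either the file changed since the scan or the assumption about the hashing scheme is wrong.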

Groups

googlebot

Rule Path
Disallow /i/a/*
Disallow /*/includes
Disallow /*/scripts
Disallow /*/styles
Disallow /pulled-content
Disallow /*/pulled-content
Disallow /*/embeds

*

Rule Path
Disallow /i/a/*
Disallow /*/includes
Disallow /*/scripts
Disallow /*/styles
Disallow /pulled-content
Disallow /*/pulled-content
Disallow /*/embeds
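
The Disallow paths above use the `*` wildcard, which Python's standard `urllib.robotparser` does not interpret as a wildcard. The following is a minimal sketch of how such rules could be checked, assuming the common interpretation in which `*` matches any run of characters, a trailing `$` anchors the end of the path, and every rule is otherwise a prefix match. The helper names and the sample paths in the usage note are hypothetical. Both groups contain only Disallow rules, so no Allow/Disallow precedence logic is needed.

```python
import re

# Disallow paths copied from the "*" group above
# (the googlebot group lists the same rules).
DISALLOW = [
    "/i/a/*",
    "/*/includes",
    "/*/scripts",
    "/*/styles",
    "/pulled-content",
    "/*/pulled-content",
    "/*/embeds",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path rule into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters and a
    trailing '$' anchors the end of the path.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with '.*' for each wildcard.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any rule matches the start of the URL path."""
    # re.match anchors at the beginning, which gives prefix matching.
    return any(rule_to_regex(rule).match(path) for rule in rules)
```

For example, `is_disallowed("/about/includes/header.html")` returns True under these assumptions, while `is_disallowed("/academics/")` returns False. A top-level `/includes` is not matched, because `/*/includes` requires a second slash before `includes`.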

Comments

  • Robots.txt file