www.tc.columbia.edu
robots.txt

Robots Exclusion Standard data for www.tc.columbia.edu

Resource Scan

Scan Details

Site Domain www.tc.columbia.edu
Base Domain columbia.edu
Scan Status Ok
Last Scan 2025-10-15T09:34:03+00:00
Next Scan 2025-11-14T09:34:03+00:00

Last Scan

Scanned 2025-10-15T09:34:03+00:00
URL https://www.tc.columbia.edu/robots.txt
Domain IPs 3.215.252.197, 44.205.42.207, 98.86.125.226
Response IP 98.86.125.226
Found Yes
Hash 25982ad93a870b73b1b0688b2afa74bf20c0460e40516a2b871ffab056cf0c0a
SimHash 89b01c8d2792
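
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The scan does not say which algorithm or which bytes were hashed, so the sketch below only assumes SHA-256 over the raw response body; the function names `robots_digest` and `matches_scan` are hypothetical and not part of any scanner API.

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "25982ad93a870b73b1b0688b2afa74bf20c0460e40516a2b871ffab056cf0c0a"


def robots_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_scan(body: bytes) -> bool:
    """Check whether a fetched body hashes to the recorded value."""
    return robots_digest(body) == RECORDED_HASH
```

If a freshly fetched copy of https://www.tc.columbia.edu/robots.txt does not match, either the file has changed since the scan or the assumption about the hashing scheme is wrong.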

Groups

googlebot

Rule Path
Disallow /i/a/*
Disallow /*/includes
Disallow /*/scripts
Disallow /*/styles
Disallow /pulled-content
Disallow /*/pulled-content
Disallow /*/embeds

*

Rule Path
Disallow /i/a/*
Disallow /*/includes
Disallow /*/scripts
Disallow /*/styles
Disallow /pulled-content
Disallow /*/pulled-content
Disallow /*/embeds
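
The two groups above carry the same seven Disallow rules. As a rough illustration of how such rules apply to a request path, here is a minimal sketch that assumes the common wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end, and every rule is otherwise a prefix match). The names `pattern_to_regex` and `is_disallowed` are hypothetical helpers, and the sample paths are made up for the example.

```python
import re

# Disallow patterns copied from the "*" group (the googlebot group is identical).
DISALLOW = [
    "/i/a/*",
    "/*/includes",
    "/*/scripts",
    "/*/styles",
    "/pulled-content",
    "/*/pulled-content",
    "/*/embeds",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow rule matches the path as a prefix."""
    return any(rule.match(path) for rule in RULES)


if __name__ == "__main__":
    for sample in ["/i/a/document/1.pdf", "/academics/includes/nav.html",
                   "/includes/nav.html", "/about/"]:
        print(sample, is_disallowed(sample))
```

Under these assumptions, `/*/includes` needs at least one path segment before `includes`, so a top-level `/includes/...` path is not blocked, while `/pulled-content` is listed both with and without the wildcard prefix to cover the top-level and nested cases.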

Comments

  • Robots.txt file