www.tc.columbia.edu
robots.txt
Robots Exclusion Standard data for www.tc.columbia.edu
Resource Scan
Scan Details
| Site Domain | www.tc.columbia.edu | 
| Base Domain | columbia.edu | 
| Scan Status | Ok | 
| Last Scan | 2025-10-15T09:34:03+00:00 | 
| Next Scan | 2025-11-14T09:34:03+00:00 | 
Last Scan
| Scanned | 2025-10-15T09:34:03+00:00 | 
| URL | https://www.tc.columbia.edu/robots.txt | 
| Domain IPs | 3.215.252.197, 44.205.42.207, 98.86.125.226 | 
| Response IP | 98.86.125.226 | 
| Found | Yes | 
| Hash | 25982ad93a870b73b1b0688b2afa74bf20c0460e40516a2b871ffab056cf0c0a | 
| SimHash | 89b01c8d2792 | 
Groups
googlebot
| Rule | Path |
|---|---|
| Disallow | /i/a/* | 
| Disallow | /*/includes | 
| Disallow | /*/scripts | 
| Disallow | /*/styles | 
| Disallow | /pulled-content | 
| Disallow | /*/pulled-content | 
| Disallow | /*/embeds | 
*
| Rule | Path |
|---|---|
| Disallow | /i/a/* | 
| Disallow | /*/includes | 
| Disallow | /*/scripts | 
| Disallow | /*/styles | 
| Disallow | /pulled-content | 
| Disallow | /*/pulled-content | 
| Disallow | /*/embeds | 
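
The wildcard paths above can be hard to read at a glance, so here is a minimal sketch of how a crawler might evaluate them. It assumes the matching semantics described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match). The names `DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical helpers written for this illustration and are not part of the scan tool or of any library. Both groups list the same rules, so a single list covers `googlebot` and `*`.

```python
import re

# Disallow paths copied from the groups above (identical for googlebot and *).
DISALLOW = [
    "/i/a/*",
    "/*/includes",
    "/*/scripts",
    "/*/styles",
    "/pulled-content",
    "/*/pulled-content",
    "/*/embeds",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics (RFC 9309): '*' matches any sequence of characters,
    a trailing '$' anchors the end of the path, otherwise prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches the URL path.

    The groups contain no Allow rules, so a single match is enough.
    """
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


if __name__ == "__main__":
    # Example paths are made up for illustration.
    for candidate in ["/i/a/photo.jpg", "/about/includes/nav.html", "/academics/"]:
        print(candidate, is_disallowed(candidate))
```

Note that `/*/includes` requires at least one path segment before `/includes`, so under these assumptions a top-level `/includes` path would not be blocked, while `/pulled-content` is listed both with and without the leading wildcard to cover both cases.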