www.tc.columbia.edu
robots.txt

Robots Exclusion Standard data for www.tc.columbia.edu

Resource Scan

Scan Details

Site Domain	www.tc.columbia.edu
Base Domain	columbia.edu
Scan Status	Ok
Last Scan	2025-10-15T09:34:03+00:00
Next Scan	2025-11-14T09:34:03+00:00
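
The Last Scan and Next Scan timestamps above imply a rescan interval. As a minimal sketch, using only those two values (the variable names are illustrative and not part of the report), the gap can be computed like this:

```python
from datetime import datetime

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2025-10-15T09:34:03+00:00")
next_scan = datetime.fromisoformat("2025-11-14T09:34:03+00:00")

# Subtracting two timezone-aware datetimes yields a timedelta.
interval = next_scan - last_scan
print(interval.days)  # prints 30
```

This points to a 30-day cycle for this entry, although the report does not state the scanner's actual scheduling policy.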

Last Scan

Scanned	2025-10-15T09:34:03+00:00
URL	https://www.tc.columbia.edu/robots.txt
Domain IPs	3.215.252.197, 44.205.42.207, 98.86.125.226
Response IP	98.86.125.226
Found	Yes
Hash	25982ad93a870b73b1b0688b2afa74bf20c0460e40516a2b871ffab056cf0c0a
SimHash	89b01c8d2792

Groups

googlebot

Rule	Path
Disallow	/i/a/*
Disallow	/*/includes
Disallow	/*/scripts
Disallow	/*/styles
Disallow	/pulled-content
Disallow	/*/pulled-content
Disallow	/*/embeds

*

Rule	Path
Disallow	/i/a/*
Disallow	/*/includes
Disallow	/*/scripts
Disallow	/*/styles
Disallow	/pulled-content
Disallow	/*/pulled-content
Disallow	/*/embeds
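
Both groups list the same seven Disallow rules, several of which use the `*` wildcard. The sketch below illustrates how such patterns are commonly interpreted: `*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a prefix match. It is a simplified approximation, not the matcher used by Googlebot or by this scanner; the helper names and the sample paths in the usage lines are hypothetical.

```python
import re

# Disallow paths copied from the groups listed above.
DISALLOW = [
    "/i/a/*",
    "/*/includes",
    "/*/scripts",
    "/*/styles",
    "/pulled-content",
    "/*/pulled-content",
    "/*/embeds",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters, a
    trailing '$' anchors the end, otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches the URL path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


# Hypothetical paths, chosen only to exercise the rules.
print(is_disallowed("/i/a/document.pdf"))          # True
print(is_disallowed("/academics/includes/x.html")) # True
print(is_disallowed("/includes"))                  # False
print(is_disallowed("/about/"))                    # False
```

Note that `/*/includes` needs at least one path segment before `/includes`, so a top-level `/includes` is not covered by these rules. Because neither group contains an Allow rule, the sketch does not need to resolve conflicts between Allow and Disallow.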

Comments

Robots.txt file
