Robots Exclusion Standard data for cdl.sps.columbia.edu

Resource Scan

Scan Details

Site Domain cdl.sps.columbia.edu
Base Domain columbia.edu
Scan Status Ok
Last Scan 2025-08-24T04:52:58+00:00
Next Scan 2025-09-23T04:52:58+00:00

Last Scan

Scanned 2025-08-24T04:52:58+00:00
URL https://cdl.sps.columbia.edu/robots.txt
Redirect https://careerdesignlab.sps.columbia.edu/robots.txt
Redirect Domain careerdesignlab.sps.columbia.edu
Redirect Base columbia.edu
Domain IPs 34.204.213.151, 44.196.168.43
Redirect IPs 34.204.213.151, 44.196.168.43
Response IP 34.204.213.151
Found Yes
Hash f6ee0d0076b5daff640cb0680765bdc32df2c86332356546d6fdc62890e3da70
SimHash 201dde520185

Groups

*

Rule Path
Disallow /admin/
Allow /admin/admin-ajax.php
Disallow /search/
Disallow /*.csv$
Disallow /*.CSV$
Disallow /*.doc$
Disallow /*.DOC$
Disallow /*.docx$
Disallow /*.DOCX$
Disallow /*.m4a$
Disallow /*.M4A$
Disallow /*.mp3$
Disallow /*.MP3$
Disallow /*.odt$
Disallow /*.ODT$
Disallow /*.ogg$
Disallow /*.OGG$
Disallow /*.pdf$
Disallow /*.PDF$
Disallow /*.ppt$
Disallow /*.PPT$
Disallow /*.pptx$
Disallow /*.PPTX$
Disallow /*.woff$
Disallow /*.WOFF$
Disallow /*.woff2$
Disallow /*.WOFF2$
Disallow /*.xls$
Disallow /*.XLS$
Disallow /*.xlsx$
Disallow /*.XLSX$
Disallow /*.zip$
Disallow /*.ZIP$
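
Matching Example

The rules above rely on the `*` wildcard and the `$` end-of-path anchor, which Python's standard `urllib.robotparser` does not interpret. The sketch below is one way to evaluate a path against a subset of these rules, assuming the longest-match precedence described in RFC 9309 (the most specific matching rule wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and not part of the scan output; only a few of the listed rules are included to keep the example short.

```python
import re

# A subset of the "*" group rules listed above, as (rule, path pattern) pairs.
RULES = [
    ("disallow", "/admin/"),
    ("allow", "/admin/admin-ajax.php"),
    ("disallow", "/search/"),
    ("disallow", "/*.pdf$"),
    ("disallow", "/*.PDF$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the
    pattern to the end of the URL path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    Assumption: the longest matching pattern wins, and Allow wins
    when an Allow and a Disallow pattern have equal length.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt requires.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ("/admin/settings", "/admin/admin-ajax.php",
              "/files/report.pdf", "/files/report.pdf?download=1"):
        print(p, is_allowed(p))
```

Under these assumptions, `/admin/admin-ajax.php` stays crawlable because the Allow pattern is longer than `/admin/`. The `$` anchor also explains why each extension appears in lower and upper case: the patterns are case-sensitive, so a mixed-case name such as `report.Pdf`, or a URL with a query string after the extension, is not matched by any of the listed file-type rules.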