catalog.upenn.edu
robots.txt

Robots Exclusion Standard data for catalog.upenn.edu

Resource Scan

Scan Details

Site Domain catalog.upenn.edu
Base Domain upenn.edu
Scan Status Ok
Last Scan 2025-03-03T11:28:42+00:00
Next Scan 2025-04-02T11:28:42+00:00

Last Scan

Scanned 2025-03-03T11:28:42+00:00
URL https://catalog.upenn.edu/robots.txt
Domain IPs 12.175.6.47
Response IP 12.175.6.47
Found Yes
Hash b38de2a45e2d9fb1ab2960d6a5e3532c75836edba218673f1d70a27c77942097
SimHash 880d5ce5318d

Groups

*

Rule Path
Disallow /archive/
Disallow /admin/
Disallow /azindex/
Disallow /catalogcontents/
Disallow /cim/
Disallow /clmail/
Disallow /courseadmin/
Disallow /courseleaf/
Disallow /css/
Disallow /dbleaf/
Disallow /depts/
Disallow /fonts/
Disallow /gallery/
Disallow /images/
Disallow /js/
Disallow /mig/
Disallow /migration/
Disallow /navbar/
Disallow /pagewiz/
Disallow /programadmin/
Disallow /responseform/
Disallow /ribbit/
Disallow /search/
Disallow /shared/
Disallow /styles/
Disallow /tmp/
Disallow /wiztest/
Disallow /xsearch/
Disallow /pdf/
Disallow /course-search/build
Disallow /course-search/api
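
The Disallow entries above are path prefixes that apply to every user agent (`*`). As a minimal sketch, assuming Python's standard-library `urllib.robotparser` is an acceptable stand-in for a crawler's own matcher, the group can be rebuilt and queried as shown here. The helper names `build_parser` and `is_allowed` and the sample path `/programs/` are illustrative only and do not come from the scan.

```python
from urllib.robotparser import RobotFileParser

# Disallow paths copied from the "*" group in the scan above.
DISALLOWED = [
    "/archive/", "/admin/", "/azindex/", "/catalogcontents/", "/cim/",
    "/clmail/", "/courseadmin/", "/courseleaf/", "/css/", "/dbleaf/",
    "/depts/", "/fonts/", "/gallery/", "/images/", "/js/", "/mig/",
    "/migration/", "/navbar/", "/pagewiz/", "/programadmin/",
    "/responseform/", "/ribbit/", "/search/", "/shared/", "/styles/",
    "/tmp/", "/wiztest/", "/xsearch/", "/pdf/",
    "/course-search/build", "/course-search/api",
]


def build_parser(paths):
    """Rebuild a robots.txt group for all user agents from a list of paths."""
    lines = ["User-agent: *"] + [f"Disallow: {p}" for p in paths]
    parser = RobotFileParser()
    parser.parse(lines)  # parse() accepts an iterable of robots.txt lines
    return parser


def is_allowed(parser, path, agent="*"):
    """Return True if `agent` may fetch `path` on catalog.upenn.edu."""
    return parser.can_fetch(agent, "https://catalog.upenn.edu" + path)


if __name__ == "__main__":
    parser = build_parser(DISALLOWED)
    # "/programs/" is a hypothetical path used only for illustration.
    for path in ["/search/", "/course-search/api/v1", "/programs/"]:
        print(path, "allowed" if is_allowed(parser, path) else "disallowed")
```

Because matching is by prefix, a rule without a trailing slash such as `/course-search/api` also covers any longer path that begins with the same characters.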

Other Records

Field Value
sitemap http://catalog.upenn.edu/sitemap.xml
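
The sitemap record corresponds to a `Sitemap:` line in the file, which stands apart from any user-agent group. A small sketch of reading it back, again assuming Python's `urllib.robotparser` (its `site_maps()` method requires Python 3.8 or later); the function name `sitemap_urls` is hypothetical:

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_lines):
    """Return the Sitemap URLs declared in robots.txt lines, or an empty list."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    lines = [
        "User-agent: *",
        "Disallow: /tmp/",
        "Sitemap: http://catalog.upenn.edu/sitemap.xml",
    ]
    print(sitemap_urls(lines))
```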