discover.psu.edu
robots.txt

Robots Exclusion Standard data for discover.psu.edu

Resource Scan

Scan Details

Site Domain discover.psu.edu
Base Domain psu.edu
Scan Status Ok
Last Scan 2025-07-27T21:37:32+00:00
Next Scan 2025-08-10T21:37:32+00:00

Last Scan

Scanned 2025-07-27T21:37:32+00:00
URL https://discover.psu.edu/robots.txt
Domain IPs 13.68.101.62
Response IP 13.68.101.62
Found Yes
Hash 3f8974b6f026f3a0e1bacd04b1ea4a81cefa68e6300cfa8fbf1be13b3338427e
SimHash 7d14dc34e953

Groups

User-agent *

Rule Path
Disallow /notfound
Disallow /forbidden
Disallow /error
Disallow /api/
Disallow /engage/

Other Records

Field Value
sitemap https://discover.psu.edu/sitemap.xml
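
Example

The records above can be checked against concrete URLs. The following is a minimal sketch using Python's standard urllib.robotparser. It assumes the original file looks roughly like the ROBOTS_TXT string below, which is rebuilt from the parsed records in this report and is not the verbatim file. The agent name "ExampleBot" and the helper is_allowed are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rebuilt from the parsed records in this report. The exact layout,
# ordering, and comments of the real file are an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /notfound
Disallow: /forbidden
Disallow: /error
Disallow: /api/
Disallow: /engage/

Sitemap: https://discover.psu.edu/sitemap.xml
"""

BASE = "https://discover.psu.edu"

# Parse the text directly instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch BASE + path under the rules above.

    "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
    """
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for path in ["/", "/api/users", "/api", "/engage/events", "/notfound"]:
        print(path, is_allowed(path))
    # site_maps() requires Python 3.8 or later.
    print(parser.site_maps())
```

Note that urllib.robotparser matches rules by simple path prefix, so "Disallow /api/" blocks /api/users but not /api itself, while "Disallow /notfound" also blocks any path that merely begins with /notfound. Other crawlers may interpret the rules slightly differently.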