discover.psu.edu
robots.txt
Robots Exclusion Standard data for discover.psu.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | discover.psu.edu |
Base Domain | psu.edu |
Scan Status | Ok |
Last Scan | 2025-07-27T21:37:32+00:00 |
Next Scan | 2025-08-10T21:37:32+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-07-27T21:37:32+00:00 |
URL | https://discover.psu.edu/robots.txt |
Domain IPs | 13.68.101.62 |
Response IP | 13.68.101.62 |
Found | Yes |
Hash | 3f8974b6f026f3a0e1bacd04b1ea4a81cefa68e6300cfa8fbf1be13b3338427e |
SimHash | 7d14dc34e953 |
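The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. Assuming SHA-256 over the raw response body, a change check between scans might look like the sketch below. The helper names `sha256_hex` and `has_changed` are hypothetical and not part of the scanner.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the hex SHA-256 digest of a response body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_digest: str) -> bool:
    """Compare a freshly fetched body against a stored digest.

    Assumption: the stored digest is SHA-256 over the raw bytes.
    The scan report does not confirm this.
    """
    return sha256_hex(body) != previous_digest.lower()
```

A digest like this only detects that some byte changed. The shorter SimHash field presumably serves a different purpose, such as spotting near-duplicate files, and is not covered here.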
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /notfound |
Disallow | /forbidden |
Disallow | /error |
Disallow | /api/ |
Disallow | /engage/ |
Other Records
Field | Value |
---|---|
sitemap | https://discover.psu.edu/sitemap.xml |
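To see how a crawler would interpret the records above, the following sketch rebuilds an equivalent robots.txt from the scanned rules and checks a few paths with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is a reconstruction from the tables, not the file as served, so comments, ordering, and whitespace may differ. The agent name `ExampleBot`, the helper `is_allowed`, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan tables above (assumption: the served
# file may differ in comments, ordering, and whitespace).
ROBOTS_TXT = """\
User-agent: *
Disallow: /notfound
Disallow: /forbidden
Disallow: /error
Disallow: /api/
Disallow: /engage/

Sitemap: https://discover.psu.edu/sitemap.xml
"""

BASE = "https://discover.psu.edu"

# Parse the text directly instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise the rules.
    for path in ["/", "/api/users", "/api", "/engage/", "/notfound"]:
        print(path, is_allowed(path))
    print(parser.site_maps())  # requires Python 3.8+
```

Python's parser matches rules by simple path prefix, so `/api/users` is blocked by `Disallow: /api/` while `/api` without the trailing slash is not. Other crawlers may apply different matching rules.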