
Robots Exclusion Standard data for pcc.edu

Resource Scan

Scan Details

Site Domain pcc.edu
Base Domain pcc.edu
Scan Status Ok
Last Scan 2024-10-21T05:29:09+00:00
Next Scan 2024-11-20T05:29:09+00:00

Last Scan

Scanned 2024-10-21T05:29:09+00:00
URL https://pcc.edu/robots.txt
Redirect https://www.pcc.edu/robots.txt
Redirect Domain www.pcc.edu
Redirect Base pcc.edu
Domain IPs 209.152.46.213
Redirect IPs 209.152.46.213
Response IP 209.152.46.213
Found Yes
Hash a2ba55024a5206e84b5c11f0b30d28e770abc8cc761529ece23603dc879d0fc9
SimHash 3520dc8b0d30

Groups

*

Rule Path
Disallow /_mm/
Disallow /_notes/
Disallow /_baks/
Disallow /about/profiles/
Disallow /catalog/course-information/
Disallow /ask-the-panther/
Disallow /signin/
Disallow /MMWIP/
Disallow /scripts/webevent.pl*
Disallow /schedule/*?fa=doquery*
Disallow /schedule/*?fa=doadvquery*
Disallow /schedule/*?fa=queryForm*
Disallow /schedule/*?er=*
Disallow /schedule/planning-guide/
Allow /schedule/planning-guide/$

test

Rule Path
Disallow /_*
Disallow /scripts/webevent.pl*
Disallow /search/
Disallow */xmlrpc.php
Disallow *.js
Disallow *.json
Disallow *.css
Disallow *.xml
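
The rules in the `*` group mix plain prefixes, `*` wildcards, and a `$` end anchor, so it may help to see how a crawler could resolve them. The following is a minimal sketch, assuming RFC 9309 style matching (the longest matching pattern wins, and Allow wins a tie); individual crawlers may behave differently. The function names `pattern_to_regex` and `is_allowed` and the sample paths are hypothetical, and only a subset of the rules listed above is included.

```python
import re

# Subset of the "*" group rules from the scan above.
RULES = [
    ("Disallow", "/signin/"),
    ("Disallow", "/schedule/*?fa=doquery*"),
    ("Disallow", "/schedule/planning-guide/"),
    ("Allow", "/schedule/planning-guide/$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    Assumed precedence: longest matching pattern wins, Allow wins ties,
    and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt requires.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "Allow"):
                best_len = length
                allowed = kind == "Allow"
    return allowed


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise each rule.
    for sample in [
        "/schedule/planning-guide/",
        "/schedule/planning-guide/fall.html",
        "/schedule/index.cfm?fa=doquery&term=1",
        "/about/",
    ]:
        print(sample, is_allowed(RULES, sample))
```

Under these assumptions, the `Allow /schedule/planning-guide/$` rule exposes only the planning guide landing page, while everything beneath it stays covered by the shorter Disallow prefix.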