sandiego.edu
robots.txt

Robots Exclusion Standard data for sandiego.edu

Resource Scan

Scan Details

Site Domain sandiego.edu
Base Domain sandiego.edu
Scan Status Ok
Last Scan 2024-10-27T09:37:17+00:00
Next Scan 2024-11-26T09:37:17+00:00

Last Scan

Scanned 2024-10-27T09:37:17+00:00
URL https://sandiego.edu/robots.txt
Redirect https://www.sandiego.edu/robots.txt
Redirect Domain www.sandiego.edu
Redirect Base sandiego.edu
Domain IPs 192.195.155.200
Redirect IPs 192.195.155.200
Response IP 192.195.155.200
Found Yes
Hash 53ebdb90f21d2801016cdfc55e0d5bd5aedb8f2f55a4d41f95f6e90a918d3dc1
SimHash e0faca4140f1

Groups

*

Rule Path
Disallow /onward/documents/emails/
Disallow /giving/honorroll/
Disallow /president/documents/emails/
Disallow /communications/email/
Disallow /emails/
Disallow /Connections/
Disallow /sandbox/
Disallow /events/display*
Disallow /law/manage/
Disallow /_cascade/
Disallow /soles/newsletters/
Disallow /soles/character-development-center/cdc-essays-2016/
Disallow /publications/newsletters/
Disallow /recentuploads/56441/
Disallow /uploads/56441/
Disallow /publications/newsletters/
Disallow /search/classes/
Disallow /ur/
Disallow /vista/archive_article.php?article_id=2006100511
Disallow /vista/archives/
Disallow /vista/documents/frontpage43016.pdf
Disallow /vista/documents/frontpage43015.pdf
Disallow /documents/vista/frontpage43016.pdf
Disallow /documents/vista/frontpage43015.pdf
Disallow /manage/
Disallow /manage20/
Disallow */download/
Disallow /cas/templates/
Disallow /templates/
Disallow /torerolife/templates/
Disallow /uploads/
Allow /uploads/*.jpg$
Allow /uploads/*.png$
Allow /uploads/*.gif$
Disallow /recentuploads/
Allow /recentuploads/*.jpg$
Allow /recentuploads/*.png$
Allow /recentuploads/*.gif$
Disallow /its/alerts/calendar.php?*
Disallow /*_sort
Disallow */MMWIP/
Disallow */_mm/
Disallow */_mmServerScripts/
Disallow */_notes/
Disallow /nrotc/webportal/
Disallow /cas/institute-for-civil-civic-engagement/
Disallow /cas/documents/faculty-resources/
Disallow /webdev/
Disallow /~*
Disallow /maps

Other Records

Field Value
crawl-delay 8
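
The `*` group above pairs broad Disallow rules with narrower Allow exceptions: `/uploads/` is blocked, yet `/uploads/*.jpg$` is allowed. The following is a minimal sketch of how a crawler might resolve such overlaps, assuming the longest-match convention described in RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end of the path, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and not part of any library, and only a subset of the listed rules is included.

```python
import re

# A subset of the rules listed for the "*" group.
RULES = [
    ("disallow", "/uploads/"),
    ("allow", "/uploads/*.jpg$"),
    ("disallow", "/events/display*"),
    ("disallow", "*/download/"),
    ("disallow", "/maps"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`."""
    best = None  # (pattern length, is_allow) of the most specific match
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            # Longer patterns win; on equal length, Allow (True) beats Disallow.
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule is allowed by default.
    return True if best is None else best[1]

print(is_allowed("/uploads/photo.jpg"))      # True: the Allow pattern is longer
print(is_allowed("/uploads/report.pdf"))     # False: only Disallow /uploads/ matches
print(is_allowed("/law/download/form.pdf"))  # False: matched by */download/
print(is_allowed("/about/"))                 # True: no rule matches
```

Under this prefix-style matching, `Disallow /maps` (no trailing slash) also covers any path that merely begins with `/maps`. The example paths are hypothetical.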

usdsearchblox

Rule Path
Allow /webdev/
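
A crawler normally obeys only the group that names it and falls back to `*` otherwise, which is why `usdsearchblox` can reach `/webdev/` while other crawlers cannot. The sketch below illustrates this with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` fragment is a reconstruction from the fields in this report (the original file's exact layout is an assumption), and `examplebot` and the path `/webdev/index.php` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment; the real file's layout and ordering are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /webdev/
Disallow: /maps
Crawl-delay: 8

User-agent: usdsearchblox
Allow: /webdev/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The named group applies to usdsearchblox; everyone else gets the "*" group.
print(parser.can_fetch("usdsearchblox", "/webdev/index.php"))  # True
print(parser.can_fetch("examplebot", "/webdev/index.php"))     # False

# crawl-delay belongs to the "*" group only, so the named group has none.
print(parser.crawl_delay("examplebot"))     # 8
print(parser.crawl_delay("usdsearchblox"))  # None
```

Note that `urllib.robotparser` compares paths by plain prefix, so it is suitable for this fragment but would not interpret the wildcard rules in the `*` group.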

Other Records

Field Value
sitemap https://www.sandiego.edu/admission-and-aid/sitemap.xml
sitemap https://www.sandiego.edu/business/sitemap.xml
sitemap https://www.sandiego.edu/cas/sitemap.xml
sitemap https://www.sandiego.edu/engineering/sitemap.xml
sitemap https://www.sandiego.edu/law/sitemap.xml
sitemap https://www.sandiego.edu/nursing/sitemap.xml
sitemap https://www.sandiego.edu/peace/sitemap.xml
sitemap https://www.sandiego.edu/soles/sitemap.xml
sitemap https://www.sandiego.edu/sitemap.xml
sitemap https://www.sandiego.edu/its/sitemap.xml
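
Sitemap records apply to the whole file rather than to a single user-agent group. As a rough sketch, they can be read with `RobotFileParser.site_maps()` (available in Python 3.8 and later). The `ROBOTS_TXT` fragment below is a reconstruction using three of the sitemap URLs listed above; its layout is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment with three of the listed sitemap records.
ROBOTS_TXT = """\
User-agent: *
Disallow: /maps

Sitemap: https://www.sandiego.edu/sitemap.xml
Sitemap: https://www.sandiego.edu/law/sitemap.xml
Sitemap: https://www.sandiego.edu/its/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap records exist, hence the fallback.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```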

Comments

  • Temporary Disallow of UC areas causing issues
  • College and Schools
  • School of Leadership and Education Sciences
  • Areas with extremely old and archived materials that are no longer used
  • /ur/ - Marketing playground
  • The Vista never takes down an article
  • Manage Areas
  • Download Areas
  • /templates/ folders are worthless to the outside world
  • Because some people upload things that embarrass us
  • Because we still need the images for the News center.
  • It's just a symlink, so needs to be duplicated.
  • Generates infinite links
  • This is non-standard and probably applies to folder listings to keep our own search appliance from going crazy
  • Adobe Contribute/Dreamweaver Folders
  • MMWIP folders
  • _mm folders
  • _mmServerScripts
  • _notes
  • Specific customer requests
  • /webdev/ block from external crawl