gettysburg.edu
robots.txt

Robots Exclusion Standard data for gettysburg.edu

Resource Scan

Scan Details

Site Domain gettysburg.edu
Base Domain gettysburg.edu
Scan Status Ok
Last Scan 2024-10-02T03:00:57+00:00
Next Scan 2024-11-01T03:00:57+00:00

Last Scan

Scanned 2024-10-02T03:00:57+00:00
URL https://www.gettysburg.edu/robots.txt
Domain IPs 138.234.4.66
Response IP 138.234.4.66
Found Yes
Hash 8bd89c7ef702ea4af961de0da2224477dfca5539a1a252657b93aa220d23608d
SimHash ba0f5150ce74

Groups

*

Rule Path
Disallow /current_students/learning_management/index.dot
Disallow /home/search*
Disallow /moodle/index.dot
Disallow /offices/center-for-career-engagement/secure/*
Disallow /offices/center-for-global-education/faculty/resident-directorship/
Disallow /offices/college-life/care/secure/*
Disallow /offices/diversity-inclusion/campus-climate-study/protected/*
Disallow /offices/environmental-health-safety/occupational-safety/secure/*
Disallow /offices/finance-administration/pdfs/secure/*
Disallow /offices/financial-services/accounts-payable/password-protected/*
Disallow /offices/institutional-analysis/password-protected/*
Disallow /offices/johnson-center-for-creative-teaching-and-learning/teaching-mentoring-resources/secure/*
Disallow /offices/public-safety/secure/*
Disallow /s/cart.json
Disallow /s/click-history.json
Disallow /s/redirect*
Disallow /s/search-history.json
Disallow /search*

rogerbot

Rule Path
Disallow (empty path)

Other Records

Field Value
sitemap https://www.gettysburg.edu/sitemap.xml
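How the groups above apply to a crawler can be sketched as follows. This is a minimal illustration, not the scanner's or any crawler's actual implementation. The ROBOTS_TXT string is an abridged reconstruction from the tables above; the original file's layout and rule order are assumptions. The function names, the agent name "examplebot", and the path "/about/" are hypothetical. Only Disallow rules are handled, since the scan lists no Allow rules, and user agents are matched by exact name for simplicity.

```python
import re

# Abridged reconstruction from the scan tables; the real file's exact
# layout, comments, and rule order are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /home/search*
Disallow: /offices/public-safety/secure/*
Disallow: /s/cart.json
Disallow: /search*

User-agent: rogerbot
Disallow:

Sitemap: https://www.gettysburg.edu/sitemap.xml
"""


def parse_groups(text):
    """Return {lowercased user agent: [disallowed path patterns]}."""
    groups = {}
    current = []          # agents that the rules being read apply to
    reading_rules = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if reading_rules:                 # a new group starts here
                current, reading_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            reading_rules = True
            if value:                         # empty Disallow blocks nothing
                for agent in current:
                    groups[agent].append(value)
    return groups


def pattern_to_regex(path):
    """Translate a rule path: '*' matches any run, trailing '$' anchors."""
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(piece) for piece in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(groups, user_agent, url_path):
    """True if no Disallow rule of the applicable group matches url_path."""
    # A group naming the agent takes precedence over the '*' group.
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    return not any(pattern_to_regex(rule).match(url_path) for rule in rules)


groups = parse_groups(ROBOTS_TXT)
print(is_allowed(groups, "examplebot", "/search?q=x"))  # False
print(is_allowed(groups, "rogerbot", "/search?q=x"))    # True
print(is_allowed(groups, "examplebot", "/about/"))      # True
```

Because rogerbot has its own group whose only Disallow is empty, the sketch treats every path as crawlable for that agent, while all other agents fall back to the * group.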