manhattan.edu
robots.txt

Robots Exclusion Standard data for manhattan.edu

Resource Scan

Scan Details

Site Domain manhattan.edu
Base Domain manhattan.edu
Scan Status Ok
Last Scan 2024-09-07T07:21:25+00:00
Next Scan 2024-10-07T07:21:25+00:00

Last Scan

Scanned 2024-09-07T07:21:25+00:00
URL https://manhattan.edu/robots.txt
Domain IPs 35.222.12.159
Response IP 35.222.12.159
Found Yes
Hash 4f90334c0fc4d49e6879979d5a36ca5c4f2f9beffd48ceb5775e62af7781365b
SimHash 67050071c13f

Groups

*

Rule Path
Allow /_files/audio/
Allow /_files/css/
Allow /_files/images/
Allow /_files/js/
Disallow /_files/php/
Allow /_files/video/
Allow /_files/pdf-files/
Disallow /_documentation/
Disallow /training-test/
Disallow /events.php/
Disallow /academics/schools-and-departments/school-of-liberal-arts/*
Disallow /academics/schools-and-departments/school-of-health-professions/*
Disallow /academics/schools-and-departments/school-of-education-and-health/*
Disallow /academics/schools-and-departments/school-of-science/*
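
To see how a crawler might evaluate the rules listed above, here is a minimal Python sketch. It assumes the matching behavior described in RFC 9309 (the longest matching pattern wins, Allow wins a tie, and `*` matches any run of characters); individual crawlers may behave differently. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical helpers, and the rule list is transcribed from this report rather than fetched from the live file.

```python
import re

# Rules transcribed from the "*" group in this report (not fetched live).
RULES = [
    ("allow", "/_files/audio/"),
    ("allow", "/_files/css/"),
    ("allow", "/_files/images/"),
    ("allow", "/_files/js/"),
    ("disallow", "/_files/php/"),
    ("allow", "/_files/video/"),
    ("allow", "/_files/pdf-files/"),
    ("disallow", "/_documentation/"),
    ("disallow", "/training-test/"),
    ("disallow", "/events.php/"),
    ("disallow", "/academics/schools-and-departments/school-of-liberal-arts/*"),
    ("disallow", "/academics/schools-and-departments/school-of-health-professions/*"),
    ("disallow", "/academics/schools-and-departments/school-of-education-and-health/*"),
    ("disallow", "/academics/schools-and-departments/school-of-science/*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a prefix-matching regex.

    Assumption: '*' matches any character sequence and a trailing '$'
    anchors the end of the path, as described in RFC 9309.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern decides; on equal length, Allow wins.
    A path that matches no rule is allowed by default.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


for sample in ["/_files/css/site.css", "/_files/php/form.php", "/events.php", "/admissions/"]:
    print(sample, is_allowed(sample))
```

Under these assumptions, `/events.php` itself is not blocked, because the listed rule is `/events.php/` with a trailing slash and only matches paths beneath it.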

Other Records

Field Value
crawl-delay 5

Other Records

Field Value
sitemap https://manhattan.edu/sitemap.xml
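
The crawl-delay and sitemap records can be read with Python's standard `urllib.robotparser` module. The sketch below is illustrative only: `ROBOTS_LINES` is a shortened, hypothetical reconstruction of the file (the report does not show the original layout), it assumes the crawl-delay record belongs to the `*` group, and `ExampleBot` is a made-up user agent. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical, shortened reconstruction of the scanned file.
# Assumption: crawl-delay sits inside the "*" group.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /_files/php/",
    "Crawl-delay: 5",
    "",
    "Sitemap: https://manhattan.edu/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines; no network request is made

# Seconds a compliant crawler should wait between requests.
delay = parser.crawl_delay("ExampleBot")

# Sitemap URLs declared in the file (a list, or None if absent).
sitemaps = parser.site_maps()

print("crawl delay:", delay)
print("sitemaps:", sitemaps)
```

Note that crawl-delay is a non-standard extension: some crawlers honor it and others ignore it, so the value is advisory.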