manhattan.edu
robots.txt
Robots Exclusion Standard data for manhattan.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | manhattan.edu |
Base Domain | manhattan.edu |
Scan Status | Ok |
Last Scan | 2024-09-07T07:21:25+00:00 |
Next Scan | 2024-10-07T07:21:25+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-09-07T07:21:25+00:00 |
URL | https://manhattan.edu/robots.txt |
Domain IPs | 35.222.12.159 |
Response IP | 35.222.12.159 |
Found | Yes |
Hash | 4f90334c0fc4d49e6879979d5a36ca5c4f2f9beffd48ceb5775e62af7781365b |
SimHash | 67050071c13f |
Groups
User-agent: *
Rule | Path |
---|---|
Allow | /_files/audio/ |
Allow | /_files/css/ |
Allow | /_files/images/ |
Allow | /_files/js/ |
Disallow | /_files/php/ |
Allow | /_files/video/ |
Allow | /_files/pdf-files/ |
Disallow | /_documentation/ |
Disallow | /training-test/ |
Disallow | /events.php/ |
Disallow | /academics/schools-and-departments/school-of-liberal-arts/* |
Disallow | /academics/schools-and-departments/school-of-health-professions/* |
Disallow | /academics/schools-and-departments/school-of-education-and-health/* |
Disallow | /academics/schools-and-departments/school-of-science/* |
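The rules in the `*` group can be checked against a path with a small matcher. The sketch below is not the scanner's own logic. It assumes the matching semantics described in RFC 9309: the longest matching pattern wins, `Allow` wins a tie, an unmatched path is allowed, `*` is a wildcard, and a trailing `$` anchors the end of the path. The names `RULES`, `pattern_to_regex`, and `is_allowed`, and the sample paths, are hypothetical.

```python
import re

# Rules copied from the "*" group in the table above, in file order.
RULES = [
    ("allow", "/_files/audio/"),
    ("allow", "/_files/css/"),
    ("allow", "/_files/images/"),
    ("allow", "/_files/js/"),
    ("disallow", "/_files/php/"),
    ("allow", "/_files/video/"),
    ("allow", "/_files/pdf-files/"),
    ("disallow", "/_documentation/"),
    ("disallow", "/training-test/"),
    ("disallow", "/events.php/"),
    ("disallow", "/academics/schools-and-departments/school-of-liberal-arts/*"),
    ("disallow", "/academics/schools-and-departments/school-of-health-professions/*"),
    ("disallow", "/academics/schools-and-departments/school-of-education-and-health/*"),
    ("disallow", "/academics/schools-and-departments/school-of-science/*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex (assumed semantics)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match semantics."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


print(is_allowed("/_files/css/site.css"))    # True
print(is_allowed("/_files/php/config.php"))  # False
print(is_allowed("/academics/schools-and-departments/school-of-science/physics"))  # False
print(is_allowed("/events.php"))             # True
```

Under these assumptions, the `/events.php/` rule only covers paths that continue past the trailing slash, so `/events.php` itself stays crawlable. The trailing `/*` on the four school rules behaves the same as a plain prefix ending in `/`.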
Other Records
Field | Value |
---|---|
crawl-delay | 5 |
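`crawl-delay` is not part of RFC 9309. It is a non-standard directive that some crawlers honor and others ignore, and it is commonly read as the number of seconds to wait between successive requests. A minimal sketch of a client that respects the value above, under that interpretation, could look like this. The function name `fetch_politely` and its `fetch` and `sleep` parameters are hypothetical.

```python
import time

CRAWL_DELAY_SECONDS = 5  # value of the crawl-delay record above


def fetch_politely(paths, fetch, delay=CRAWL_DELAY_SECONDS, sleep=time.sleep):
    """Call fetch(path) for each path, pausing `delay` seconds between calls.

    `fetch` is any callable supplied by the caller (hypothetical here);
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    results = []
    for index, path in enumerate(paths):
        if index > 0:
            sleep(delay)  # wait between consecutive requests, not before the first
        results.append(fetch(path))
    return results
```

Passing a recording function as `sleep` shows the pacing without waiting: three paths produce two pauses of 5 seconds each.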
Other Records
Field | Value |
---|---|
sitemap | https://manhattan.edu/sitemap.xml |
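A crawler would normally discover the sitemap URL by parsing robots.txt itself. The sketch below uses Python's standard `urllib.robotparser` on a short snippet reconstructed from the records in this report. The snippet is an assumption about layout, not the actual file, and it includes only one of the `Disallow` rules. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the records above; the real file's exact layout is not shown here.
ROBOTS_SNIPPET = """\
User-agent: *
Disallow: /_documentation/
Crawl-delay: 5

Sitemap: https://manhattan.edu/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_SNIPPET.splitlines())

sitemaps = parser.site_maps()    # list of Sitemap URLs, or None if there are none
delay = parser.crawl_delay("*")  # Crawl-delay for the matching group, or None

print(sitemaps)  # ['https://manhattan.edu/sitemap.xml']
print(delay)     # 5
```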