direct.mit.edu
robots.txt

Robots Exclusion Standard data for direct.mit.edu

Resource Scan

Scan Details

Site Domain direct.mit.edu
Base Domain mit.edu
Scan Status Ok
Last Scan 2025-08-20T12:48:55+00:00
Next Scan 2025-09-19T12:48:55+00:00

Last Scan

Scanned 2025-08-20T12:48:55+00:00
URL https://direct.mit.edu/robots.txt
Domain IPs 104.18.12.179, 104.18.13.179, 2606:4700::6812:cb3, 2606:4700::6812:db3
Response IP 104.18.13.179
Found Yes
Hash e7fd5d70c8ce8f8eace1b901460c1a15a62c2df83b7c3c5fa2eed5d36ddf1ffe
SimHash 4c011535a533
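
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file. The report does not name the algorithm, so treating it as SHA-256 is an assumption. Under that assumption, a minimal sketch of how such a fingerprint could be computed to detect changes between scans follows; the function name body_digest and the sample bytes are hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return a hex SHA-256 digest of a robots.txt response body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real file served by direct.mit.edu.
sample = b"User-agent: *\nDisallow: /bin/\n"
digest = body_digest(sample)
print(len(digest))  # 64 hex digits, the same length as the Hash field
```

Comparing the digest from one scan with the digest from the next is enough to tell whether the file changed at all. It says nothing about how much changed, which is presumably what the shorter SimHash field is for.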

Groups

*

Rule Path
Disallow /bin/
Disallow /config/
Disallow /app_data/
Disallow /dlls/
Disallow /App_Themes/
Disallow /signin.aspx
Disallow /searchresults.aspx
Disallow /advancedsearch.aspx
Disallow /DownloadFile/
Disallow /downloadimage.aspx
Disallow /Citation/Download
Disallow /Shibboleth.sso/
Disallow /SignInShibboleth.aspx
Disallow /lookup/external-ref
Disallow /user/logout
Disallow /cgi/pmidlookup
Disallow /Themes/Client/app/img/favicons/manifest.json
Disallow /lookup/google-scholar
Disallow /store/*
Allow /cassette.axd/stylesheet/
Allow /cassette.axd/script/
Allow /cassette.axd/manifest.xml
Disallow /*.axd
Disallow /specificToClient1/
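
The group above mixes Allow and Disallow rules that overlap: the three /cassette.axd/ paths are allowed while /*.axd is disallowed. Under the longest-match rule of RFC 9309, the most specific (longest) matching path wins, and Allow wins a tie. The sketch below applies that rule to the paths listed above. The helper names are hypothetical, and it is an illustration of the matching rule, not the scanner's or any crawler's actual implementation.

```python
import re

# (directive, path) pairs copied from the "*" group in this report.
RULES = [
    ("disallow", "/bin/"),
    ("disallow", "/config/"),
    ("disallow", "/app_data/"),
    ("disallow", "/dlls/"),
    ("disallow", "/App_Themes/"),
    ("disallow", "/signin.aspx"),
    ("disallow", "/searchresults.aspx"),
    ("disallow", "/advancedsearch.aspx"),
    ("disallow", "/DownloadFile/"),
    ("disallow", "/downloadimage.aspx"),
    ("disallow", "/Citation/Download"),
    ("disallow", "/Shibboleth.sso/"),
    ("disallow", "/SignInShibboleth.aspx"),
    ("disallow", "/lookup/external-ref"),
    ("disallow", "/user/logout"),
    ("disallow", "/cgi/pmidlookup"),
    ("disallow", "/Themes/Client/app/img/favicons/manifest.json"),
    ("disallow", "/lookup/google-scholar"),
    ("disallow", "/store/*"),
    ("allow", "/cassette.axd/stylesheet/"),
    ("allow", "/cassette.axd/script/"),
    ("allow", "/cassette.axd/manifest.xml"),
    ("disallow", "/*.axd"),
    ("disallow", "/specificToClient1/"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Without "$", the pattern is a prefix match.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Return True if url_path may be crawled under longest-match rules."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, path in rules:
        # re.match anchors at the start of url_path (prefix semantics).
        if pattern_to_regex(path).match(url_path):
            is_allow = directive == "allow"
            if len(path) > best_len or (len(path) == best_len and is_allow):
                best_len = len(path)
                allowed = is_allow
    return allowed


print(is_allowed("/cassette.axd/script/app.js"))  # True: longer Allow wins
print(is_allowed("/other.axd"))                   # False: matches /*.axd
print(is_allowed("/store/checkout"))              # False: matches /store/*
```

Path matching is case-sensitive, so /App_Themes/ and /app_themes/ are different paths under these rules. Parsers that apply the first matching rule instead of the longest one reach the same result for the /cassette.axd/ paths here, because the Allow lines are listed before Disallow /*.axd.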

Other Records

Field Value
sitemap https://direct.mit.edu/data/sitemap/Site_1/sitemap_books_site1.xml
sitemap https://direct.mit.edu/data/sitemap/Site_1000081/sitemap.xml
sitemap https://direct.mit.edu/data/sitemap/sitemap.xml
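
Sitemap records are not tied to any user-agent group, which is why the report lists them separately from the rules. A minimal sketch of collecting them from raw robots.txt text follows. The function name extract_sitemaps is hypothetical, and the sample text is reconstructed from the three records above because the raw file is not reproduced in this report.

```python
def extract_sitemaps(robots_txt):
    """Return the values of all "Sitemap:" lines, in file order."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop comments. Simplification: a URL containing "#" would be cut.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are case-insensitive; the value keeps its own colons.
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Reconstructed sample, not the verbatim file.
sample = """\
User-agent: *
Disallow: /bin/
Sitemap: https://direct.mit.edu/data/sitemap/Site_1/sitemap_books_site1.xml
Sitemap: https://direct.mit.edu/data/sitemap/Site_1000081/sitemap.xml
Sitemap: https://direct.mit.edu/data/sitemap/sitemap.xml
"""

for url in extract_sitemaps(sample):
    print(url)
```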

Warnings

  • 2 invalid lines.