ag.purdue.edu
robots.txt

Robots Exclusion Standard data for ag.purdue.edu

Resource Scan

Scan Details

Site Domain ag.purdue.edu
Base Domain purdue.edu
Scan Status Ok
Last Scan 2025-07-31T10:33:19+00:00
Next Scan 2025-08-30T10:33:19+00:00

Last Scan

Scanned 2025-07-31T10:33:19+00:00
URL https://ag.purdue.edu/robots.txt
Domain IPs 128.210.7.147
Response IP 128.210.7.147
Found Yes
Hash 2d1931a335f785a1ff064266511a83f10e299bf513634804335198eaf1fa9308
SimHash 31408968ddd3
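
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the robots.txt response body. The report does not name the algorithm, so that is an assumption. Under that assumption, a minimal sketch for fingerprinting a fetched copy and comparing it with the recorded value might look like this (the function name `fingerprint` is hypothetical):

```python
import hashlib

# Hash value recorded in the scan report above.
RECORDED_HASH = "2d1931a335f785a1ff064266511a83f10e299bf513634804335198eaf1fa9308"


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report itself does not state which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


def is_unchanged(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """True if the body's digest equals the recorded hash."""
    return fingerprint(body) == recorded.lower()
```

If the digests differ, either the file changed since the scan or the scanner normalizes the body (line endings, trailing whitespace) before hashing.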

Groups

*

Rule Path
Allow /
Disallow /directory/department/
Disallow /testing/
Disallow /news/testing/
Disallow /events/testing/
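
The rules in this group can be evaluated with the longest-match precedence described in RFC 9309, where the most specific matching path wins and Allow wins a tie. The sketch below hard-codes the five rules listed above and handles plain path prefixes only. It does not implement the `*` and `$` wildcards, which none of these rules use. The names `RULES` and `is_allowed` are illustrative, not part of any library.

```python
# Rules copied from the "*" group above, in their listed order.
RULES = [
    ("allow", "/"),
    ("disallow", "/directory/department/"),
    ("disallow", "/testing/"),
    ("disallow", "/news/testing/"),
    ("disallow", "/events/testing/"),
]


def is_allowed(path: str, rules=RULES) -> bool:
    """Decide whether a URL path may be crawled under the given rules.

    The longest matching prefix wins; on a tie in length, "allow" wins.
    A path that matches no rule is allowed by default.
    """
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ("/news/", "/news/testing/draft", "/directory/department/agry"):
        print(p, is_allowed(p))
```

A hand-rolled matcher is used here because Python's `urllib.robotparser` applies rules in file order (first match wins). If `Allow: /` comes first in the file, as it does in the listing above, that parser would report every path as allowed.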

Other Records

Field Value
sitemap https://ag.purdue.edu/ag-sitemap.xml
sitemap https://ag.purdue.edu/events/ag-sitemap.xml
sitemap https://ag.purdue.edu/news/ag-sitemap.xml
sitemap https://ag.purdue.edu/coa-pdf-sitemap.xml
sitemap https://ag.purdue.edu/directory/ag-coa-directory-list-sitemap.xml

Comments

  • Site Maps
  • Disallow directory sub-folder crawls
  • Disallow for live site testing