stolaf.edu
robots.txt

Robots Exclusion Standard data for stolaf.edu

Resource Scan

Scan Details

Site Domain stolaf.edu
Base Domain stolaf.edu
Scan Status Ok
Last Scan 2024-10-31T00:30:53+00:00
Next Scan 2024-11-30T00:30:53+00:00

Last Scan

Scanned 2024-10-31T00:30:53+00:00
URL https://stolaf.edu/robots.txt
Redirect https://wp.stolaf.edu/robots.txt
Redirect Domain wp.stolaf.edu
Redirect Base stolaf.edu
Domain IPs 199.91.183.19
Redirect IPs 199.91.183.19, 2620:132:b020:20::19
Response IP 199.91.183.19
Found Yes
Hash a350ba127b0f66959b638bf9d590b2c9a06ddb7a9dc98d957a8b7800ddf21158
SimHash 2b35d762e309
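
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The report does not name the algorithm or say whether the body is normalized before hashing, so the following is only a sketch under the assumption that the hash is SHA-256 over the raw response bytes. The helper names `sha256_hex` and `matches_reported` are hypothetical.

```python
import hashlib

# Value copied from the "Hash" row of the scan report.
REPORTED_HASH = "a350ba127b0f66959b638bf9d590b2c9a06ddb7a9dc98d957a8b7800ddf21158"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a fetched robots.txt body against the reported hash.

    Assumption: the scanner hashed the raw, unmodified response bytes.
    If it normalized line endings or encoding first, this will not match.
    """
    return sha256_hex(body) == reported.lower()
```

A caller would pass the bytes fetched from the Redirect URL to `matches_reported`. A mismatch could mean either that the file changed since the scan or that the assumption about the hashing method is wrong.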

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /bin/
Disallow /aliases/
Disallow /src/
Disallow /tmp/
Disallow /files/
Disallow /naha/
Disallow /orgs/list/
Disallow /classhomepages/
Disallow /it/classroom-database/
Disallow /newstudents/track-your-progress/
Disallow /catalog/*/admissions/
Disallow /apps/olecard/checkbalance/
Disallow /apps/fram

mj12bot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /
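
The groups above can be checked against concrete URLs. The sketch below rewrites a subset of the listed rules as robots.txt text and evaluates them with Python's standard `urllib.robotparser`. The crawler name `ExampleCrawler` and the tested paths are made up for illustration. The wildcard rule `/catalog/*/admissions/` is left out on purpose, because `urllib.robotparser` does plain prefix matching and would not interpret the `*` as a wildcard.

```python
from urllib.robotparser import RobotFileParser

# A subset of the groups from the report, written back as robots.txt text.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /orgs/list/

User-agent: mj12bot
Disallow: /

User-agent: ahrefsbot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without performing any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A generic crawler falls under the "*" group: only the listed prefixes are blocked.
generic_blocked = parser.can_fetch("ExampleCrawler", "https://wp.stolaf.edu/tmp/report.html")
generic_allowed = parser.can_fetch("ExampleCrawler", "https://wp.stolaf.edu/about/")

# Named bots have their own group with "Disallow: /", so every path is blocked.
mj12_allowed = parser.can_fetch("MJ12bot", "https://wp.stolaf.edu/about/")
```

Here `generic_blocked` and `mj12_allowed` evaluate to `False`, while `generic_allowed` evaluates to `True`. User-agent matching in this parser is case-insensitive, which is why `MJ12bot` matches the lowercase `mj12bot` group shown in the report.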

Comments

  • this file is managed with ansible
  • per https://simtechdev.com/blog/good-and-bad-bots-to-control-to-save-server-resources-and-improve-performance/