juniata.edu
robots.txt

Robots Exclusion Standard data for juniata.edu

Resource Scan

Scan Details

Site Domain juniata.edu
Base Domain juniata.edu
Scan Status Ok
Last Scan 2024-06-22T04:09:32+00:00
Next Scan 2024-07-22T04:09:32+00:00

Last Scan

Scanned 2024-06-22T04:09:32+00:00
URL https://juniata.edu/robots.txt
Redirect https://www.juniata.edu/robots.txt
Redirect Domain www.juniata.edu
Redirect Base juniata.edu
Domain IPs 104.120.138.144, 23.212.251.16
Redirect IPs 104.81.138.73, 104.81.138.89, 2600:1413:a000::1734:2831, 2600:1413:a000::1734:2840
Response IP 23.32.29.105
Found Yes
Hash 78481972ab9d3174fe54c62ad570fb504583ca031fc5d35c2def8cba1679c9f0
SimHash 9818950ac1f0

Groups

*

Rule Path
Disallow /_ux-testing/
Disallow /_training/
Disallow /_resources/
Disallow /error_pages/
Disallow /initial/
Disallow /ldp-images/
Disallow /ads/
Disallow /webdev/
Disallow /dev/
Disallow /services/
Disallow *.csi
Disallow /html

discobot

Rule Path
Disallow /
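The two groups above can be checked against individual paths with a short script. The following is a minimal sketch, assuming Python's standard `urllib.robotparser` module; the robots.txt text is reconstructed from the tables above rather than copied from the live file, so it will not reproduce the listed hash. `examplebot` is a hypothetical user agent chosen for illustration. Note that this parser matches rules by path prefix, so the `*.csi` rule is not expected to behave as a wildcard here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups tables above. The exact bytes of the live
# file are an assumption; only the rules and their order are taken from the scan.
ROBOTS_TXT = """\
User-agent: *
Disallow: /_ux-testing/
Disallow: /_training/
Disallow: /_resources/
Disallow: /error_pages/
Disallow: /initial/
Disallow: /ldp-images/
Disallow: /ads/
Disallow: /webdev/
Disallow: /dev/
Disallow: /services/
Disallow: *.csi
Disallow: /html

User-agent: discobot
Disallow: /
"""


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on www.juniata.edu."""
    return build_parser().can_fetch(agent, "https://www.juniata.edu" + path)


if __name__ == "__main__":
    # "examplebot" is a hypothetical crawler that falls under the "*" group.
    checks = [
        ("examplebot", "/admissions/"),
        ("examplebot", "/dev/page.html"),
        ("discobot", "/admissions/"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, a generic crawler is blocked only from the listed directories, while `discobot` is blocked from the whole site.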

Comments

  • This file can be used to affect how search engines and other web site crawlers see your site.
  • For more information, please see http://www.w3.org/TR/html4/appendix/notes.html#h-B.4.1.1
  • WebMatrix 2.0