balisage.net
robots.txt

Robots Exclusion Standard data for balisage.net

Resource Scan

Scan Details

Site Domain balisage.net
Base Domain balisage.net
Scan Status Ok
Last Scan 2025-11-20T06:22:59+00:00
Next Scan 2025-12-20T06:22:59+00:00

Last Scan

Scanned 2025-11-20T06:22:59+00:00
URL https://balisage.net/robots.txt
Domain IPs 50.87.140.133
Response IP 50.87.140.133
Found Yes
Hash 04d608dd4808f354f30fa9877dfbf33f4070671e1b89c7957901675b7d54a44c
SimHash 3494516e87fc
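
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan record does not name the algorithm. Assuming it is SHA-256 over the raw response body (an assumption, not something the record confirms), a fetched copy of the file could be compared against the recorded value roughly like this. The function names are hypothetical.

```python
import hashlib

# Digest recorded in the scan details above.
RECORDED_HASH = "04d608dd4808f354f30fa9877dfbf33f4070671e1b89c7957901675b7d54a44c"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Check a fetched robots.txt body against a recorded digest.

    Assumes the scanner hashed the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch would not necessarily mean the file changed; the scanner may normalise the body (line endings, encoding) before hashing.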

Groups

*

Rule Path
Disallow /peer/
Disallow /regforms/
Disallow /Drafts/
Disallow /eval/
Disallow /temp/
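
To illustrate how a crawler might apply the rules listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The rule lines are transcribed from the `*` group. The user-agent name `ExampleBot` and the sample paths are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules transcribed from the "*" group in the scan record.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /peer/",
    "Disallow: /regforms/",
    "Disallow: /Drafts/",
    "Disallow: /eval/",
    "Disallow: /temp/",
]


def build_parser(lines):
    """Build a parser from in-memory robots.txt lines (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def is_allowed(parser, path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, "https://balisage.net" + path)


parser = build_parser(ROBOTS_LINES)

# Sample paths are made up; only the directory prefixes come from the record.
for path in ("/Drafts/paper.html", "/drafts/paper.html", "/index.html"):
    print(path, is_allowed(parser, path))
```

Path matching here is a case-sensitive prefix test, so `/Drafts/` is blocked while a lowercase `/drafts/` is not covered by these rules.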

Comments

  • This file sets out restrictions that most spiders and automatic web-indexers voluntarily abide by. For more information, check out: http://info.webcrawler.com/mak/projects/robots/norobots.html
  • Stay away from these:
  • Add more disallow lines for specific files that you do not want automated web-searching tools to access, such as a membership list, or temporary files that you intend to remove or rename quickly. For example: Disallow: /year-end-clearance.html