cygwin.org
robots.txt

Robots Exclusion Standard data for cygwin.org

Resource Scan

Scan Details

Site Domain cygwin.org
Base Domain cygwin.org
Scan Status Ok
Last Scan 2025-10-24T23:40:39+00:00
Next Scan 2025-11-23T23:40:39+00:00

Last Scan

Scanned 2025-10-24T23:40:39+00:00
URL https://cygwin.org/robots.txt
Domain IPs 8.43.85.97
Response IP 8.43.85.97
Found Yes
Hash 32d5854bb53929e4db44541990af050b0b896f889be861503276c02c4e82904e
SimHash a8795bf0d0e2

Groups

scooter

Rule Path
Disallow /ml/

test crawler

Rule Path
Disallow /

digext

Rule Path
Disallow /

cw crawler

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

googlebot

Rule Path
Disallow donations.html

*

Rule Path
Disallow /bugzilla/
Disallow /cgi-bin/
Disallow /cgi/
Disallow /cgi2-bin/
Disallow /ml/overseers/
Disallow /packages/
Disallow /snapshots/
Disallow /viewcvs/
Disallow /viewvc/
Disallow /git/
Disallow /git-cygwin-packages/
Disallow /cgit/
Disallow setup-x86.exe
Disallow setup-x86_64.exe
Disallow mirrors.lst
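
To illustrate how a crawler might evaluate the groups listed above, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text in it is a partial reconstruction from this report, so its ordering and layout are assumptions rather than a copy of the real file. The agent name "examplebot" and the helper allowed() are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the rules from the groups in this report.
# Assumption: the real file's ordering, comments, and spacing may differ.
ROBOTS_LINES = """\
User-agent: sosospider
Disallow: /

User-agent: *
Disallow: /bugzilla/
Disallow: /cgi-bin/
Disallow: /ml/overseers/
Disallow: /packages/
Disallow: /git/
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` on cygwin.org?"""
    return parser.can_fetch(agent, "https://cygwin.org" + path)


if __name__ == "__main__":
    for agent, path in [
        ("sosospider", "/"),
        ("examplebot", "/"),
        ("examplebot", "/packages/"),
        ("examplebot", "/ml/overseers/"),
    ]:
        print(agent, path, allowed(agent, path))
```

A named group such as sosospider takes precedence over the * group for that agent, while any agent without its own group falls back to the * rules. Paths are matched by prefix, so the sketch only uses rules that start with a slash.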

Other Records

Field Value
crawl-delay 60
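
A crawler that chooses to honor the crawl-delay record could space its requests along the lines of the sketch below. The report lists the record outside any group, so treating it as a 60-second minimum gap for every agent is an assumption. The names seconds_to_wait() and polite_sleep() are hypothetical.

```python
import time

# Value taken from the crawl-delay record in this report.
# Assumption: it is interpreted as seconds between successive requests.
CRAWL_DELAY_SECONDS = 60.0


def seconds_to_wait(last_request: float, now: float,
                    delay: float = CRAWL_DELAY_SECONDS) -> float:
    """Return how long to pause so requests are at least `delay` apart."""
    return max(0.0, last_request + delay - now)


def polite_sleep(last_request: float) -> None:
    """Sleep until the delay since `last_request` (monotonic time) has passed."""
    time.sleep(seconds_to_wait(last_request, time.monotonic()))
```

Keeping the arithmetic in seconds_to_wait() separate from the sleep makes the delay logic easy to check without waiting in real time.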

Comments

  • contact sourcemaster@sourceware.org for questions.
  • see http://www.robotstxt.org/ for information about the file format.