gcc.gnu.org
robots.txt

Robots Exclusion Standard data for gcc.gnu.org

Resource Scan

Scan Details

Site Domain gcc.gnu.org
Base Domain gnu.org
Scan Status Ok
Last Scan 2025-02-15T21:41:37+00:00
Next Scan 2025-03-17T21:41:37+00:00

Last Scan

Scanned 2025-02-15T21:41:37+00:00
URL https://gcc.gnu.org/robots.txt
Domain IPs 2620:52:3:1:0:246e:9693:128c, 8.43.85.97
Response IP 8.43.85.97
Found Yes
Hash 1053a6a20482fa5864afbaa1326a07d0274ee0d76bf4f26347c25af3146eb617
SimHash 21a37fd4dadc

Groups

*

Rule Path
Disallow /viewvc/
Disallow /viewcvs
Disallow /git/
Disallow /cgit/
Disallow /svn
Disallow /cgi-bin/
Disallow /bugzilla/buglist.cgi
Disallow /bugzilla/show_bug.cgi*ctype%3Dxml*
Disallow /bugzilla/attachment.cgi
Disallow /bugzilla/showdependencygraph.cgi
Disallow /bugzilla/showdependencytree.cgi
Disallow /wiki/*?action=*
Disallow /wiki/*?diffs=*
Disallow /wiki/*?highlight=*
Disallow /wiki/*?calparms=*
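
Several of the rules above use the `*` wildcard, which Python's standard `urllib.robotparser` treats as a literal character. The following is a minimal sketch of how a crawler might test a path against this group, assuming the common wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end of the path). The names `DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical, and percent-encoding normalization is left out for brevity.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW = [
    "/viewvc/",
    "/viewcvs",
    "/git/",
    "/cgit/",
    "/svn",
    "/cgi-bin/",
    "/bugzilla/buglist.cgi",
    "/bugzilla/show_bug.cgi*ctype%3Dxml*",
    "/bugzilla/attachment.cgi",
    "/bugzilla/showdependencygraph.cgi",
    "/bugzilla/showdependencytree.cgi",
    "/wiki/*?action=*",
    "/wiki/*?diffs=*",
    "/wiki/*?highlight=*",
    "/wiki/*?calparms=*",
]


def pattern_to_regex(pattern):
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters and a trailing
    '$' anchors the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path):
    """Return True if the path (with any query string) matches a Disallow rule."""
    # re.match anchors at the start of the path, so plain rules act as prefixes.
    return any(rule.match(path) for rule in RULES)
```

Because the group contains no Allow rules, any single match is enough to block a path, so the sketch does not need longest-match precedence. For example, `is_disallowed("/wiki/HomePage?action=edit")` is true while `is_disallowed("/wiki/HomePage")` is false.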

Other Records

Field Value
crawl-delay 60
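
A crawler that honors the crawl-delay record would wait at least 60 seconds between requests. The sketch below shows one way to read that value with Python's standard `urllib.robotparser`. The `ROBOTS_LINES` fragment is reconstructed from this report for illustration and is not the verbatim file; the user agent `ExampleBot` and the helper `seconds_to_wait` are hypothetical.

```python
import urllib.robotparser

# Illustrative fragment reconstructed from the records in this report.
# The real file's line order and full contents may differ.
ROBOTS_LINES = [
    "User-agent: *",
    "Crawl-delay: 60",
    "Disallow: /git/",
    "Disallow: /cgi-bin/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_LINES)

# "ExampleBot" is a made-up user agent; it falls back to the "*" group.
delay = parser.crawl_delay("ExampleBot")


def seconds_to_wait(last_request_time, now, delay):
    """Return how long to sleep so requests are at least `delay` seconds apart."""
    if delay is None:
        return 0.0
    return max(0.0, delay - (now - last_request_time))
```

In a fetch loop, the caller would pass timestamps from `time.monotonic()` and sleep for the returned number of seconds before issuing the next request.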

Comments

  • See http://www.robotstxt.org/wc/robots.html for information about the file format.
  • Contact gcc@gcc.gnu.org for questions.