Robots Exclusion Standard data for lists.gnu.org

Resource Scan

Scan Details

Site Domain lists.gnu.org
Base Domain gnu.org
Scan Status Ok
Last Scan 2025-03-03T18:47:04+00:00
Next Scan 2025-04-02T18:47:04+00:00

Last Scan

Scanned 2025-03-03T18:47:04+00:00
URL https://lists.gnu.org/robots.txt
Domain IPs 2001:470:142::17, 209.51.188.17
Response IP 209.51.188.17
Found Yes
Hash f11604d72816711003e00f6293d8f50f3c3bc24132c99e9abc96040eb69254ae
SimHash 101cd143cfea
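
The Hash value is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming it is SHA-256 over the raw response body, a change check between scans might look like the sketch below. The helper names `body_digest` and `has_changed` are hypothetical, and the SimHash field is not covered here.

```python
import hashlib

# Digest recorded by the scan, copied from the Hash field of the report.
RECORDED_HASH = "f11604d72816711003e00f6293d8f50f3c3bc24132c99e9abc96040eb69254ae"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Report whether a freshly fetched body differs from the recorded digest."""
    return body_digest(body) != recorded.lower()
```

A caller would pass the bytes fetched from https://lists.gnu.org/robots.txt to `has_changed`; a `False` result would suggest the file is unchanged since the scan, provided the SHA-256 assumption holds.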

Groups

*

Rule Path
Disallow /archive/html/www-commits/

Other Records

Field Value
crawl-delay 4
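
To illustrate how the `*` group applies to a crawler that has no group of its own, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the fields in this report, so its exact wording and ordering are an assumption, and `ExampleBot` is a hypothetical user agent.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's "*" group; the real file's exact
# wording and ordering are not shown in the report (assumption).
WILDCARD_GROUP = """\
User-agent: *
Disallow: /archive/html/www-commits/
Crawl-delay: 4
"""

parser = RobotFileParser()
parser.parse(WILDCARD_GROUP.splitlines())

BASE = "https://lists.gnu.org"
AGENT = "ExampleBot"  # hypothetical crawler with no group of its own

# The disallowed prefix is blocked; sibling paths stay open.
commits_ok = parser.can_fetch(AGENT, BASE + "/archive/html/www-commits/")
archive_ok = parser.can_fetch(AGENT, BASE + "/archive/html/")
delay = parser.crawl_delay(AGENT)

print(commits_ok, archive_ok, delay)  # False True 4
```

The `Disallow` rule is a path-prefix match, so only URLs under `/archive/html/www-commits/` are excluded. `crawl-delay` is a non-standard field, which is presumably why the report lists it under Other Records.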

amazonbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
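
A crawler that matches a named group uses that group in place of `*`, not in addition to it. That is why amazonbot has all paths allowed while still carrying its own crawl delay. A rough sketch with `urllib.robotparser` follows; the file text is again reconstructed from the report (assumption), and `ExampleBot` is a hypothetical agent included for contrast.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" and amazonbot groups in the report (assumption).
ROBOTS = """\
User-agent: *
Disallow: /archive/html/www-commits/
Crawl-delay: 4

User-agent: amazonbot
Crawl-delay: 10
"""

rp = RobotFileParser()
rp.parse(ROBOTS.splitlines())

url = "https://lists.gnu.org/archive/html/www-commits/"

# amazonbot matches its own group, which has no Disallow rules.
amazon_ok = rp.can_fetch("Amazonbot", url)
amazon_delay = rp.crawl_delay("Amazonbot")

# An unlisted agent falls back to the "*" group.
other_ok = rp.can_fetch("ExampleBot", url)
other_delay = rp.crawl_delay("ExampleBot")

print(amazon_ok, amazon_delay)  # True 10
print(other_ok, other_delay)    # False 4
```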

imagesiftbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

zoombot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

siteauditbot

Rule Path
Disallow /

semrushbot-ba

Rule Path
Disallow /

semrushbot-si

Rule Path
Disallow /

semrushbot-swa

Rule Path
Disallow /

splitsignalbot

Rule Path
Disallow /

semrushbot-ocob

Rule Path
Disallow /

jamesbot

Rule Path
Disallow /

oncrawl

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

awariobot

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

academicbotrtu

Rule Path
Disallow /
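
The 27 groups from imagesiftbot through academicbotrtu each carry a single `Disallow /` rule, which excludes the whole site for that agent. The sketch below rebuilds those groups and confirms the effect with `urllib.robotparser`. It writes one group per agent, which is an assumption (the real file may combine several User-agent lines into one group), and `build_robots` and `ExampleBot` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Agents the report lists with "Disallow /" (names as shown, lowercased).
BLOCKED_AGENTS = [
    "imagesiftbot", "mj12bot", "dataforseobot", "blexbot", "ahrefsbot",
    "barkrowler", "screaming frog seo spider", "zoombot", "magpie-crawler",
    "dotbot", "rogerbot", "semrushbot", "siteauditbot", "semrushbot-ba",
    "semrushbot-si", "semrushbot-swa", "splitsignalbot", "semrushbot-ocob",
    "jamesbot", "oncrawl", "awariorssbot", "awariosmartbot", "awariobot",
    "serpstatbot", "netestate ne crawler", "panscient.com", "academicbotrtu",
]


def build_robots(agents):
    """Emit one 'Disallow: /' group per agent, separated by blank lines."""
    groups = [f"User-agent: {name}\nDisallow: /" for name in agents]
    return "\n\n".join(groups) + "\n"


rp = RobotFileParser()
rp.parse(build_robots(BLOCKED_AGENTS).splitlines())

# Product tokens are passed here rather than full User-Agent headers.
root = "https://lists.gnu.org/"
still_allowed = [a for a in BLOCKED_AGENTS if rp.can_fetch(a, root)]
unlisted_ok = rp.can_fetch("ExampleBot", root)

print(len(BLOCKED_AGENTS), still_allowed, unlisted_ok)  # 27 [] True
```

In this reduced file there is no `*` group, so the unlisted agent is allowed everywhere; in the scanned file it would instead fall under the `*` rules shown earlier.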

Comments

  • RT #1942639 avoid non-canonical & drafts in search results
  • Majestic - SEO
  • DataForSeo - SEO
  • webmeup - SEO
  • Ahrefs - SEO
  • babbar - SEO
  • Screamingfrog - SEO
  • Seozoom - SEO
  • Brandwatch - SEO
  • Begin Moz - SEO
  • Not to be confused with Mozilla.
  • End Moz - SEO
  • Begin Semrush - SEO
  • End Semrush - SEO
  • cognitiveSEO - SEO
  • oncrawl - SEO
  • BEGIN Awario - Marketing
  • END Awario - Marketing
  • SERPSTAT - SEO
  • website-datenbank.de - Search engine?
  • Ignores crawl-delay and does not help us.
  • Aggressive Latvian Academic Integrity bot that does not help us.