marxists.org
robots.txt

Robots Exclusion Standard data for marxists.org

Resource Scan

Scan Details

Site Domain marxists.org
Base Domain marxists.org
Scan Status Ok
Last Scan 2025-11-30T12:03:20+00:00
Next Scan 2025-12-30T12:03:20+00:00

Last Scan

Scanned 2025-11-30T12:03:20+00:00
URL https://marxists.org/robots.txt
Redirect https://www.marxists.org/robots.txt
Redirect Domain www.marxists.org
Redirect Base marxists.org
Domain IPs 2a01:4f9:3080:26d2::2, 65.109.101.238
Redirect IPs 2a01:4f9:3080:26d2::2, 65.109.101.238
Response IP 65.109.101.238
Found Yes
Hash 4cbfa3b2f00e888311bfcea0b58a2e5deff974cc6766a74197866ed47457c194
SimHash 206bb3e5ef90
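
The Hash value above is 64 hexadecimal digits, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. As a minimal sketch under that assumption, a fingerprint of a fetched robots.txt body could be computed as follows (the function name `content_fingerprint` is hypothetical, and the result is not guaranteed to match the scanner's value):

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a raw robots.txt body.

    Assumption: the scanner hashes the unmodified response bytes with
    SHA-256. The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


if __name__ == "__main__":
    sample = b"User-agent: *\nDisallow: /cgi-bin/\n"
    print(content_fingerprint(sample))
```

Comparing the digest from one scan to the next is a cheap way to detect that the file changed at all; it says nothing about how much changed, which is presumably what the separate SimHash field is for.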

Groups

*

Rule      Path                                   Comment
Disallow  /admin/errors/                         skip the errors directory
Disallow  /admin/intro/history/webstats/         not reliable
Disallow  /admin/intro/history/marxists/traffic  not reliable
Disallow  /admin/new-archives/                   skip news archive directory
Disallow  /archive/justo/                        mirror
Disallow  /chinese/update/                       skip chinese news
Disallow  /espanol/admin/                        skip spanish admin directory
Disallow  /espanol/justo/                        mirror
Disallow  /espanol/trotsky/ceip/                 it's not linked to, only goofs up link check
Disallow  /history/canada/socialisthistory/      mirror
Disallow  /korean/trotsky/                       bad links
Disallow  /cgi-bin/                              scripts
Disallow  /admin/js/                             scripts
Disallow  /webstats/                             not reliable

Other Records

Field Value
crawl-delay 1

w3c-checklink

Rule Path
Disallow (empty)

Comments

  • User-Agent: LinkChecker
  • Disallow:
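
To illustrate how the groups listed above apply to individual URLs, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the tables in this report, so the line order and the position of the crawl-delay record are assumptions; the agent name `ExampleBot` and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's tables. The real file's line order,
# comments, and spacing are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/errors/
Disallow: /admin/intro/history/webstats/
Disallow: /admin/intro/history/marxists/traffic
Disallow: /admin/new-archives/
Disallow: /archive/justo/
Disallow: /chinese/update/
Disallow: /espanol/admin/
Disallow: /espanol/justo/
Disallow: /espanol/trotsky/ceip/
Disallow: /history/canada/socialisthistory/
Disallow: /korean/trotsky/
Disallow: /cgi-bin/
Disallow: /admin/js/
Disallow: /webstats/
Crawl-delay: 1

User-agent: w3c-checklink
Disallow:
"""

BASE = "https://www.marxists.org"


def build_parser(text: str = ROBOTS_TXT) -> RobotFileParser:
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Check whether `agent` may fetch `path` on the scanned site."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    parser = build_parser()
    checks = [
        ("ExampleBot", "/archive/marx/index.htm"),
        ("ExampleBot", "/archive/justo/index.htm"),
        ("ExampleBot", "/cgi-bin/search"),
        ("w3c-checklink", "/cgi-bin/search"),
    ]
    for agent, path in checks:
        print(agent, path, is_allowed(parser, agent, path))
    print("crawl delay:", parser.crawl_delay("ExampleBot"))
```

The empty Disallow in the w3c-checklink group places no restriction on that agent, so it may fetch paths that the `*` group blocks for everyone else. Note that `urllib.robotparser` does simple prefix matching, and other crawlers may interpret the same file slightly differently.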