Robots Exclusion Standard data for marxists.info

Resource Scan

Scan Details

Site Domain marxists.info
Base Domain marxists.info
Scan Status Ok
Last Scan 2025-11-29T00:50:14+00:00
Next Scan 2025-12-29T00:50:14+00:00

Last Scan

Scanned 2025-11-29T00:50:14+00:00
URL https://www.marxists.info/robots.txt
Domain IPs 104.37.173.99
Response IP 104.37.173.99
Found Yes
Hash 4cbfa3b2f00e888311bfcea0b58a2e5deff974cc6766a74197866ed47457c194
SimHash 206bb3e5ef90

Groups

*

Rule      Path                                   Comment
Disallow  /admin/errors/                         skip the errors directory
Disallow  /admin/intro/history/webstats/         not reliable
Disallow  /admin/intro/history/marxists/traffic  not reliable
Disallow  /admin/new-archives/                   skip news archive directory
Disallow  /archive/justo/                        mirror
Disallow  /chinese/update/                       skip chinese news
Disallow  /espanol/admin/                        skip spanish admin directory
Disallow  /espanol/justo/                        mirror
Disallow  /espanol/trotsky/ceip/                 it's not linked to, only goofs up link check
Disallow  /history/canada/socialisthistory/      mirror
Disallow  /korean/trotsky/                       bad links
Disallow  /cgi-bin/                              scripts
Disallow  /admin/js/                             scripts
Disallow  /webstats/                             not reliable

Other Records

Field Value
crawl-delay 1
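
As a rough illustration of how a crawler might apply the "*" group, the sketch below feeds a few of the listed rules to Python's standard urllib.robotparser. It assumes the crawl-delay record belongs to the "*" group; the agent name ExampleBot and the two test paths are hypothetical and do not come from the scan.

```python
from urllib.robotparser import RobotFileParser

# A subset of the "*" group, transcribed from the table above.
# Assumption: the crawl-delay record belongs to this group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/errors/
Disallow: /archive/justo/
Disallow: /cgi-bin/
Disallow: /webstats/
Crawl-delay: 1
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://www.marxists.info"

# Hypothetical URLs, chosen only to exercise the prefix matching.
blocked = parser.can_fetch("ExampleBot", BASE + "/cgi-bin/search")
allowed = parser.can_fetch("ExampleBot", BASE + "/archive/marx/index.htm")

# Seconds a polite crawler would wait between requests.
delay = parser.crawl_delay("ExampleBot")

print(blocked, allowed, delay)
```

Rules are matched as path prefixes, so any URL under /cgi-bin/ is refused while paths outside the listed directories stay fetchable.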

w3c-checklink

Rule Path
Disallow
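
An empty Disallow value places no restriction on the named agent, so this group in effect exempts w3c-checklink from the "*" rules. A minimal sketch with Python's urllib.robotparser follows; the version suffix in the user-agent string, the ExampleBot name, and the test path are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# One rule from the "*" group plus the w3c-checklink group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

User-agent: w3c-checklink
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical path under a directory that "*" disallows.
url = "https://www.marxists.info/cgi-bin/example"

# The agent-specific group takes precedence over "*",
# and its empty Disallow allows every path.
checklink_ok = parser.can_fetch("W3C-checklink/4.81", url)
other_ok = parser.can_fetch("ExampleBot", url)

print(checklink_ok, other_ok)
```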

Comments

  • User-Agent: LinkChecker
  • Disallow: