asianwiki.com robots.txt

Robots Exclusion Standard data for asianwiki.com

Resource Scan

Scan Details

Site Domain asianwiki.com
Base Domain asianwiki.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-10-22T22:03:27+00:00
Next Scan 2024-11-21T22:03:27+00:00
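The failure stage and reason above correspond to an HTTP 4xx response during the fetch step. As a minimal sketch of what such a fetch-and-classify check might look like (using only Python's standard library; the timeout and error handling are illustrative assumptions, not the scanner's actual code):

    import urllib.error
    import urllib.request

    def fetch_robots(domain: str) -> str:
        """Fetch https://<domain>/robots.txt, surfacing client errors distinctly."""
        url = f"https://{domain}/robots.txt"
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.read().decode("utf-8", errors="replace")
        except urllib.error.HTTPError as exc:
            if 400 <= exc.code < 500:
                # The report's "Server returned a client error" case (HTTP 4xx).
                raise RuntimeError(f"client error {exc.code} fetching {url}") from exc
            raise  # 5xx and other HTTP errors propagate unchanged

Per the status above, a call like fetch_robots("asianwiki.com") would have failed this way during the 2024-10-22 scan.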

Last Successful Scan

Scanned 2024-09-23T22:02:30+00:00
URL https://asianwiki.com/robots.txt
Domain IPs 104.26.8.64, 104.26.9.64, 172.67.74.142
Response IP 104.26.9.64
Found Yes
Hash d8e88efa2bbdbcfc709c99b6c6dd652f29aa82cd6c08f20bfff28d8158012ed4
SimHash a21231ebeef6
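The Hash field can be used to verify a fetched copy of the file. Assuming it is a SHA-256 digest of the raw response body (its 64 hex characters are consistent with that, though the report does not name the algorithm), a check looks like this sketch:

    import hashlib

    RECORDED_HASH = "d8e88efa2bbdbcfc709c99b6c6dd652f29aa82cd6c08f20bfff28d8158012ed4"

    def matches_recorded_hash(body: bytes) -> bool:
        # Assumption: the report's Hash is SHA-256 over the raw robots.txt bytes.
        return hashlib.sha256(body).hexdigest() == RECORDED_HASH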

Groups

*

Rule Path
Disallow /adsense/
Disallow /APCAW2012/
Disallow /cache/
Disallow /commentsdirectory/
Disallow /webstats/

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

wget

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

ia_archiver

No rules defined. All paths allowed.
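The groups above map directly back onto robots.txt directives (User-agent and Disallow lines). As a sanity check, this sketch rebuilds a few of the listed groups and queries them with Python's standard urllib.robotparser; the test URLs are illustrative, and the ia_archiver group is reconstructed as an empty Disallow, one common way to express "all paths allowed":

    from urllib.robotparser import RobotFileParser

    # Three of the groups listed above; the remaining named agents all follow
    # the same "Disallow: /" pattern shown here for wget.
    ROBOTS_LINES = [
        "User-agent: *",
        "Disallow: /adsense/",
        "Disallow: /APCAW2012/",
        "Disallow: /cache/",
        "Disallow: /commentsdirectory/",
        "Disallow: /webstats/",
        "",
        "User-agent: wget",
        "Disallow: /",
        "",
        "User-agent: ia_archiver",
        "Disallow:",
    ]

    rp = RobotFileParser()
    rp.parse(ROBOTS_LINES)

    print(rp.can_fetch("*", "https://asianwiki.com/cache/page"))   # False: /cache/ is disallowed
    print(rp.can_fetch("wget", "https://asianwiki.com/anything"))  # False: wget is fully blocked
    print(rp.can_fetch("ia_archiver", "https://asianwiki.com/"))   # True: no paths are disallowed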

Comments

  • Sorry, wget in its recursive mode is a frequent problem. Please read the
    man page and use it properly; there is a --wait option you can use to set
    the delay between hits, for instance. (A rate-limiting sketch follows this
    list.)
  • Hits many times per second, not acceptable:
    http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit:
    http://www.webreaper.net/
  • Don't allow the Wayback Machine (ia_archiver) to index user pages.
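The wget comment's advice generalizes to any recursive fetcher: put a delay between requests. A minimal sketch of that idea (the one-second delay and target URL are illustrative, not values from the source):

    import time
    import urllib.request

    DELAY_SECONDS = 1.0  # analogous to `wget --wait=1`

    def polite_fetch(urls):
        # Pause between hits instead of hammering the server, per the
        # robots.txt comment above.
        for url in urls:
            with urllib.request.urlopen(url, timeout=10) as resp:
                yield url, resp.status
            time.sleep(DELAY_SECONDS)

    for url, status in polite_fetch(["https://asianwiki.com/robots.txt"]):
        print(status, url)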