hoangweb.net
robots.txt

Robots Exclusion Standard data for hoangweb.net

Resource Scan

Scan Details

Site Domain hoangweb.net
Base Domain hoangweb.net
Scan Status Ok
Last Scan 2024-11-12T12:16:39+00:00
Next Scan 2024-11-19T12:16:39+00:00

Last Scan

Scanned 2024-11-12T12:16:39+00:00
URL https://hoangweb.net/robots.txt
Domain IPs 104.21.38.191, 172.67.137.236, 2606:4700:3035::6815:26bf, 2606:4700:3035::ac43:89ec
Response IP 104.21.38.191
Found Yes
Hash 700706fcc4ce6d43c022e71ddb8cf247cfd8944ffaf8d475cc7065da166259f4
SimHash f74fd8424730
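
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The report does not name the algorithm or say which bytes are hashed, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. `body_digest` is a hypothetical helper name, and the sample body is a placeholder, not the real hoangweb.net file.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Placeholder body for illustration only; not the actual file content.
sample_body = b"User-agent: nutch\nDisallow: /\n"
digest = body_digest(sample_body)
print(len(digest))  # digest length in hex characters
```

Comparing such a digest between two scans would show whether the file changed byte for byte. The much shorter SimHash field presumably serves for fuzzy comparison, which this sketch does not attempt.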

Groups

easouspider
ezooms
mj12bot
sitesucker
httrack
httrack website copier
teleport
teleportpro
emailcollector
emailsiphon
webbandit
webzip
webreaper
webstripper
web downloader
webcopier
offline explorer pro
offline commander
leech
websnake
blackwidow
http weazel

Rule Path
Disallow /

nutch

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.
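
To see how groups like these are applied by a client, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is a shortened reconstruction based on the groups listed above (only a few of the user agents are included), not the site's actual file. `Googlebot` is an arbitrary example of an agent that no group names.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction of the groups in this report (assumption:
# the real file lists more agents and may differ in layout).
ROBOTS_TXT = """\
User-agent: httrack
User-agent: sitesucker
User-agent: webzip
Disallow: /

User-agent: nutch
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://hoangweb.net/"
blocked_ripper = parser.can_fetch("HTTrack", url)    # agent named in the first group
blocked_nutch = parser.can_fetch("nutch", url)       # agent named in the second group
unlisted_agent = parser.can_fetch("Googlebot", url)  # no group matches this agent

print(blocked_ripper, blocked_nutch, unlisted_agent)
```

With this parser, agent matching is case-insensitive, and an agent that matches no group (there is no `*` group here) is allowed everywhere. Other parsers may resolve edge cases differently, such as an agent that appears in more than one group.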

Other Records

Field Value
crawl-delay 10
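
The report lists `crawl-delay 10` separately from the groups and does not show where the line sits in the file. As a hedged illustration of why placement matters, the sketch below feeds two made-up variants to Python's `urllib.robotparser`: one with the directive inside a `*` group, and one with it before any `User-agent` line. Both texts are assumptions written for this example, not the site's file.

```python
from urllib.robotparser import RobotFileParser


def delay_for(robots_txt: str, agent: str):
    """Parse robots.txt text and return the crawl delay seen for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


# Variant 1 (hypothetical): the directive belongs to a catch-all group.
IN_GROUP = "User-agent: *\nCrawl-delay: 10\n"

# Variant 2 (hypothetical): the directive appears before any group.
OUTSIDE_GROUP = "Crawl-delay: 10\n\nUser-agent: nutch\nDisallow: /\n"

delay_in_group = delay_for(IN_GROUP, "examplebot")
delay_outside = delay_for(OUTSIDE_GROUP, "examplebot")

print(delay_in_group, delay_outside)
```

Python's parser attaches a crawl delay only to the group it appears in, so the second variant yields no delay for any agent. Crawlers built on other parsers may treat a standalone line differently or ignore the directive entirely.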

Comments

  • protect my site from HTTrack or other software's ripping?

Warnings

  • 5 invalid lines.