lapmang.vn
robots.txt

Robots Exclusion Standard data for lapmang.vn

Resource Scan

Scan Details

Site Domain lapmang.vn
Base Domain lapmang.vn
Scan Status Ok
Last Scan 2024-11-18T20:22:58+00:00
Next Scan 2024-12-18T20:22:58+00:00

Last Scan

Scanned 2024-11-18T20:22:58+00:00
URL https://lapmang.vn/robots.txt
Domain IPs 103.57.221.26
Response IP 103.57.221.26
Found Yes
Hash 3d4d73429a09ad6b2cb5fc0d3849b4eaab2d966c2ff061844f54fefdc23e433a
SimHash 530dd8524fb1

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Allow /wp-admin/images/*
Disallow /wp-includes/
Allow /wp-includes/js
Allow /wp-includes/css
Disallow /trackback/
Disallow /xmlrpc.php
Disallow /feed/
Disallow /*/feed/*
Disallow /*/*?s=*
Disallow /*/*.inc$
Disallow /transfer/
Disallow /refer/
Disallow /*/cgi-bin/*
Disallow /*/blackhole/*
Disallow /*/trackback/*
Disallow /*/xmlrpc.php
Disallow /suggest/?*
Disallow /readme.html
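
The Allow and Disallow rules in this group overlap (for example /wp-admin/ is disallowed while /wp-admin/admin-ajax.php is allowed), so the result for a given path depends on how a crawler resolves conflicts. The sketch below assumes the longest-match precedence described in RFC 9309, where the most specific matching pattern wins and Allow wins a tie. It uses only a subset of the rules listed above; the names RULES, pattern_to_regex and is_allowed are illustrative and not part of the scan data, and individual crawlers may behave differently.

```python
import re

# Subset of the "*" group rules listed above, as (rule, path pattern) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/wp-admin/images/*"),
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-includes/js"),
    ("disallow", "/*/feed/*"),
    ("disallow", "/*/*.inc$"),
    ("disallow", "/readme.html"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if path may be crawled under longest-match precedence.

    Assumption: the longest matching pattern wins, Allow wins ties,
    and a path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed


if __name__ == "__main__":
    for p in ["/wp-admin/admin-ajax.php", "/wp-admin/options.php",
              "/wp-includes/js/jquery.js", "/blog/feed/", "/readme.html"]:
        print(p, "allowed" if is_allowed(p) else "blocked")
```

Under these assumptions /wp-admin/admin-ajax.php stays crawlable because its Allow pattern is longer than the /wp-admin/ Disallow, while other paths under /wp-admin/ remain blocked.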

ahrefsbot
baiduspider
easouspider
ezooms
yandexbot
mj12bot
sitesucker
httrack
httrack website copier
teleport
teleportpro
emailcollector
emailsiphon
webbandit
webzip
webreaper
webstripper
web downloader
webcopier
offline explorer pro
offline commander
leech
websnake
blackwidow
http weazel

Rule Path
Disallow /

nutch

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

pinerest

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

ia_archiver

Rule Path
Disallow /

Other Records

Field Value
sitemap https://lapmang.vn/sitemap.xml

Comments

  • Disallow: /wp-content/plugins/
  • Disallow: /wp-content/themes/
  • protect my site from HTTrack or other software's ripping?