Robots Exclusion Standard data for theravada.vn

Resource Scan

Scan Details

Site Domain theravada.vn
Base Domain theravada.vn
Scan Status Ok
Last Scan 2024-09-17T05:29:51+00:00
Next Scan 2024-10-17T05:29:51+00:00

Last Scan

Scanned 2024-09-17T05:29:51+00:00
URL https://theravada.vn/robots.txt
Domain IPs 104.21.27.168, 172.67.169.147, 2606:4700:3037::6815:1ba8, 2606:4700:3037::ac43:a993
Response IP 172.67.169.147
Found Yes
Hash f7837bb4018648e72160199eda4a027ffffdf7d33797ee58bb25613b6cfc54ac
SimHash f32dd8526db1

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Allow /wp-admin/images/*
Disallow /wp-includes/
Allow /wp-includes/js
Allow /wp-includes/css
Disallow /trackback/
Disallow /xmlrpc.php
Disallow /feed/
Disallow /*/feed/*
Disallow /*/*?s=*
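
The Allow lines in this group carve exceptions out of broader Disallow paths. A minimal sketch of how a crawler following the longest-match rule of RFC 9309 might evaluate them is shown here. The `RULES` list is copied from the table above; the helper names `pattern_to_regex` and `is_allowed` are hypothetical, and real crawlers may differ in details such as percent-encoding.

```python
import re

# Rules copied from the first "*" group in the scan above.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/wp-admin/images/*"),
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-includes/js"),
    ("allow", "/wp-includes/css"),
    ("disallow", "/trackback/"),
    ("disallow", "/xmlrpc.php"),
    ("disallow", "/feed/"),
    ("disallow", "/*/feed/*"),
    ("disallow", "/*/*?s=*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, optional '$' anchor)."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Longest matching pattern wins; on a tie, Allow beats Disallow."""
    best_len, verdict = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, verdict = len(pattern), allow
    return verdict


print(is_allowed("/wp-admin/"))                # False
print(is_allowed("/wp-admin/admin-ajax.php"))  # True
print(is_allowed("/category/feed/rss"))        # False
```

Under this reading, `/wp-admin/admin-ajax.php` stays reachable because its Allow pattern is longer than the `/wp-admin/` Disallow. Parsers that apply the first matching rule instead would reach a different answer for the same path.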

*

Rule Path
Disallow /wp-content/cache/

ahrefsbot
baiduspider
easouspider
ezooms
yandexbot
mj12bot
sitesucker
httrack
httrack website copier
teleport
teleportpro
emailcollector
emailsiphon
webbandit
webzip
webreaper
webstripper
web downloader
webcopier
offline explorer pro
offline commander
leech
websnake
blackwidow
http weazel
barkrowler/0.9
barkrowler
nutch

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

pinerest

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1
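
Several user agents appear in more than one group above: `*` is listed twice, and `mj12bot` appears both in the blocked list and in a group that carries only a crawl-delay. RFC 9309 says that groups matching the same user agent are combined, so the per-group note "No rules defined. All paths allowed." would not override the `Disallow /` that applies to `mj12bot` elsewhere. The sketch below illustrates that merge. The `GROUPS` data is an abbreviated copy of the scan (only a few agents and rules are included), and `rules_for` is a hypothetical helper, not part of any library.

```python
# Abbreviated copy of the groups in the scan above: (user agents, rules).
GROUPS = [
    (["*"], [("disallow", "/wp-admin/")]),          # first "*" group, shortened
    (["*"], [("disallow", "/wp-content/cache/")]),  # second "*" group
    (["ahrefsbot", "mj12bot", "httrack"], [("disallow", "/")]),  # shortened list
    (["mj12bot"], []),    # crawl-delay only, no path rules
    (["pinerest"], []),   # crawl-delay only, no path rules
]


def rules_for(product_token, groups=GROUPS):
    """Combine the rules of every group naming this agent; else fall back to '*'."""
    token = product_token.lower()
    specific = [rules for agents, rules in groups if token in agents]
    chosen = specific or [rules for agents, rules in groups if "*" in agents]
    return [rule for rules in chosen for rule in rules]


print(rules_for("MJ12bot"))    # [('disallow', '/')]
print(rules_for("pinerest"))   # []
print(rules_for("Googlebot"))  # both "*" groups combined
```

An agent with its own group but no path rules, such as `pinerest` here, ends up with an empty rule set and does not inherit the `*` rules.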

Comments

  • BEGIN W3TC ROBOTS
  • END W3TC ROBOTS
  • protect my site from HTTrack or other software's ripping?