hoangsoncomputer.com
robots.txt

Robots Exclusion Standard data for hoangsoncomputer.com

Resource Scan

Scan Details

Site Domain hoangsoncomputer.com
Base Domain hoangsoncomputer.com
Scan Status Ok
Last Scan 2024-10-26T16:03:07+00:00
Next Scan 2024-11-25T16:03:07+00:00

Last Scan

Scanned 2024-10-26T16:03:07+00:00
URL https://hoangsoncomputer.com/robots.txt
Domain IPs 45.252.249.19
Response IP 45.252.249.19
Found Yes
Hash 4965a4159d83ff4aa31d699850ddd7b2bdd30e2d7b6f3802b062aaeca74f4ee8
SimHash 530dd8526d91

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Allow /wp-admin/images/*
Disallow /wp-includes/
Allow /wp-includes/js
Allow /wp-includes/css
Disallow /trackback/
Disallow /xmlrpc.php
Disallow /feed/
Disallow /*/feed/*
Disallow /*/*?s=*
Disallow /*/*.inc$
Disallow /transfer/
Disallow /refer/
Disallow /*/cgi-bin/*
Disallow /*/blackhole/*
Disallow /*/trackback/*
Disallow /*/xmlrpc.php
Disallow /suggest/?*
Disallow /readme.html
Disallow /*?hpp_next=*
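
As a rough illustration of how the Allow and Disallow paths in the `*` group interact, the following Python sketch evaluates a few sample paths. It assumes the longest-match convention described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), treats `*` as any run of characters and a trailing `$` as end of path. The names `RULES`, `pattern_to_regex` and `is_allowed` are hypothetical, and only a subset of the rules above is copied in.

```python
import re

# Subset of the "*" group above, as (directive, path pattern) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/wp-admin/images/*"),
    ("disallow", "/wp-includes/"),
    ("allow", "/wp-includes/js"),
    ("disallow", "/feed/"),
    ("disallow", "/*/*.inc$"),
    ("disallow", "/*?hpp_next=*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be fetched under the given rules."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        # re.match anchors at the start, so patterns behave as prefixes.
        if pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            length = len(pattern)
            # Longest pattern wins; Allow wins when lengths are equal.
            if length > best_len or (length == best_len and allow):
                best_len = length
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for sample in ["/wp-admin/options.php", "/wp-admin/admin-ajax.php",
                   "/wp-includes/js/jquery.js", "/lib/config.inc"]:
        print(sample, is_allowed(sample))
```

Under these assumptions, `/wp-admin/admin-ajax.php` stays fetchable even though `/wp-admin/` is disallowed, because the Allow pattern is the longer match. Real crawlers may differ in details such as percent-encoding and how pattern length is measured.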

ahrefsbot
baiduspider
easouspider
ezooms
yandexbot
mj12bot
sitesucker
httrack
httrack website copier
teleport
teleportpro
emailcollector
emailsiphon
webbandit
webzip
webreaper
webstripper
web downloader
webcopier
offline explorer pro
offline commander
leech
websnake
blackwidow
http weazel

Rule Path
Disallow /

nutch

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

pinerest

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 1

ia_archiver

Rule Path
Disallow /

Other Records

Field Value
sitemap https://hoangsoncomputer.com/sitemap_index.xml
sitemap https://hoangsoncomputer.com/post-sitemap.xml
sitemap https://hoangsoncomputer.com/page-sitemap.xml
sitemap https://hoangsoncomputer.com/product-sitemap.xml
sitemap https://hoangsoncomputer.com/category-sitemap.xml
sitemap https://hoangsoncomputer.com/post_tag-sitemap.xml
sitemap https://hoangsoncomputer.com/product_cat-sitemap.xml
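
The groups listed above apply per user agent: a crawler obeys the group naming it and falls back to `*` otherwise. The Python sketch below shows one simplified way to make that choice, assuming case-insensitive substring matching on the crawler's product token and preferring the longest group name. The `GROUPS` table holds only a few of the groups from this report, and `select_group` is a hypothetical helper, not part of any library.

```python
# A few groups from the report; values are the Disallow paths of each group.
GROUPS = {
    "*": ["/wp-admin/", "/wp-includes/", "/feed/"],
    "nutch": ["/"],
    "ia_archiver": ["/"],
    "httrack": ["/"],
}


def select_group(user_agent, groups=GROUPS):
    """Return the name of the group a crawler with this token would obey."""
    token = user_agent.lower()
    # Collect every named group whose name occurs in the crawler's token.
    matches = [name for name in groups if name != "*" and name in token]
    if matches:
        # Prefer the most specific (longest) group name.
        return max(matches, key=len)
    # No named group applies, so the catch-all group is used.
    return "*"


if __name__ == "__main__":
    for agent in ["Nutch", "HTTrack", "ia_archiver", "Googlebot"]:
        group = select_group(agent)
        print(agent, "->", group, GROUPS[group])
```

With this table, a crawler identifying as `Nutch` is blocked from the whole site, while one identifying as `Googlebot` falls through to the `*` group.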

Comments

  • Disallow: /wp-content/plugins/
  • Disallow: /wp-content/themes/
  • protect my site from HTTrack or other software's ripping?
  • https://hoangsoncomputer.com/sitemap.xml