luxurywatchesusa.com
robots.txt

Robots Exclusion Standard data for luxurywatchesusa.com

Resource Scan

Scan Details

Site Domain luxurywatchesusa.com
Base Domain luxurywatchesusa.com
Scan Status Ok
Last Scan 2025-03-16T05:27:06+00:00
Next Scan 2025-04-15T05:27:06+00:00

Last Scan

Scanned 2025-03-16T05:27:06+00:00
URL https://luxurywatchesusa.com/robots.txt
Domain IPs 172.66.40.221, 172.66.43.35, 2606:4700:3108::ac42:28dd, 2606:4700:3108::ac42:2b23
Response IP 172.66.40.221
Found Yes
Hash 140d396bf4155ed9a70340be4ba982d0193db15e5166a5fa500a898b1114fcf1
SimHash 295371050995

Groups

*

Rule Path
Disallow /wp-content/plugins/
Disallow /wp-admin/
Disallow /readme.html
Disallow /refer/
Allow /wp-admin/admin-ajax.php
Disallow /wp-content/uploads/woo-import-export
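
As a rough illustration of how a crawler might evaluate the `*` group above, the sketch below applies the longest-match precedence described in RFC 9309: the most specific matching rule wins, and Allow wins a tie. The `RULES` list and the `is_allowed` function are hypothetical helpers for this example, not part of the scan, and wildcards are ignored because this group uses none.

```python
# Rules copied from the "*" group of the scan; the names are illustrative.
RULES = [
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/readme.html"),
    ("disallow", "/refer/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-content/uploads/woo-import-export"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = kind == "allow"
        # A longer prefix wins; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    for p in ["/wp-admin/options.php", "/wp-admin/admin-ajax.php", "/shop/"]:
        print(p, is_allowed(p))
```

Under this reading, `/wp-admin/admin-ajax.php` stays fetchable even though `/wp-admin/` is disallowed, because the Allow rule is the longer match. Parsers that apply rules in file order instead may reach a different result for that path.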

qwantify

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

boitho.com-dc

Rule Path
Disallow /

busiverse

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mirago-test-robot (http://www.miragorobot.com)

Rule Path
Disallow /

msnbot

Rule Path
Disallow /*.gif$
Disallow /*.jpeg$
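
The two msnbot rules use the `*` wildcard and the `$` end-of-path anchor. A minimal sketch of how such patterns could be matched is shown below, assuming the common interpretation in which `*` matches any run of characters and a trailing `$` pins the match to the end of the URL path. The helper names `pattern_to_regex` and `msnbot_blocked` are made up for this example.

```python
import re

# Patterns copied from the msnbot group of the scan.
MSNBOT_DISALLOW = ["/*.gif$", "/*.jpeg$"]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # r"\Z" matches only at the very end of the string.
    return re.compile(body + (r"\Z" if anchored else ""))


def msnbot_blocked(path):
    """Return True if any msnbot Disallow pattern matches `path`."""
    return any(pattern_to_regex(p).match(path) for p in MSNBOT_DISALLOW)


if __name__ == "__main__":
    for p in ["/images/a.gif", "/images/a.jpeg", "/images/a.jpg"]:
        print(p, msnbot_blocked(p))
```

Read this way, the patterns cover paths ending in `.gif` and `.jpeg` but not `.jpg`, and a URL with a query string after the extension would not match because of the `$` anchor.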

nimblecrawler

Rule Path
Disallow /

psbot

Rule Path
Disallow /

sirketce

Rule Path
Disallow /

semanticdiscovery

Rule Path
Disallow /

sogou

Rule Path
Disallow /

soso

Rule Path
Disallow /

sosoimagespider

Rule Path
Disallow /

tineye

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

webalta

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

zermelo

Rule Path
Disallow /

Other Records

Field Value
sitemap https://luxurywatchesusa.com/sitemap_index.xml

Comments

  • Disallow Baidu Bot (Japanese)
  • Disallow Boitho dc Bot (Norway)
  • Disallow Busiverse Bot (Turkey Sirketce/Busiverse )
  • Disallow CazoodleBot - from University of Illinois
  • Disallow Exabot Bot - Exalead
  • Disallow heritrix Bot - from Yell.Com
  • Disallow IRLbot - IRL Texas AM research bot
  • Disallow Jyxobot - Czech Webcrawler for Jyxo
  • Disallow Majestic12.co.uk
  • Disallow Mirago.com
  • Disallow MSN from seeing gifs and jpgs
  • Disallow NimbleCrawler (http://www.webmasterworld.com/forum93/858.htm)
  • Disallow psbot spidering of images and hub
  • Disallow Sirketce Bot (Turkey Sirketce/Busiverse )
  • Disallow semanticdiscovery - from Southern Utah University (Computer Science Dept.)
  • Disallow Sogou - Chinese Search Engine
  • Disallow SoSo - Chinese Search Engine
  • Disallow SoSoImageSpider - Chinese picture Search Engine
  • Disallow TinEye - Image trawler Search Engine
  • Disallow TurnITin - "This robot collects content from the Internet for the sole purpose of helping educational institutions prevent plagiarism"
  • Disallow Twiceler - Cuill (also Barred IPs on firewall)
  • Disallow Voilabot Bot - France Telecom
  • Disallow WebAlta Bot - Russian
  • Disallow YodaoBot - Chinese Search Engine
  • Disallow zermelo - Bot du Jour from Amazon - may need to block IP range