epson.com
robots.txt

Robots Exclusion Standard data for epson.com

Resource Scan

Scan Details

Site Domain epson.com
Base Domain epson.com
Scan Status Ok
Last Scan 2024-09-25T03:34:53+00:00
Next Scan 2024-10-25T03:34:53+00:00

Last Scan

Scanned 2024-09-25T03:34:53+00:00
URL https://epson.com/robots.txt
Domain IPs 45.60.106.158, 45.60.45.158
Response IP 45.60.106.158
Found Yes
Hash 5a16202e0a86274bd0dfc0ecd721666e489f069c752169363f152bf70a06b455
SimHash ae6633d8cdf4
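
The Hash value above is 64 hexadecimal digits, which is the length of a SHA-256 digest. The report does not name the algorithm, so SHA-256 is an assumption here. Under that assumption, a minimal sketch of how a scanner could fingerprint a fetched robots.txt body and detect changes between scans might look like this (`body_hash` and `has_changed` are hypothetical helper names):

```python
import hashlib

def body_hash(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes (SHA-256 is assumed)."""
    return hashlib.sha256(body).hexdigest()

def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest against the digest of a freshly fetched body."""
    return body_hash(body) != previous_hash
```

Any single-byte change to the file yields a different digest, so an exact hash only answers "changed or not"; a similarity hash such as the SimHash field is the usual complement for judging how much changed.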

Groups

*

Rule Path
Disallow /cart
Disallow /checkout
Disallow /my-account
Disallow /search
Disallow /supportsearch
Disallow /faqsearch
Disallow /Product-Exclusion
Disallow /Exclusion-folder-for-ink
Disallow /Epson-Customer-Appreciation-Program
Disallow /oidc
Disallow /login/sign-up
Disallow /notify
Disallow /dealerlocator
Disallow /servicelocator
Disallow /*?q=*
Disallow /*%26q%3D*
Disallow /*?bvroute=*
Disallow /*%26bvroute%3D*
Disallow /*?bvstate=*
Disallow /*%26bvstate%3D*
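
Several of the rules above use `*` wildcards. As a rough illustration of how such patterns are commonly evaluated (prefix matching, with `*` standing for any run of characters and a trailing `$` anchoring the end), here is a minimal sketch. It compares strings literally and does not normalise percent-encoding, which is a simplifying assumption; `pattern_to_regex` and `is_disallowed` are hypothetical names, and only a subset of the listed rules is included.

```python
import re

# A subset of the Disallow patterns listed for the `*` group.
DISALLOW = [
    "/cart",
    "/checkout",
    "/search",
    "/*?q=*",
    "/*%26q%3D*",
    "/*?bvroute=*",
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a start-anchored regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    All other characters are treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any pattern matches the path (including its query)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)
```

Because matching is by prefix, `/cart` also covers `/cart/items` and `/cartridges`, while `/*?q=*` catches a `q` query parameter only when it is the first parameter; the `%26q%3D` variant covers the percent-encoded `&q=` form.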

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /
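
To see how per-agent groups like these interact with the `*` group, the sketch below feeds a partial reconstruction of the file to Python's standard `urllib.robotparser`. The text in `ROBOTS_TXT` is rebuilt from this report and is not the verbatim file; `ExampleBot` is a made-up agent name, and the `allowed` helper is hypothetical. The `cazoodlebot` group is left out for brevity, and the versioned `dotbot/1.0` token is omitted because parsers differ in how they match such tokens.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the report, not the verbatim robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout

User-agent: mj12bot
Disallow: /

User-agent: gigabot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on epson.com."""
    return parser.can_fetch(agent, "https://epson.com" + path)

# Bots with their own group are blocked everywhere.
blocked_bot = allowed("MJ12bot", "/")
# Agents without a group of their own fall back to the `*` rules.
other_bot_home = allowed("ExampleBot", "/printers")
other_bot_cart = allowed("ExampleBot", "/cart")
```

Note that the standard-library parser does plain prefix matching and does not interpret `*` inside paths, so it is only suitable here for the non-wildcard rules.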

Other Records

Field Value
sitemap https://ftp.epson.com/marketing/us-sitemap.xml
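
A crawler can pick up the sitemap record without regard to user-agent groups. A minimal sketch using `RobotFileParser.site_maps()` from the standard library (available in Python 3.8 and later), with a shortened stand-in for the file contents:

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the file; only the Sitemap line matters here.
lines = [
    "User-agent: *",
    "Disallow: /cart",
    "Sitemap: https://ftp.epson.com/marketing/us-sitemap.xml",
]

parser = RobotFileParser()
parser.parse(lines)

# site_maps() returns None when no Sitemap lines were found.
sitemaps = parser.site_maps() or []
```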

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot

Warnings

  • `request-rate` is not a known field.
  • `visit-time` is not a known field.
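
Warnings of this kind can be produced by checking each field name against a list of recognised fields. The sketch below assumes that list is `user-agent`, `allow`, `disallow`, and `sitemap`; the scanner's actual list is not shown in this report. The values in `SAMPLE` are placeholders, since the report does not include the real ones, and `unknown_field_warnings` is a hypothetical name.

```python
# Assumed set of recognised fields; the scanner's real list is not shown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

def unknown_field_warnings(text: str) -> list[str]:
    """Return one warning per unrecognised field name, in order of appearance."""
    warnings = []
    seen = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.add(field)
            warnings.append(f"`{field}` is not a known field.")
    return warnings

# Placeholder values: the real ones are not included in the report.
SAMPLE = """\
User-agent: *
Disallow: /cart
Request-rate: PLACEHOLDER
Visit-time: PLACEHOLDER
"""

sample_warnings = unknown_field_warnings(SAMPLE)
```

Unrecognised fields are reported rather than treated as errors, so the remaining rules in the file can still be applied.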