sickcn.com
robots.txt

Robots Exclusion Standard data for sickcn.com

Resource Scan

Scan Details

Site Domain sickcn.com
Base Domain sickcn.com
Scan Status Ok
Last Scan 2024-10-02T04:15:22+00:00
Next Scan 2024-11-01T04:15:22+00:00
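
The gap between the two timestamps above can be checked with a short Python sketch. It only does arithmetic on the values shown in this report; that the scanner uses a fixed rescan interval is an assumption, not something the report states.

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details above (ISO 8601, UTC offset included).
last_scan = datetime.fromisoformat("2024-10-02T04:15:22+00:00")
next_scan = datetime.fromisoformat("2024-11-01T04:15:22+00:00")

# Difference between the scheduled next scan and the last completed scan.
interval = next_scan - last_scan
print(interval.days)
```

For these two values the difference is exactly 30 days.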

Last Scan

Scanned 2024-10-02T04:15:22+00:00
URL https://www.sickcn.com/robots.txt
Redirect https://www.sick.com/cn/en/robots.txt
Redirect Domain www.sick.com
Redirect Base sick.com
Domain IPs 80.72.131.66
Redirect IPs 96.17.96.11, 96.17.96.19
Response IP 23.44.4.160
Found Yes
Hash ee1d950d8ed467d66cb1f56fb133645eff269d13725aac505b090ea189a24eb8
SimHash 28469ff6c9fa

Groups

*

Rule Path
Disallow /cn/en/cart
Disallow /cn/en/checkout
Disallow /cn/en/my-account
Disallow /cn/en/my-company
Disallow /cn/en/compare
Disallow /cn/en/search

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

Other Records

Field Value
sitemap /cn/en/sitemap.xml
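
To illustrate how the groups and the sitemap record above might be evaluated, here is a minimal Python sketch using the standard-library `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstruction from this report, not the original file: the order of records and the exact capitalization are assumptions. `ExampleBot` is a hypothetical crawler name used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups and Other Records listed in this report.
# Assumption: record order and casing in the real file may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cn/en/cart
Disallow: /cn/en/checkout
Disallow: /cn/en/my-account
Disallow: /cn/en/my-company
Disallow: /cn/en/compare
Disallow: /cn/en/search

User-agent: cazoodlebot
Disallow: /

User-agent: mj12bot
Disallow: /

User-agent: dotbot/1.0
Disallow: /

User-agent: gigabot
Disallow: /

Sitemap: /cn/en/sitemap.xml
"""

# Host the scan was redirected to, per the Redirect field above.
BASE = "https://www.sick.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler with no dedicated group falls back to the "*" rules.
cart_allowed = parser.can_fetch("ExampleBot", BASE + "/cn/en/cart")
products_allowed = parser.can_fetch("ExampleBot", BASE + "/cn/en/products")

# mj12bot has its own group that disallows everything.
mj12_allowed = parser.can_fetch("mj12bot", BASE + "/cn/en/products")

# Sitemap records are exposed separately from the per-agent rules (Python 3.8+).
sitemaps = parser.site_maps()

print(cart_allowed, products_allowed, mj12_allowed, sitemaps)
```

The `/cn/en/products` path is a hypothetical URL chosen because it matches none of the listed Disallow prefixes. The sketch deliberately makes no claim about the `dotbot/1.0` group: Python's parser appears to compare only the part of a crawler name before any `/`, so a group token that itself contains a version suffix may not match the way other parsers would treat it.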

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot