idec.com
robots.txt

Robots Exclusion Standard data for idec.com

Resource Scan

Scan Details

Site Domain idec.com
Base Domain idec.com
Scan Status Ok
Last Scan 2024-09-04T20:50:55+00:00
Next Scan 2024-10-04T20:50:55+00:00

Last Scan

Scanned 2024-09-04T20:50:55+00:00
URL https://idec.com/robots.txt
Redirect https://www.idec.com/robots.txt
Redirect Domain www.idec.com
Redirect Base idec.com
Domain IPs 23.215.7.10, 23.215.7.15, 2600:1413:b000:1b::17d7:70a, 2600:1413:b000:1b::17d7:70f
Redirect IPs 23.215.7.10, 23.215.7.15, 2600:1413:b000:1b::17d7:70a, 2600:1413:b000:1b::17d7:70f
Response IP 96.17.180.46
Found Yes
Hash 4079aef293f5ff77633e7f667ffee3fc9c42080f02c986fa79abe4750b1df8de
SimHash 385777b4eef2

Groups

*

Rule Path
Disallow /en/cart
Disallow /en/checkout
Disallow /en/my-account
Disallow /en/my-company

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.idec.com/en/sitemap.xml

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot
  • Block ChatGPT
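
The groups and the sitemap record listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is a reconstruction from the parsed groups in this report, not a copy of the live file, and the agent name ExampleBot and the path /en/products are hypothetical values chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups in this report (an assumption): the live
# file's exact wording, ordering, and comments may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /en/cart
Disallow: /en/checkout
Disallow: /en/my-account
Disallow: /en/my-company

User-agent: cazoodlebot
Disallow: /

User-agent: mj12bot
Disallow: /

User-agent: dotbot/1.0
Disallow: /

User-agent: gigabot
Disallow: /

User-agent: chatgpt-user
Disallow: /

Sitemap: https://www.idec.com/en/sitemap.xml
"""

BASE = "https://www.idec.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler that falls under the "*" group.
print(parser.can_fetch("ExampleBot", BASE + "/en/cart"))
print(parser.can_fetch("ExampleBot", BASE + "/en/products"))

# Crawlers with their own group are blocked from every path.
print(parser.can_fetch("MJ12bot", BASE + "/en/products"))
print(parser.can_fetch("ChatGPT-User", BASE + "/en/products"))

# Sitemap records are exposed separately from the groups (Python 3.8+).
print(parser.site_maps())
```

In this sketch a generic crawler is refused the cart path but allowed the hypothetical product path, while MJ12bot and ChatGPT-User are refused everywhere. The dotbot/1.0 group is left out of the checks because parsers may differ in how they match a group name that carries a version suffix.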