plowandhearth.com
robots.txt

Robots Exclusion Standard data for plowandhearth.com

Resource Scan

Scan Details

Site Domain plowandhearth.com
Base Domain plowandhearth.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-11-05T04:53:25+00:00
Next Scan 2024-11-19T04:53:25+00:00

Last Successful Scan

Scanned 2024-09-28T04:52:56+00:00
URL https://plowandhearth.com/robots.txt
Redirect https://www.plowhearth.com/robots.txt
Redirect Domain www.plowhearth.com
Redirect Base plowhearth.com
Domain IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Redirect IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Response IP 151.101.130.132
Found Yes
Hash 2c2d7f9fdca09f1b6cc6e4c417c02c1d45d35a941a6f22cec177eb7963f7904b
SimHash 2c545f16cff9
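
A recorded hash like the one above can be used to detect whether a later fetch of robots.txt returns different content. The sketch below assumes the 64-character hex value is a SHA-256 digest of the raw response body; the report does not state the algorithm, so this is a guess based on the digest length. The helper names `body_digest` and `has_changed` are hypothetical.

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "2c2d7f9fdca09f1b6cc6e4c417c02c1d45d35a941a6f22cec177eb7963f7904b"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """True when the body's digest differs from the recorded one."""
    return body_digest(body) != recorded.lower()


if __name__ == "__main__":
    # Placeholder body; a real check would pass the fetched robots.txt bytes.
    sample = b"User-agent: *\nDisallow: /cart\n"
    print(body_digest(sample), has_changed(sample))
```

If the scanner normalizes the body before hashing (for example, line endings or trailing whitespace), digests computed this way would not match even for an unchanged file.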

Groups

*

Rule Path
Disallow /cart
Disallow /checkout
Disallow /my-account

Other Records

Field Value
crawl-delay 10

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://ph.czpcsmvyqi-plowandhe1-p1-public.model-t.cc.commerce.ondemand.com/sitemap.xml
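
The groups and records above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the listed groups, so its line order, casing, and the placement of `Crawl-delay` inside the `*` group are assumptions. `ExampleBot` and the `/garden` path are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed above; the original file's exact
# layout and comments are not reproduced here (assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /my-account
Crawl-delay: 10

User-agent: cazoodlebot
Disallow: /

User-agent: mj12bot
Disallow: /

User-agent: dotbot/1.0
Disallow: /

User-agent: gigabot
Disallow: /

Sitemap: https://ph.czpcsmvyqi-plowandhe1-p1-public.model-t.cc.commerce.ondemand.com/sitemap.xml
"""

# Redirect target recorded in the last successful scan.
BASE = "https://www.plowhearth.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if this parser lets `agent` fetch `path` on the site."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    checks = [
        ("ExampleBot", "/cart"),    # blocked for all robots
        ("ExampleBot", "/garden"),  # hypothetical path, not listed
        ("MJ12bot", "/garden"),     # agent blocked site-wide
    ]
    for agent, path in checks:
        print(agent, path, allowed(agent, path))
    print("crawl delay:", parser.crawl_delay("ExampleBot"))
    print("sitemaps:", parser.site_maps())
```

Parsers differ in how they match a user-agent value that carries a version suffix, such as `dotbot/1.0`, so that group is left out of the checks. `site_maps()` requires Python 3.8 or later.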

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot