jmclaughlin.com
robots.txt

Robots Exclusion Standard data for jmclaughlin.com

Resource Scan

Scan Details

Site Domain jmclaughlin.com
Base Domain jmclaughlin.com
Scan Status Ok
Last Scan 2024-05-27T22:26:26+00:00
Next Scan 2024-06-10T22:26:26+00:00

Last Scan

Scanned 2024-05-27T22:26:26+00:00
URL https://jmclaughlin.com/robots.txt
Redirect https://www.jmclaughlin.com/robots.txt
Redirect Domain www.jmclaughlin.com
Redirect Base jmclaughlin.com
Domain IPs 23.227.38.65
Redirect IPs 23.227.38.74, 2620:127:f00f:e::
Response IP 23.227.38.74
Found Yes
Hash f08b8dbf97602ba896f16bc936e553a922dcab55e05018c55b8a0a55aeb3fcc4
SimHash 0f909f2a5587
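
The Hash value above is 64 hexadecimal characters long, which is consistent with a 256-bit digest such as SHA-256, although the scan does not name the algorithm. As a minimal sketch under that assumption, a fetched robots.txt body could be fingerprinted as follows to detect changes between scans. The function name `fingerprint` and the sample bodies are hypothetical.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256 is used here only because its 64-character hex
    output matches the length of the Hash field; the scanner's actual
    algorithm is not stated.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies from two scans; any byte-level change alters the digest.
previous = fingerprint(b"User-agent: *\nDisallow: /cart\n")
current = fingerprint(b"User-agent: *\nDisallow: /cart\nDisallow: /orders\n")
changed = previous != current
```

An exact digest only signals that something changed. The shorter SimHash field presumably serves as a similarity measure, which this sketch does not attempt to reproduce.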

Groups

*

Rule Path
Allow /sitemap.xml
Disallow /admin
Disallow /cart
Disallow /orders
Disallow /checkouts/
Disallow /checkout
Disallow /carts
Disallow /account
Disallow /style-guide
Disallow /contact-us
Disallow /account/login
Disallow /account/recover
Disallow /account/register
Disallow /pages/request-a-catalog
Disallow /pages/unsubscribe
Disallow /pages/email-preferences

Other Records

Field Value
crawl-delay 1
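
The rules listed for the `*` group can be exercised with Python's standard `urllib.robotparser`. This is a sketch, not the site's actual file: the robots.txt text below is reconstructed from the table above, so its exact wording and line order are assumptions, and `ExampleBot` is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the "*" group from the scan data above.
ROBOTS_TXT = """\
User-agent: *
Allow: /sitemap.xml
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkouts/
Disallow: /checkout
Disallow: /carts
Disallow: /account
Disallow: /style-guide
Disallow: /contact-us
Disallow: /account/login
Disallow: /account/recover
Disallow: /account/register
Disallow: /pages/request-a-catalog
Disallow: /pages/unsubscribe
Disallow: /pages/email-preferences
Crawl-delay: 1
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Check a path against the parsed rules for a (hypothetical) crawler."""
    return parser.can_fetch(agent, path)


for path in ("/sitemap.xml", "/cart", "/account/login", "/collections/dresses"):
    print(path, is_allowed(path))

# Seconds a crawler in the "*" group is asked to wait between requests.
print("crawl-delay:", parser.crawl_delay("ExampleBot"))
```

Rules match by path prefix, so `Disallow /account` already covers `/account/login`, `/account/recover` and `/account/register`, and `Disallow /cart` already covers `/carts`. Python's parser applies the first matching rule in file order rather than the longest match, but for the paths checked here both approaches give the same answer.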

adsbot-google

Rule Path
Disallow /admin
Disallow /cart
Disallow /orders
Disallow /checkouts/
Disallow /checkout
Disallow /carts
Disallow /account
Disallow /style-guide
Disallow /contact-us
Disallow /account/login
Disallow /account/recover
Disallow /account/register
Disallow /pages/request-a-catalog
Disallow /pages/unsubscribe
Disallow /pages/email-preferences

Other Records

Field Value
sitemap https://www.jmclaughlin.com/sitemap.xml

Comments

  • Crawlers Setup
  • Allowable Index
  • Paths
  • Google adsbot ignores robots.txt unless specifically named!
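
The last comment explains why the file repeats its rules in a separate `adsbot-google` group. As a rough sketch of how group selection plays out, the snippet below parses both groups with Python's standard `urllib.robotparser`. The robots.txt text is an assumed reconstruction, abbreviated to three paths per group for brevity, and `ExampleBot` is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

# Assumed, abbreviated reconstruction of the two groups in the scan.
ROBOTS_TXT = """\
User-agent: *
Allow: /sitemap.xml
Disallow: /admin
Disallow: /cart
Disallow: /account
Crawl-delay: 1

User-agent: adsbot-google
Disallow: /admin
Disallow: /cart
Disallow: /account
Sitemap: https://www.jmclaughlin.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that is named explicitly is matched against its own group,
# so the named group has to carry its own copy of the Disallow rules.
adsbot_cart = parser.can_fetch("AdsBot-Google", "/cart")
other_cart = parser.can_fetch("ExampleBot", "/cart")

# Only the "*" group declares a crawl-delay in the scan data.
adsbot_delay = parser.crawl_delay("AdsBot-Google")
other_delay = parser.crawl_delay("ExampleBot")

# Sitemap records apply to the whole file, not to a single group
# (site_maps() requires Python 3.8 or later).
sitemaps = parser.site_maps()

print(adsbot_cart, other_cart, adsbot_delay, other_delay, sitemaps)
```

In this reconstruction the named group blocks the same paths as the wildcard group but carries no crawl-delay, matching the Other Records shown for each group.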