whsmith.co.uk
robots.txt

Robots Exclusion Standard data for whsmith.co.uk

Resource Scan

Scan Details

Site Domain whsmith.co.uk
Base Domain whsmith.co.uk
Scan Status Ok
Last Scan 2024-04-21T15:46:30+00:00
Next Scan 2024-05-21T15:46:30+00:00

Last Scan

Scanned 2024-04-21T15:46:30+00:00
URL https://www.whsmith.co.uk/robots.txt
Domain IPs 18.155.68.101, 18.155.68.125, 18.155.68.19, 18.155.68.2
Response IP 18.155.68.19
Found Yes
Hash 8ef0d2f2539b2f374196da820914b1fd9fdd7b222a0a0747de0d852968f8388b
SimHash 44b67240f123

Groups

*

Rule Path
Disallow /on/demandware.store/
Disallow */s/whsmith/dw/
Disallow /register$
Disallow /register/
Disallow /login$
Disallow /login/
Disallow /my-profile$
Disallow /my-profile/
Disallow /account$
Disallow /account/
Disallow /account-editpassword
Disallow /forgotten-password
Disallow /address-book/
Disallow /address-book$
Disallow /shopping-basket
Disallow /shopping-basket/
Disallow /checkout-login
Disallow /checkout-shipping
Disallow /checkout-payment
Disallow /checkout-confirm
Disallow /orders$
Disallow /orders?
Disallow /orders/
Disallow /search/
Disallow /search?q=
Disallow *search?keywords=
Disallow */search?cgid
Disallow /?q=
Disallow *?c_
Disallow *?brand=
Disallow *?price=
Disallow *?srule=
Disallow *?view=
Disallow */product-image/zoom/
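
Several of the paths above use the two common robots.txt wildcards: `*` (any run of characters) and a trailing `$` (end of URL). The following is a minimal sketch of how a crawler might test a URL path against such rules, assuming those wildcard semantics. The function names and the `RULES` subset are illustrative and are not part of the scan data. Because this group contains only Disallow rules, the sketch does not handle Allow precedence.

```python
import re

# A subset of the Disallow paths listed above, copied verbatim.
RULES = [
    "/register$",
    "/register/",
    "/account-editpassword",
    "/orders$",
    "/orders?",
    "/orders/",
    "*?brand=",
    "*/product-image/zoom/",
]

def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the match at the end of
    the path. Every other character (including '?') is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(path: str, patterns: list[str]) -> bool:
    """Return True if any pattern matches from the start of the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in patterns)

if __name__ == "__main__":
    for path in ["/register", "/registered-users", "/books?brand=x"]:
        print(path, is_disallowed(path, RULES))
```

Under these assumptions, `/register$` blocks only the exact path `/register`, while `/register/` blocks everything beneath it. That is why both forms appear in the table.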

Other Records

Field Value
crawl-delay 1
sitemap https://www.whsmith.co.uk/sitemap_index.xml
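
The crawl-delay and sitemap records can be read with Python's standard `urllib.robotparser`, as sketched below. The `ROBOTS_LINES` list is a reconstruction from the fields reported in this scan, not the verbatim file. The original line order is an assumption, and only one Disallow rule is included for brevity. The standard-library parser is understood to match rule paths by plain prefix, so it would not interpret the `*` and `$` wildcards used elsewhere in this file.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan fields above; the layout is assumed.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /account/",
    "Crawl-delay: 1",
    "Sitemap: https://www.whsmith.co.uk/sitemap_index.xml",
]

def load_records(lines):
    """Parse robots.txt lines already held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser

parser = load_records(ROBOTS_LINES)
delay = parser.crawl_delay("*")   # seconds between requests, or None
sitemaps = parser.site_maps()     # list of sitemap URLs, or None (Python 3.8+)

if __name__ == "__main__":
    print("crawl-delay:", delay)
    print("sitemaps:", sitemaps)
```

A crawler that honours the record would sleep for `delay` seconds between requests to this host.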

Comments

  • amended 5 January 2023
  • User Pages
  • Search results
  • Refine Filters
  • Images
  • Crawl Delay - 1 URL max per second [Google ignores this]