lgcstandards.com
robots.txt

Robots Exclusion Standard data for lgcstandards.com

Resource Scan

Scan Details

Site Domain	lgcstandards.com
Base Domain	lgcstandards.com
Scan Status	Ok
Last Scan	2025-12-23T04:59:23+00:00
Next Scan	2026-01-22T04:59:23+00:00

Last Scan

Scanned	2025-12-23T04:59:23+00:00
URL	https://lgcstandards.com/robots.txt
Redirect	https://www.lgcstandards.com/robots.txt
Redirect Domain	www.lgcstandards.com
Redirect Base	lgcstandards.com
Domain IPs	3.170.229.113, 3.170.229.28, 3.170.229.42, 3.170.229.56
Redirect IPs	3.170.229.113, 3.170.229.28, 3.170.229.42, 3.170.229.56
Response IP	3.170.229.42
Found	Yes
Hash	81191dcafd688918a48e7b995bb046e581d19f45d58a3402d56c68a15eef2b34
SimHash	70441797cdf0

Groups

*

Rule	Path
Disallow	/GB/en/cart
Disallow	/GB/en/checkout
Disallow	/GB/en/checkout/multi
Disallow	/GB/en/checkout/multi/quote
Disallow	/GB/en/my-account
Disallow	/GB/en/orderpt
Disallow	/GB/en/login
Disallow	/GB/en/search
Disallow	/GB/en/selectCustomerAccountLogin
Disallow	/GB/en/bulkOrder
Disallow	/GB/en/quickorder
Disallow	/GB/en/advsearch
Disallow	/GB/en/PageNotFound
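
As an illustration of how a crawler might evaluate the rules in this group, the sketch below rebuilds the `*` group from the table above and checks a few paths with Python's standard `urllib.robotparser`. The line list is a reconstruction from this report, not the original file, so ordering and comments may differ; the `example-crawler` agent name and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the table above (assumption: not the verbatim file).
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /GB/en/cart",
    "Disallow: /GB/en/checkout",
    "Disallow: /GB/en/checkout/multi",
    "Disallow: /GB/en/checkout/multi/quote",
    "Disallow: /GB/en/my-account",
    "Disallow: /GB/en/orderpt",
    "Disallow: /GB/en/login",
    "Disallow: /GB/en/search",
    "Disallow: /GB/en/selectCustomerAccountLogin",
    "Disallow: /GB/en/bulkOrder",
    "Disallow: /GB/en/quickorder",
    "Disallow: /GB/en/advsearch",
    "Disallow: /GB/en/PageNotFound",
]

# Host taken from the redirect target reported in the scan details.
BASE = "https://www.lgcstandards.com"


def build_parser(lines):
    """Parse robots.txt lines without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def is_allowed(parser, path, agent="example-crawler"):
    """Return True if `agent` may fetch `path` according to the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(ROBOTS_LINES)
for path in ("/GB/en/cart", "/GB/en/checkout/multi/quote", "/GB/en/", "/US/en/cart"):
    print(path, is_allowed(parser, path))
```

Disallow rules are matched as path prefixes, so `/GB/en/checkout` already covers the two longer checkout entries, and only paths under `/GB/en/` are affected by this group.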

Other Records

Field	Value	Comment
crawl-delay	10	10 seconds between page requests
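
A minimal sketch of how a client could honor this record, assuming it reads the value with `urllib.robotparser` and spaces its own requests accordingly. The function names, the `example-crawler` agent, and the sample lines are hypothetical; `crawl-delay` is a non-standard field and not every crawler respects it.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_seconds(robots_lines, agent, default=0.0):
    """Return the crawl-delay that applies to `agent`, or `default` if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    delay = parser.crawl_delay(agent)
    return float(delay) if delay is not None else default


def seconds_to_wait(last_request_at, now, delay):
    """Return how long to sleep so consecutive requests are at least `delay` seconds apart."""
    return max(0.0, last_request_at + delay - now)


# Sample lines reconstructed from this report (assumption: not the verbatim file).
sample = ["User-agent: *", "Disallow: /GB/en/cart", "Crawl-delay: 10"]
delay = crawl_delay_seconds(sample, "example-crawler")
print(delay, seconds_to_wait(last_request_at=100.0, now=104.0, delay=delay))
```

In a real fetch loop, the caller would pass `time.monotonic()` readings and sleep for the returned number of seconds before the next request.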

cazoodlebot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

dotbot/1.0

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	/GB/en/sitemap.xml
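
The sitemap value is shown here as a path rather than a full URL. A consumer that wants a fetchable address could resolve it against the location the robots.txt was actually served from; the sketch below assumes that location is the redirect target listed under Last Scan. The helper name is hypothetical.

```python
from urllib.parse import urljoin

# Redirect target reported in the scan details above.
ROBOTS_URL = "https://www.lgcstandards.com/robots.txt"


def resolve_sitemap(robots_url, sitemap_value):
    """Resolve a sitemap value (absolute URL or path) against the robots.txt URL."""
    return urljoin(robots_url, sitemap_value.strip())


print(resolve_sitemap(ROBOTS_URL, "/GB/en/sitemap.xml"))
```

`urljoin` leaves values that are already absolute URLs unchanged, so the same helper works for both forms.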

Comments

For all robots
Block access to specific groups of pages
Allow search crawlers to discover the sitemap
Block CazoodleBot as it does not present correct accept content headers
Block MJ12bot as it is just noise
Block dotbot as it cannot parse base urls properly
Block Gigabot

Warnings

`request-rate` is not a known field.
`visit-time` is not a known field.
