crlaurence.ca
robots.txt

Robots Exclusion Standard data for crlaurence.ca

Resource Scan

Scan Details

Site Domain	crlaurence.ca
Base Domain	crlaurence.ca
Scan Status	Ok
Last Scan	2025-09-08T16:56:36+00:00
Next Scan	2025-10-08T16:56:36+00:00

Last Scan

Scanned	2025-09-08T16:56:36+00:00
URL	https://crlaurence.ca/robots.txt
Redirect	https://www.crlaurence.ca/robots.txt
Redirect Domain	www.crlaurence.ca
Redirect Base	crlaurence.ca
Domain IPs	13.107.246.38, 2620:1ec:29:1::59
Redirect IPs	13.107.253.59, 2620:1ec:29:1::59
Response IP	13.107.253.59
Found	Yes
Hash	ba03b5395b39ed91d8a45deb6ae9973d4680620dd5a00c8e47c2a0062962d6c3
SimHash	3c550716effa

Groups

*

Rule	Path
Disallow	/cart
Disallow	/checkout
Disallow	/my-account

cazoodlebot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

dotbot/1.0

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

bingbot

Rule	Path
Disallow

petalbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://www.crlaurence.ca/sitemap.xml
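
As a rough illustration of how a client might read this record, the sketch below feeds a single `Sitemap:` line to Python's standard `urllib.robotparser` and asks it for the declared sitemaps. The one-line input is a reconstruction from the table above, not the file as served, and `RobotFileParser.site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the record appears in the file as a standard "Sitemap:" line.
# Only the URL itself comes from the scan; the line is reconstructed.
SITEMAP_LINE = "Sitemap: https://www.crlaurence.ca/sitemap.xml"

parser = RobotFileParser()
parser.parse([SITEMAP_LINE])

# site_maps() returns None when no Sitemap records were found,
# so fall back to an empty list for easier iteration.
sitemaps = parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```

Sitemap records are independent of the user-agent groups, so they apply to every crawler regardless of which group matches it.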

Comments

For all robots
Block access to specific groups of pages
Allow search crawlers to discover the sitemap
Block CazoodleBot as it does not present correct accept content headers
Block MJ12bot as it is just noise
Block dotbot as it cannot parse base urls properly
Block Gigabot
Block SemrushBot
Block Bingbot
Block PetalBot
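
The groups listed in this report can be exercised with Python's standard `urllib.robotparser`. The sketch below is a reconstruction, not the file as served: the `ROBOTS_TXT` text is rebuilt from the group tables, so its casing, ordering, and omission of comment lines are assumptions, and `build_parser` and `allowed` are hypothetical helper names. One point it makes visible: an empty `Disallow` value matches nothing, so the `bingbot` group as scanned permits every path, even though the file's comment reads "Block Bingbot".

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's group tables. The original file's exact
# casing, ordering, and comment lines are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /my-account

User-agent: cazoodlebot
Disallow: /

User-agent: mj12bot
Disallow: /

User-agent: dotbot/1.0
Disallow: /

User-agent: gigabot
Disallow: /

User-agent: semrushbot
Disallow: /

User-agent: bingbot
Disallow:

User-agent: petalbot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: claudebot
Disallow: /
"""

BASE = "https://www.crlaurence.ca"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "ExampleBot" is a made-up agent that falls through to the "*" group.
    checks = [
        ("ExampleBot", "/"),
        ("ExampleBot", "/cart"),
        ("bingbot", "/cart"),
        ("claudebot", "/"),
        ("anthropic-ai", "/"),
    ]
    for agent, path in checks:
        verdict = "allowed" if allowed(agent, path) else "blocked"
        print(f"{agent:14} {path:10} {verdict}")
```

Other parsers may resolve group matching differently from Python's implementation, particularly for a versioned token such as `dotbot/1.0`, so results for that group should be checked against the crawler in question.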
