scrubscanada.ca
robots.txt

Robots Exclusion Standard data for scrubscanada.ca

Resource Scan

Scan Details

Site Domain	scrubscanada.ca
Base Domain	scrubscanada.ca
Scan Status	Ok
Last Scan	2026-03-11T20:40:40+00:00
Next Scan	2026-04-10T20:40:40+00:00

Last Scan

Scanned	2026-03-11T20:40:40+00:00
URL	https://scrubscanada.ca/robots.txt
Domain IPs	104.21.52.102, 172.67.198.18, 2606:4700:3035::6815:3466, 2606:4700:3037::ac43:c612
Response IP	172.67.198.18
Found	Yes
Hash	c593ed63b19dbbb5313bd1a8cae302275df3e98aac8d8489e9b72a751f3bab18
SimHash	c23d2111c515
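
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The scan does not name the algorithm, so the following is only a sketch under that assumption. `body_digest` is a hypothetical helper name, and the sample bytes are illustrative, not the site's actual robots.txt.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative input only; the real file content is not reproduced here.
sample = b"User-agent: *\nAllow: /\n"
digest = body_digest(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing digests between two scans is a cheap way to tell whether the file changed at all; a similarity fingerprint such as the SimHash field would be needed to judge how much it changed.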

Groups

*

Rule	Path
Allow	/

amazonbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

meta-externalagent

Rule	Path
Disallow	/

petalbot disallow: /
baiduspider

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

meta-externalagent

Rule	Path
Disallow	/

semrushbot-ba

Rule	Path
Disallow	/

semrushbot-si

Rule	Path
Disallow	/

semrushbot-swa

Rule	Path
Disallow	/

semrushbot-ct

Rule	Path
Disallow	/

semrushbot-bm

Rule	Path
Disallow	/

splitsignalbot

Rule	Path
Disallow	/

semrushbot-coub

Rule	Path
Disallow	/

seekportbot

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

seekport crawler

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

yandex

Rule	Path
Disallow	/

baiduspider-video disallow: /
ahrefssiteaudit disallow: /
baiduspider-image

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

facebookexternalhit

Rule	Path
Disallow	/
Disallow	/*orderby%3D
Disallow	/*orderway%3D
Disallow	/*tag%3D
Disallow	/*id_currency%3D
Disallow	/*search_query%3D
Disallow	/*back%3D
Disallow	/*n%3D
Disallow	/*controller%3Daddresses
Disallow	/*controller%3Daddress
Disallow	/*controller%3Dauthentication
Disallow	/*controller%3Dcart
Disallow	/*controller%3Ddiscount
Disallow	/*controller%3Dfooter
Disallow	/*controller%3Dget-file
Disallow	/*controller%3Dheader
Disallow	/*controller%3Dhistory
Disallow	/*controller%3Didentity
Disallow	/*controller%3Dimages.inc
Disallow	/*controller%3Dinit
Disallow	/*controller%3Dmy-account
Disallow	/*controller%3Dorder
Disallow	/*controller%3Dorder-opc
Disallow	/*controller%3Dorder-slip
Disallow	/*controller%3Dorder-detail
Disallow	/*controller%3Dorder-follow
Disallow	/*controller%3Dorder-return
Disallow	/*controller%3Dorder-confirmation
Disallow	/*controller%3Dpagination
Disallow	/*controller%3Dpassword
Disallow	/*controller%3Dpdf-invoice
Disallow	/*controller%3Dpdf-order-return
Disallow	/*controller%3Dpdf-order-slip
Disallow	/*controller%3Dproduct-sort
Disallow	/*controller%3Dsearch
Disallow	/*controller%3Dstatistics
Disallow	/*controller%3Dattachment
Disallow	/*controller%3Dguest-tracking
Disallow	*/classes/
Disallow	*/config/
Disallow	*/download/
Disallow	*/mails/
Disallow	*/modules/
Disallow	*/translations/
Disallow	*/tools/
Disallow	*/testmenu/
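
Many of the Disallow paths above rely on `*` as a wildcard. As a rough illustration of how such patterns are commonly evaluated (in RFC 9309, `*` matches any run of characters and a trailing `$` anchors the end of the path), here is a minimal sketch. `rule_to_regex` and `rule_matches` are hypothetical helper names, the example URLs are invented, and the comparison is purely literal: no percent-decoding is attempted, so `%3D` is not treated as `=`.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the match at the end.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the pattern matches from the start of the path."""
    return rule_to_regex(pattern).match(path) is not None


# Hypothetical request paths, checked against patterns from the table.
print(rule_matches("/*controller%3Dcart", "/index.php?controller%3Dcart"))  # True
print(rule_matches("/*controller%3Dcart", "/index.php?controller=cart"))    # False
print(rule_matches("*/modules/", "/shop/modules/banner.js"))                # True
print(rule_matches("/*orderby%3D", "/en/3-women"))                          # False
```

The second call shows why the literal comparison matters: under this simplification a pattern written with `%3D` does not cover a URL written with a plain `=`. Whether a given crawler normalizes the two is outside what this sketch models.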

Comments

As a condition of accessing this website, you agree to abide by the following
content signals:
(a) If a Content-Signal = yes, you may collect content for the corresponding
use.
(b) If a Content-Signal = no, you may not collect content for the
corresponding use.
(c) If the website operator does not include a Content-Signal for a
corresponding use, the website operator neither grants nor restricts
permission via Content-Signal with respect to the corresponding use.
The content signals and their meanings are:
search: building a search index and providing search results (e.g., returning
hyperlinks and short excerpts from your website's contents). Search does not
include providing AI-generated search summaries.
ai-input: inputting content into one or more AI models (e.g., retrieval
augmented generation, grounding, or other real-time taking of content for
generative AI search answers).
ai-train: training or fine-tuning AI models.
ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
BEGIN Cloudflare Managed content
END Cloudflare Managed Content
robots.txt automatically generated by PrestaShop e-commerce open-source solution
http://www.prestashop.com - http://www.prestashop.com/forums
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
For more information about the robots.txt standard, see:
http://www.robotstxt.org/wc/robots.html
Private pages
Directories

Warnings

`content-signal` is not a known field.
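
The warning above means the scanner did not recognize the `Content-Signal` line as a standard robots.txt field; the comments describe what its values are meant to express. The scan does not show the value this site actually publishes, so the sketch below uses an invented sample (`search=yes, ai-train=no`) and assumes a comma-separated list of `name=yes|no` pairs. `parse_content_signal` and `may_collect` are hypothetical helper names.

```python
from typing import Optional


def parse_content_signal(value: str) -> dict[str, bool]:
    """Parse a Content-Signal value into {signal name: allowed}.

    Assumed format: comma-separated 'name=yes' or 'name=no' pairs.
    Entries with any other value are ignored.
    """
    signals: dict[str, bool] = {}
    for item in value.split(","):
        name, sep, setting = item.partition("=")
        name, setting = name.strip().lower(), setting.strip().lower()
        if sep and name and setting in ("yes", "no"):
            signals[name] = setting == "yes"
    return signals


def may_collect(signals: dict[str, bool], use: str) -> Optional[bool]:
    """Return True or False for a stated signal, or None when absent.

    None mirrors clause (c) of the comments: permission is neither
    granted nor restricted for that use.
    """
    return signals.get(use)


# Invented sample value; not taken from the scanned file.
signals = parse_content_signal("search=yes, ai-train=no")
print(may_collect(signals, "search"))    # True
print(may_collect(signals, "ai-train"))  # False
print(may_collect(signals, "ai-input"))  # None
```

Returning `None` for a missing signal keeps the three cases from the comments distinct instead of collapsing "not stated" into "no".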
