lihlll.ca
robots.txt

Robots Exclusion Standard data for lihlll.ca

Resource Scan

Scan Details

Site Domain	lihlll.ca
Base Domain	lihlll.ca
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't connect to server.
Last Scan	2024-11-15T09:55:41+00:00
Next Scan	2025-02-13T09:55:41+00:00
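
The gap between the Last Scan and Next Scan timestamps above can be checked with a short calculation. This is only a sketch: the variable names are illustrative, and the assumption that the scanner reschedules on a fixed interval is not confirmed by the report.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2024-11-15T09:55:41+00:00")
next_scan = datetime.fromisoformat("2025-02-13T09:55:41+00:00")

# Difference between the scheduled next scan and the last attempt.
interval = next_scan - last_scan
print(interval.days)  # 90
```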

Last Successful Scan

Scanned	2023-01-22T21:52:59+00:00
URL	https://lihlll.ca/robots.txt
Redirect	http://www.lihlll.ca/robots.txt
Redirect Domain	www.lihlll.ca
Redirect Base	lihlll.ca
Domain IPs	3.98.105.191
Redirect IPs	3.97.1.68, 3.98.81.84, 99.79.174.176
Response IP	3.97.1.68
Found	Yes
Hash	44c5867266bade2fa1e0e8a8078092b5e0bf7f459c6824daa484fc8e1ef08583
SimHash	ac1e1da87489

Groups

*

Rule	Path
Disallow	/*%7B%7B
Disallow	/*%7B%7B
Disallow	/*?SID=
Disallow	/*?no_cache=
Disallow	/*?nocache=
Disallow	/tmp/
Disallow	/vDev/
Disallow	/vPreprod/
Disallow	/vDemo/
Disallow	/vBBQC/
Disallow	/webmailAPIs/
Disallow	/ctr/
Disallow	/sponsors/
Disallow	/adpics/
Disallow	/vProd/iframeSession.php
Disallow	/v5/
Disallow	/v5dev/
Disallow	/chrysophylax/
Disallow	/ressources/files/
Disallow	/fr/ms/reseaupublicationsports/
Disallow	/en/ms/reseaupublicationsports/
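
To show how wildcard paths such as `/*?SID=` in the table above might be evaluated, here is a minimal sketch. It assumes the common convention that `*` matches any run of characters, that a trailing `$` anchors the end, and that patterns match from the start of the path. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and only a subset of the listed rules is included. Because this group lists no Allow rules, longest-match precedence is not modelled.

```python
import re

# A subset of the Disallow paths listed for the "*" group above.
DISALLOW = [
    "/*%7B%7B",
    "/*?SID=",
    "/*?no_cache=",
    "/tmp/",
    "/vProd/iframeSession.php",
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, patterns=DISALLOW):
    """Return True if any pattern matches from the start of `path`."""
    return any(pattern_to_regex(p).match(path) for p in patterns)

print(is_disallowed("/tmp/cache.html"))      # True
print(is_disallowed("/en/page?SID=abc"))     # True
print(is_disallowed("/en/news/"))            # False
```

Note that under this sketch `/page?foo=1&SID=abc` is not blocked, since the pattern requires the literal sequence `?SID=`.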

petalbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

seekport

Rule	Path
Disallow	/
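
The per-bot groups above fully block five named crawlers, while every other crawler falls under the `*` group. A rough sketch of that selection follows. It assumes case-insensitive matching on the crawler's product token with a fallback to `*`; the `GROUPS` dictionary and the `select_group` function are illustrative names, and the `*` entry is abbreviated to a few of the listed paths.

```python
# Groups as listed in this report; the "*" rule list is abbreviated.
GROUPS = {
    "*": ["/*%7B%7B", "/*?SID=", "/tmp/"],
    "petalbot": ["/"],
    "ahrefsbot": ["/"],
    "mj12bot": ["/"],
    "semrushbot": ["/"],
    "seekport": ["/"],
}

def select_group(product_token, groups=GROUPS):
    """Return the Disallow list that applies to a crawler.

    The token is compared case-insensitively; crawlers without a
    dedicated group fall back to the "*" group.
    """
    return groups.get(product_token.lower(), groups["*"])

print(select_group("AhrefsBot"))  # ['/']
print(select_group("Googlebot"))  # ['/*%7B%7B', '/*?SID=', '/tmp/']
```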

Comments

Do not crawl javascript links with {{token}}
Do not crawl links with ?no_cache
Disallow Bad bots
