hockeycollegial.com
robots.txt

Robots Exclusion Standard data for hockeycollegial.com

Resource Scan

Scan Details

Site Domain hockeycollegial.com
Base Domain hockeycollegial.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-07-07T19:19:42+00:00
Next Scan 2024-10-05T19:19:42+00:00

Last Successful Scan

Scanned 2022-05-13T05:15:45+00:00
URL http://hockeycollegial.com/robots.txt
Redirect http://www.hockeycollegial.com/robots.txt
Redirect Domain www.hockeycollegial.com
Redirect Base hockeycollegial.com
Response IP 3.97.1.68
Found Yes
Hash 44c5867266bade2fa1e0e8a8078092b5e0bf7f459c6824daa484fc8e1ef08583
SimHash ac1e1da87489

Groups

*

Rule Path
Disallow /*%7B%7B
Disallow /*%7B%7B
Disallow /*?SID=
Disallow /*?no_cache=
Disallow /*?nocache=
Disallow /tmp/
Disallow /vDev/
Disallow /vPreprod/
Disallow /vDemo/
Disallow /vBBQC/
Disallow /webmailAPIs/
Disallow /ctr/
Disallow /sponsors/
Disallow /adpics/
Disallow /vProd/iframeSession.php
Disallow /v5/
Disallow /v5dev/
Disallow /chrysophylax/
Disallow /ressources/files/
Disallow /fr/ms/reseaupublicationsports/
Disallow /en/ms/reseaupublicationsports/

petalbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seekport

Rule Path
Disallow /
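How a crawler picks between the groups above can be sketched with Python's standard urllib.robotparser. The robots.txt text below is a partial reconstruction from the listed groups, not the original file, and "ExampleBot" is a hypothetical crawler name. Only plain prefix rules are included, because this parser compares paths by prefix and does not interpret the * wildcard.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of two groups from the scan (an assumption,
# not the original file contents).
ROBOTS_LINES = """\
User-agent: *
Disallow: /tmp/
Disallow: /sponsors/

User-agent: petalbot
Disallow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on the scanned host."""
    return parser.can_fetch(agent, "http://www.hockeycollegial.com" + path)


if __name__ == "__main__":
    for agent, path in [
        ("PetalBot", "/en/"),        # named group: everything is disallowed
        ("ExampleBot", "/en/"),      # falls back to the * group: allowed
        ("ExampleBot", "/tmp/x"),    # * group: /tmp/ prefix is disallowed
    ]:
        print(agent, path, allowed(agent, path))
```

A crawler with its own group, such as petalbot, is governed by that group alone; every other crawler falls back to the * group.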

Comments

  • Do not crawl javascript links with {{token}}
  • Do not crawl links with ?no_cache
  • Disallow Bad bots
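
The wildcard rules that these comments describe (/*%7B%7B, /*?no_cache=, /*?SID=) can be illustrated with a small matcher. This is a minimal sketch, assuming the common convention that * matches any run of characters and a trailing $ anchors the end of the URL; the function name rule_matches is hypothetical, and percent-encoding normalization is deliberately left out.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path (plus query).

    Assumes '*' matches any sequence of characters and a trailing '$'
    anchors the end; everything else is compared literally from the start.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each "*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start of the path, like a prefix rule.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    print(rule_matches("/*%7B%7B", "/js/%7B%7Btoken%7D%7D"))
    print(rule_matches("/*?SID=", "/en/page?SID=abc"))
    print(rule_matches("/*?no_cache=", "/en/page?lang=fr&no_cache=1"))
    print(rule_matches("/tmp/", "/tmp/cache.html"))
```

Under these assumptions, /*?no_cache= only matches when no_cache is the first query parameter, because the pattern requires the literal "?" immediately before it.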