lechorepublicain.fr
robots.txt

Robots Exclusion Standard data for lechorepublicain.fr

Resource Scan

Scan Details

Site Domain lechorepublicain.fr
Base Domain lechorepublicain.fr
Scan Status Ok
Last Scan 2024-05-29T22:00:58+00:00
Next Scan 2024-06-05T22:00:58+00:00

Last Scan

Scanned 2024-05-29T22:00:58+00:00
URL https://lechorepublicain.fr/robots.txt
Redirect https://www.lechorepublicain.fr/robots.txt
Redirect Domain www.lechorepublicain.fr
Redirect Base lechorepublicain.fr
Domain IPs 104.18.20.32
Redirect IPs 104.18.20.32, 104.18.21.32, 2606:4700::6812:1420, 2606:4700::6812:1520
Response IP 104.18.21.32
Found Yes
Hash 2ac4b5aba5185f0fc3d4f60369e05ae1a93b41bf6262328e01b5d3289d66b7c0
SimHash 191f15a661f2

Groups

*

Rule Path
Disallow /captcha.png
Disallow /json-rpc
Disallow /ajax
Disallow /idalgo
Disallow /place-publique
Disallow /*.php
Disallow /wp-
Disallow /recherche
Disallow /archivev2
Disallow /widgetRss
Disallow /*?widgetRss
Disallow /*GCF_
Disallow /region/
Disallow /loisirs/agenda/image
Disallow /abonnement/integrale-mt-1690
Disallow /abonnement/integrale-pc-1690
Disallow /abonnement/integrale-rc-1690
Disallow /abonnement/integrale-yr-1690
Disallow /abonnement/integrale-er-1690
Disallow /abonnement/integrale-jc-1690
Disallow /abonnement/integrale-ev-1690
Disallow /abonnement/integrale-br-1690
Disallow /abonnement/integrale-annuel-mt-1690
Disallow /abonnement/integrale-annuel-pc-1690
Disallow /abonnement/integrale-annuel-rc-1690
Disallow /abonnement/integrale-annuel-yr-1690
Disallow /abonnement/integrale-annuel-er-1690
Disallow /abonnement/integrale-annuel-jc-1690
Disallow /abonnement/integrale-annuel-ev-1690
Disallow /abonnement/integrale-annuel-br-1690
Disallow /front/

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /
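
As a rough illustration of how a crawler might evaluate the groups listed above, here is a minimal sketch in Python. It is not the site's or any crawler's actual implementation. The names GROUPS, pattern_to_regex, and is_allowed are hypothetical, only a subset of the Disallow paths is copied in, and the test paths such as /actualites/ are made up for the example. It assumes the common convention that `*` matches any run of characters, that a trailing `$` anchors the end of the path, and that a crawler with its own named group ignores the `*` group.

```python
import re

# Subset of the Disallow paths from the scan above, keyed by user-agent group.
# (Hypothetical structure; the real file is plain robots.txt text.)
GROUPS = {
    "*": [
        "/captcha.png", "/ajax", "/*.php", "/wp-", "/recherche",
        "/*?widgetRss", "/region/", "/front/",
    ],
    "gptbot": ["/"],
    "ccbot": ["/"],
    "chatgpt-user": ["/"],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path;
    everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if no Disallow rule in the applicable group matches path.

    Assumption: an exact (case-insensitive) group name wins over '*'.
    The scanned file has no Allow rules, so none are handled here.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    return not any(pattern_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/recherche?q=chartres"),
        ("Googlebot", "/actualites/"),
        ("GPTBot", "/actualites/"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, a generic crawler is blocked from /recherche and from any path containing .php, while gptbot, ccbot, and chatgpt-user are blocked from the whole site because their groups disallow /.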