lanouvellerepublique.fr
robots.txt

Robots Exclusion Standard data for lanouvellerepublique.fr

Resource Scan

Scan Details

Site Domain lanouvellerepublique.fr
Base Domain lanouvellerepublique.fr
Scan Status Ok
Last Scan 2024-10-26T19:39:15+00:00
Next Scan 2024-11-25T19:39:15+00:00

Last Scan

Scanned 2024-10-26T19:39:15+00:00
URL https://lanouvellerepublique.fr/robots.txt
Redirect https://www.lanouvellerepublique.fr:443/robots.txt
Redirect Domain www.lanouvellerepublique.fr
Redirect Base lanouvellerepublique.fr
Domain IPs 52.50.95.211
Redirect IPs 2600:9000:2181:200:1d:d466:e0c0:93a1, 2600:9000:2181:2200:1d:d466:e0c0:93a1, 2600:9000:2181:4000:1d:d466:e0c0:93a1, 2600:9000:2181:8600:1d:d466:e0c0:93a1, 2600:9000:2181:b200:1d:d466:e0c0:93a1, 2600:9000:2181:c000:1d:d466:e0c0:93a1, 2600:9000:2181:d600:1d:d466:e0c0:93a1, 2600:9000:2181:fc00:1d:d466:e0c0:93a1, 65.9.112.118, 65.9.112.48, 65.9.112.85, 65.9.112.91
Response IP 52.85.49.86
Found Yes
Hash c91e032bf1ce69342234d246cdcfa61359535e2c964ea8a8b858d03675245cac
SimHash a8529503c5f5

Groups

duckduckbot
mediapartners-google
googlebot
googlebot-image
googlebot-mobile
googleproducer
googlebot-video
adsbot-google
googlebot_nauxeo
qwantify
qwant-news
voilabot
msnbot
slurp
bingbot
twitterbot
facebookexternalhit
applebot
bingbot
facebot
grapeshot
flipboard
flipboardproxy
weborama-fetcher
feedfetcher-google

Rule Path
Disallow /recherche
Disallow /backoffice
Disallow /mon-compte
Disallow /kiosque
Disallow /autour-de-moi
Disallow /contributeur
Disallow /annonces
Disallow /fr/
Disallow /api/
Allow /annonces/avis-de-deces/
Allow /api/v1/showcase
Allow /api/v1/rss/5c5d4592a7f67291298b456a
Allow /api/v1/rss/5c5d46dfa32027d4478b4567
Allow /api/v1/rss/5c5d41ce08cd953b7e8b4574
Allow /api/v1/rss/5c5d429800655ad45a8b4571
Allow /api/v1/rss/5c5d4612a7f672692a8b4575
Allow /api/v1/rss/5c5d4679e91a9623078b458d
Allow /api/v1/rss/592bf255489a4555008b4568
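The group above mixes broad Disallow rules with narrower Allow rules under the same prefixes (for example /annonces and /annonces/avis-de-deces/). A minimal sketch of how such rules are commonly resolved follows, assuming the longest-match precedence described in RFC 9309, where the most specific matching path wins and Allow wins a tie. The function name `is_allowed` and the abbreviated `RULES` list are illustrative, not part of the scan data, and individual crawlers may resolve conflicts differently. The rules in this file contain no `*` or `$` wildcards, so plain prefix matching is used.

```python
# Abbreviated copy of the rules listed above; the names here are illustrative.
RULES = [
    ("disallow", "/recherche"),
    ("disallow", "/annonces"),
    ("disallow", "/api/"),
    ("allow", "/annonces/avis-de-deces/"),
    ("allow", "/api/v1/showcase"),
]

def is_allowed(path, rules):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = (kind == "allow")
    return allowed

print(is_allowed("/annonces/avis-de-deces/2024", RULES))
print(is_allowed("/annonces/immobilier", RULES))
```

Under these assumptions the death-notice pages stay crawlable while the rest of /annonces is blocked, because the Allow path is longer than the Disallow path that also matches.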

googlebot-news

Rule Path
Disallow /recherche
Disallow /backoffice
Disallow /mon-compte
Disallow /kiosque
Disallow /autour-de-moi
Disallow /contributeur
Disallow /annonces
Disallow /fr/
Disallow /api/
Disallow /annonces/avis-de-deces/
Allow /api/v1/showcase
Allow /api/v1/rss/5c5d4592a7f67291298b456a
Allow /api/v1/rss/5c5d46dfa32027d4478b4567
Allow /api/v1/rss/5c5d41ce08cd953b7e8b4574
Allow /api/v1/rss/5c5d429800655ad45a8b4571
Allow /api/v1/rss/5c5d4612a7f672692a8b4575
Allow /api/v1/rss/5c5d4679e91a9623078b458d

*

Rule Path
Disallow /
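The `*` group applies to any crawler not named in one of the earlier groups, so unlisted crawlers are asked to stay away from the whole site. The sketch below illustrates that fallback, assuming a simple case-insensitive exact match on the crawler's product token. The `GROUPS` mapping is a hypothetical, heavily abbreviated stand-in for the groups in this report, and `select_group` is an illustrative name.

```python
# Hypothetical, abbreviated stand-in for the three groups in this report.
GROUPS = {
    "googlebot": [
        ("disallow", "/recherche"),
        ("allow", "/annonces/avis-de-deces/"),
    ],
    "googlebot-news": [
        ("disallow", "/recherche"),
        ("disallow", "/annonces/avis-de-deces/"),
    ],
    "*": [("disallow", "/")],
}

def select_group(product_token, groups):
    """Return the rules for a crawler's product token, falling back to '*'."""
    key = product_token.lower()  # user-agent matching is case-insensitive
    if key in groups:
        return groups[key]
    # With no '*' group at all, an unlisted crawler would have no restrictions.
    return groups.get("*", [])

print(select_group("Googlebot-News", GROUPS))
print(select_group("ExampleBot", GROUPS))
```

Here "ExampleBot" is a made-up crawler name; because it has no group of its own, it receives the single rule `Disallow /`.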

Other Records

Field Value
sitemap https://www.lanouvellerepublique.fr/sitemap.xml

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts of your site by web crawlers and spiders run by sites like Yahoo! and Google. By telling these "robots" where not to go on your site, you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see: http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see: http://www.sxw.org.uk/computing/robots/check.html
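The comments note that robots.txt is only honored at the root of a host. As a small illustration, the sketch below derives that root location from any page URL using Python's standard `urllib.parse` module. The helper name `robots_url` is an assumption made for this example.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Build the root-level robots.txt URL for the host serving `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host (with any port); drop the path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://example.com/site/page.html"))
```

A file placed at http://example.com/site/robots.txt would never be consulted, since the lookup always resolves to the host root.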