france-emploi.com
robots.txt

Robots Exclusion Standard data for france-emploi.com

Resource Scan

Scan Details

Site Domain france-emploi.com
Base Domain france-emploi.com
Scan Status Ok
Last Scan 2024-11-14T21:12:37+00:00
Next Scan 2024-11-21T21:12:37+00:00

Last Scan

Scanned 2024-11-14T21:12:37+00:00
URL https://france-emploi.com/robots.txt
Redirect https://emploi.ouest-france.fr/robots.txt
Redirect Domain emploi.ouest-france.fr
Redirect Base ouest-france.fr
Domain IPs 217.70.184.55
Redirect IPs 118.215.80.113, 2600:1413:b000:380::30db, 2600:1413:b000:386::30db
Response IP 118.215.80.113
Found Yes
Hash 4e2bab0078e35f3681d3d56f49d69f62337c2c1c44447fd15d038d37cd432224
SimHash 641c158167f1

Groups

adsbot-google
adsbot-google-mobile
applebot
bingbot
duckduckbot
facebookexternalhit
facebot
googlebot
googlebot-image
googlebot-mobile
googlebot-news
storebot-google
apis-google
linkedinbot
mediapartners-google
oncrawl
orangebot
orangebot-collector
pinterest
qwantify
searchmetricsbot
semrushbot
slurp
taboolabot
twitterbot

Rule Path
Disallow /*sort%3D*
Disallow /*q%3D*
Disallow /*?recherche=
Disallow /*?entreprise=
Disallow /*?localisation=
Disallow /*?secteur=
Disallow /*?partners=
Disallow /*%2B*
Disallow *?preview=true
Disallow /*at_*
Disallow /*aggregate
Disallow */candidat/
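
The Disallow patterns above rely on the `*` wildcard, so it can help to see which request paths they actually cover. The following is a minimal sketch, assuming RFC 9309 style matching (`*` matches any run of characters, a trailing `$` anchors the end, and a pattern otherwise matches as a prefix of the path plus query string). The helper names `pattern_to_regex` and `is_disallowed` and the sample paths are hypothetical, and the sketch does not normalize percent-encoding the way a full parser would.

```python
import re

# Disallow patterns copied from the rule table above.
DISALLOW = [
    "/*sort%3D*",
    "/*q%3D*",
    "/*?recherche=",
    "/*?entreprise=",
    "/*?localisation=",
    "/*?secteur=",
    "/*?partners=",
    "/*%2B*",
    "*?preview=true",
    "/*at_*",
    "/*aggregate",
    "*/candidat/",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' is the only wildcard, a trailing '$' anchors the end,
    and every other character is literal.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape the literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)


# Hypothetical request paths, for illustration only.
print(is_disallowed("/offres?recherche=cuisinier"))  # True
print(is_disallowed("/candidat/profil"))             # True
print(is_disallowed("/format_cv"))                   # True, via /*at_*
print(is_disallowed("/offres/cuisinier"))            # False
```

The `/format_cv` case shows how broad `/*at_*` is under this reading: it covers any path that contains `at_` anywhere, not only tracking parameters that begin with `at_`.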

*

Rule Path
Allow ads.txt
Disallow /
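
For the `*` group, the outcome depends on how Allow and Disallow interact. The sketch below assumes the RFC 9309 precedence rule (the longest matching pattern wins, and Allow wins a tie) with strict prefix matching and no wildcards. The function name `is_allowed` is hypothetical. Note that the Allow value in the table has no leading slash; how a given crawler treats such a pattern is parser-specific, so the strict reading used here is an assumption, not a statement about any real crawler.

```python
# Rules of the '*' group, copied from the table above.
RULES = [("allow", "ads.txt"), ("disallow", "/")]


def is_allowed(path, rules=RULES):
    """Apply longest-match precedence to a request path.

    Assumption: patterns are plain prefixes (no wildcards). If no rule
    matches, the path is allowed by default.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if path.startswith(pattern):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow
            if longer or tie_for_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow


# Every path starts with "/", so "Disallow /" always matches.
print(is_allowed("/offres"))   # False

# Under strict prefix matching, "ads.txt" never matches "/ads.txt".
print(is_allowed("/ads.txt"))  # False

# With a leading slash on the Allow pattern, the longer rule wins.
print(is_allowed("/ads.txt", [("allow", "/ads.txt"), ("disallow", "/")]))  # True
```

Under these assumptions, an unlisted crawler is blocked from the whole site, and the `ads.txt` exception only takes effect for parsers that tolerate the missing leading slash.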

Other Records

Field Value
sitemap https://emploi.ouest-france.fr/sitemap/sitemap.xml

Comments

  • It is forbidden to use web crawlers or other automatic methods to browse this website
  • Crawling/Scraping is only allowed with special permission from Ouest France Multimedia
  • Likewise, it is forbidden to crawl our website using a spoofed user agent that does not match your identity
  • Cf. the Godfrain Law (No. 88-19 of 5 January 1988), article 462-3, and Art. 323-2 of the French Penal Code, on the criminal consequences of impairing the operation of a computer system
  • You can find robots.txt RFC at http://www.robotstxt.org/norobots-rfc.txt
  • Allowed search engines directives
  • Sitemaps