ouestfrance-emploi.com
robots.txt

Robots Exclusion Standard data for ouestfrance-emploi.com

Resource Scan

Scan Details

Site Domain ouestfrance-emploi.com
Base Domain ouestfrance-emploi.com
Scan Status Ok
Last Scan 2024-06-23T16:07:19+00:00
Next Scan 2024-06-30T16:07:19+00:00

Last Scan

Scanned 2024-06-23T16:07:19+00:00
URL https://www.ouestfrance-emploi.com/robots.txt
Redirect https://emploi.ouest-france.fr/robots.txt
Redirect Domain emploi.ouest-france.fr
Redirect Base ouest-france.fr
Domain IPs 23.52.171.226, 23.52.171.235, 2600:1417:3f::b81c:eb52, 2600:1417:3f::b81c:eb61
Redirect IPs 104.69.163.117, 2600:1417:3f:7aa::30db, 2600:1417:3f:7ab::30db
Response IP 23.41.79.18
Found Yes
Hash fd3d2b6da358bdf56d8a6ef785384cfd645be62f177f5ada2b74cb4671ba6add
SimHash 641c158127f1

Groups

adsbot-google
adsbot-google-mobile
applebot
bingbot
duckduckbot
facebookexternalhit
facebot
googlebot
googlebot-image
googlebot-mobile
googlebot-news
storebot-google
apis-google
linkedinbot
mediapartners-google
oncrawl
orangebot
orangebot-collector
pinterest
qwantify
searchmetricsbot
semrushbot
slurp
taboolabot
twitterbot
twitterbot
oncrawl
facebot
twitterbot
facebookexternalhit
pinterest
linkedinbot
semrushbot

Rule Path
Disallow /*sort%3D*
Disallow /*q%3D*
Disallow /*?recherche=
Disallow /*?entreprise=
Disallow /*?localisation=
Disallow /*?secteur=
Disallow /*%2B*
Disallow *?preview=true
Disallow /*at_*
Disallow /*aggregate
Disallow */candidat/
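
To illustrate how the wildcard patterns above could be evaluated, here is a minimal Python sketch. It assumes the common convention (as described in RFC 9309) that `*` matches any run of characters, a trailing `$` anchors the end, and each pattern is otherwise a prefix match against the path plus query string. The helper names and the sample paths are hypothetical, and the sketch compares raw strings without normalising percent-encoding, which a real crawler should do.

```python
import re

# Disallow patterns copied from the rule table above.
DISALLOW = [
    "/*sort%3D*", "/*q%3D*", "/*?recherche=", "/*?entreprise=",
    "/*?localisation=", "/*?secteur=", "/*%2B*", "*?preview=true",
    "/*at_*", "/*aggregate", "*/candidat/",
]

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where the pattern had '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str) -> bool:
    """Return True if any pattern matches from the start of path (prefix semantics)."""
    return any(rule_to_regex(p).match(path) for p in DISALLOW)

# Hypothetical paths, for illustration only.
print(is_disallowed("/offres?recherche=cuisinier"))
print(is_disallowed("/candidat/profil"))
print(is_disallowed("/offres/nantes"))
```

Note that a broad pattern such as `/*at_*` blocks any path containing `at_`, not only tracking parameters, so the matcher may flag more URLs than one might expect.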

*

Rule Path
Disallow /
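
The two groups above imply a simple selection step: a crawler named in the first group follows the pattern list, and every other crawler falls back to the `*` group, which disallows the whole site. The Python sketch below shows that step under simplifying assumptions: it uses an exact, case-insensitive match on the crawler's product token, lists only a few of the named agents, and abbreviates the rule list. The function and variable names are hypothetical.

```python
# A few of the user agents named in the first group (abbreviated for the example).
NAMED_AGENTS = {"googlebot", "bingbot", "applebot", "duckduckbot", "qwantify"}

# Rules for the named group (abbreviated) and for the catch-all '*' group.
NAMED_RULES = ["/*?recherche=", "/*?entreprise=", "*/candidat/"]
DEFAULT_RULES = ["/"]

def select_rules(product_token: str) -> list[str]:
    """Pick the Disallow list that applies to a crawler's product token."""
    if product_token.lower() in NAMED_AGENTS:
        return NAMED_RULES
    # No named group matches, so the '*' group applies: everything is disallowed.
    return DEFAULT_RULES

print(select_rules("Googlebot"))
print(select_rules("MyCrawler"))
```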

Other Records

Field Value
sitemap https://emploi.ouest-france.fr/sitemap/sitemap.xml

Comments

  • It is forbidden to use web crawlers or other automatic methods to browse this website
  • Crawling/Scraping is only allowed with special permission from Ouest France Multimedia
  • Likewise, crawling this website with a stolen user agent that does not match your identity is forbidden
  • Cf. the Godfrain law (No. 88-19 of 5 January 1988), article 462-3, and Art. 323-2 of the French Penal Code, concerning the criminal consequences of impairing the operation of a computer system
  • You can find robots.txt RFC at http://www.robotstxt.org/norobots-rfc.txt
  • Allowed search engines directives
  • Sitemaps