mediapart.fr
robots.txt

Robots Exclusion Standard data for mediapart.fr

Resource Scan

Scan Details

Site Domain mediapart.fr
Base Domain mediapart.fr
Scan Status Ok
Last Scan 2024-11-06T18:16:50+00:00
Next Scan 2024-11-20T18:16:50+00:00

Last Scan

Scanned 2024-11-06T18:16:50+00:00
URL https://mediapart.fr/robots.txt
Redirect https://www.mediapart.fr/robots.txt
Redirect Domain www.mediapart.fr
Redirect Base mediapart.fr
Domain IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Redirect IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Response IP 199.232.46.132
Found Yes
Hash b51bf1e007f63f74a5ce48f029de8eac8cf5b8cca74f19b7e87be9a4c1c7ff68
SimHash 5a648cfd6f36

Groups

magpie-crawler

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

*

Rule Path
Allow /
Allow /abo/offres
Disallow /abo/*
Disallow /offrir_article/
Disallow /article/offert/
Disallow /ajax/
Disallow /comment/
Disallow /tools/
Disallow /bookmark/
Disallow /petition.php
Disallow /search?search_word=*
Disallow /journal/mot-cle/
Disallow /*/commentaires$
Disallow /journal/fil-dactualites/*08/*
Disallow /journal/fil-dactualites/*09/*
Disallow /journal/fil-dactualites/*10/*
Disallow /journal/fil-dactualites/*11/*
Disallow /journal/fil-dactualites/*12/*
Disallow /journal/fil-dactualites/*13/*
Disallow /journal/fil-dactualites/*14/*
Disallow /journal/fil-dactualites/*15/*
Disallow /journal/fil-dactualites/*16/*
Disallow /journal/fil-dactualites/*17/*
Disallow /journal/fil-dactualites/*18/*
Disallow /journal/fil-dactualites/*19/*
Disallow /journal/fil-dactualites/*20/*
Disallow /journal/fil-dactualites/*21/*
Disallow /journal/fil-dactualites/*22/*
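
The rules in the `*` group mix plain prefixes with `*` wildcards and a `$` end anchor, so a short sketch of how such rules are commonly evaluated may help. It assumes the longest-match convention described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie); individual crawlers may behave differently. The helper names `pattern_to_regex` and `is_allowed`, the path `/abo/paiement`, and the other sample paths are hypothetical, and `RULES` is only a subset of the group listed above.

```python
import re


def pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally as a prefix.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (directive, pattern) pairs. The longest
    matching pattern wins; on equal length, Allow beats Disallow.
    A path matched by no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# Subset of the '*' group shown above.
RULES = [
    ("Allow", "/"),
    ("Allow", "/abo/offres"),
    ("Disallow", "/abo/*"),
    ("Disallow", "/ajax/"),
    ("Disallow", "/search?search_word=*"),
    ("Disallow", "/*/commentaires$"),
]

if __name__ == "__main__":
    for sample in (
        "/abo/offres",                    # longer Allow beats /abo/*
        "/abo/paiement",                  # hypothetical path under /abo/
        "/journal/france/commentaires",   # caught by the '$' rule
        "/journal/france/commentaires?page=2",
    ):
        print(sample, is_allowed(RULES, sample))
```

Under this convention `Allow /abo/offres` overrides `Disallow /abo/*` because its pattern is longer, which is presumably why both rules appear together in the group.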

Other Records

Field Value
crawl-delay 3
sitemap https://www.mediapart.fr/sitemap_index.xml
sitemap https://www.mediapart.fr/news_sitemap_editor_choice.xml

Comments

  • www.robotstxt.org/
  • www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449