blogs.mediapart.fr
robots.txt

Robots Exclusion Standard data for blogs.mediapart.fr

Resource Scan

Scan Details

Site Domain blogs.mediapart.fr
Base Domain mediapart.fr
Scan Status Ok
Last Scan 2024-10-28T14:55:18+00:00
Next Scan 2024-11-27T14:55:18+00:00

Last Scan

Scanned 2024-10-28T14:55:18+00:00
URL https://blogs.mediapart.fr/robots.txt
Domain IPs 151.101.130.132, 151.101.194.132, 151.101.2.132, 151.101.66.132
Response IP 199.232.46.132
Found Yes
Hash 755400691e607350ade6f74129aae9c1a667059644c3ad3b84fe4558a2d67822
SimHash 522482fa0237

Groups

magpie-crawler

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

*

Rule Path
Allow /
Disallow /perso
Disallow /ajax/
Disallow /comment/
Disallow /tools/
Disallow /bookmark/
Disallow */mot-cle/
Disallow /*/commentaires$
Disallow /blog/respect-mag*
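
To make the effect of the wildcard rules in the `*` group concrete, here is a minimal Python sketch of how a crawler might evaluate a path against them. It assumes RFC 9309-style matching (`*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and are not part of the scan. Treating the leading `*` in `*/mot-cle/` as an ordinary wildcard is also an assumption, since parsers differ on patterns that do not start with `/`.

```python
import re

# Rules copied from the "*" group listed above, in file order.
RULES = [
    ("allow", "/"),
    ("disallow", "/perso"),
    ("disallow", "/ajax/"),
    ("disallow", "/comment/"),
    ("disallow", "/tools/"),
    ("disallow", "/bookmark/"),
    ("disallow", "*/mot-cle/"),
    ("disallow", "/*/commentaires$"),
    ("disallow", "/blog/respect-mag*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each "*" into ".*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    # Without "$" the pattern is a prefix match; with it, the path must end there.
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be fetched under the given rules."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow and not best_allow
            if longer or tie_for_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each kind of rule.
    for path in ["/", "/perso/page", "/someone/blog/commentaires",
                 "/someone/blog/commentaires/2", "/x/mot-cle/y"]:
        print(path, is_allowed(path))
```

Under these assumptions, `/someone/blog/commentaires` is blocked by the `$`-anchored rule, while `/someone/blog/commentaires/2` falls back to `Allow /`. The named agents listed earlier (gptbot, ccbot, and the others) never reach this group, because their own groups disallow everything.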

Other Records

Field Value
crawl-delay 3
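
The crawl-delay record is a non-standard extension (it is not defined in RFC 9309), and crawlers interpret it differently. The sketch below assumes the common reading, a minimum of 3 seconds between successive requests to the host. The `Throttle` class and its parameters are hypothetical names introduced for illustration; the clock and sleep functions are injectable so the behaviour can be checked without real waiting.

```python
import time

CRAWL_DELAY_SECONDS = 3  # value taken from the crawl-delay record above


class Throttle:
    """Enforce a minimum interval between successive requests (assumed semantics)."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock
        self.sleep = sleep
        self.last_request = None  # time of the previous request, if any

    def wait(self):
        """Block until the delay has elapsed; return the seconds slept."""
        waited = 0.0
        if self.last_request is not None:
            remaining = self.delay - (self.clock() - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
                waited = remaining
        self.last_request = self.clock()
        return waited
```

A crawler would call `Throttle(CRAWL_DELAY_SECONDS).wait()` immediately before each fetch; the first call returns at once and later calls sleep only for whatever part of the interval is still outstanding.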

Comments

  • www.robotstxt.org/
  • www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449
  • Unwanted indexed contents