arte.fr
robots.txt

Robots Exclusion Standard data for arte.fr

Resource Scan

Scan Details

Site Domain arte.fr
Base Domain arte.fr
Scan Status Ok
Last Scan 2024-09-16T23:30:51+00:00
Next Scan 2024-09-23T23:30:51+00:00
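
The two timestamps above imply a weekly rescan. A minimal sketch that checks the arithmetic, assuming the values are ISO 8601 timestamps in UTC as displayed (the variable names are illustrative, not part of the report):

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2024-09-16T23:30:51+00:00")
next_scan = datetime.fromisoformat("2024-09-23T23:30:51+00:00")

# Difference between the scheduled next scan and the last completed scan.
interval = next_scan - last_scan
print(interval)  # 7 days, 0:00:00
```

Whether the scanner always uses a fixed seven-day interval is not stated in the report; this only confirms the gap between these two values.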

Last Scan

Scanned 2024-09-16T23:30:51+00:00
URL https://arte.fr/robots.txt
Redirect https://www.arte.tv/robots.txt
Redirect Domain www.arte.tv
Redirect Base arte.tv
Domain IPs 212.95.72.42
Redirect IPs 104.120.110.172, 2a02:26f0:9c00:19d::1b8c, 2a02:26f0:9c00:1a7::1b8c
Response IP 23.51.111.91
Found Yes
Hash 2f597273913bf40b30e58d459d26a6ab209dd902c36c8efe4d38fda182696885
SimHash a697519284f1
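
The Hash value is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below assumes SHA-256 over the raw response body. The helper names `sha256_hex` and `matches_report` are hypothetical.

```python
import hashlib

# Hash value copied from the Last Scan table above.
REPORTED_HASH = "2f597273913bf40b30e58d459d26a6ab209dd902c36c8efe4d38fda182696885"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a robots.txt body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Check a fetched body against the reported hash (assumes SHA-256)."""
    return sha256_hex(body) == REPORTED_HASH
```

If the assumption holds, a body fetched from the redirect target would match only while the file is unchanged since the scan; any edit, including whitespace, yields a different digest.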

Groups

dotbot

Rule Path
Disallow /

fasterfox

Rule Path
Disallow /

wisebot

Rule Path
Disallow /

converacrawler

Rule Path
Disallow /

scrubby

Rule Path
Disallow /

robozilla

Rule Path
Disallow /

nutch

Rule Path
Disallow /

psbot

Rule Path
Disallow /

asterias

Rule Path
Disallow /

cscrawler

Rule Path
Disallow /

larbin

Rule Path
Disallow /

sproose

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

reloado.com searchengine

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

fiddler

Rule Path
Disallow /

cloudtv

Rule Path
Disallow /

buck

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

twitterbot
facebookexternalhit

Rule Path
Disallow /admin/
Disallow /api/mami/
Disallow /cgi-bin/
Disallow /club-preprod/
Disallow /digital-preprod/
Disallow /hbbtvv2-preprod/
Disallow /magazine-dev/
Disallow /magazine-test/
Disallow /player-preprod/
Disallow /question-preprod/
Disallow /replay-preprod/
Disallow /resize-preprod/
Disallow /resize/
Disallow /search/
Disallow /search?q=
Disallow /sitespreprod/
Disallow /static-preprod/
Disallow /static/Tsunami/
Disallow /static/c0/
Disallow /static/c1/
Disallow /static/c2/
Disallow /static/c3/
Disallow /static/c4/
Disallow /static/c5/
Disallow /static/c6/
Disallow /static/c7/
Disallow /static/plokker/
Disallow /static/u1/
Disallow /static/u2/
Disallow /static/u3/
Disallow /static/u4/
Disallow /static/u5/
Disallow /static/u6/

Other Records

Field Value
crawl-delay 42

googlebot
bingbot
yandex
qwantify
duckduckbot

Rule Path
Disallow /admin/
Disallow /cgi-bin/
Disallow /club-preprod/
Disallow /digital-preprod/
Disallow /hbbtvv2-preprod/
Disallow /magazine-dev/
Disallow /magazine-test/
Disallow /player-preprod/
Disallow /question-preprod/
Disallow /replay-preprod/
Disallow /resize-preprod/
Disallow /search/
Disallow /search?q=
Disallow /sitespreprod/
Disallow /static-preprod/
Disallow /static/Tsunami/
Disallow /static/c0/
Disallow /static/c1/
Disallow /static/c2/
Disallow /static/c3/
Disallow /static/c4/
Disallow /static/c5/
Disallow /static/c6/
Disallow /static/c7/
Disallow /static/plokker/
Disallow /static/u1/
Disallow /static/u2/
Disallow /static/u3/
Disallow /static/u4/
Disallow /static/u5/
Disallow /static/u6/

Other Records

Field Value
crawl-delay 2

*

Rule Path
Disallow /admin/
Disallow /api/mami/
Disallow /cgi-bin/
Disallow /club-preprod/
Disallow /digital-preprod/
Disallow /hbbtvv2-preprod/
Disallow /magazine-dev/
Disallow /magazine-test/
Disallow /player-preprod/
Disallow /question-preprod/
Disallow /replay-preprod/
Disallow /resize-preprod/
Disallow /resize/
Disallow /search/
Disallow /search?q=
Disallow /sitespreprod/
Disallow /static-preprod/
Disallow /static/Tsunami/
Disallow /static/c0/
Disallow /static/c1/
Disallow /static/c2/
Disallow /static/c3/
Disallow /static/c4/
Disallow /static/c5/
Disallow /static/c6/
Disallow /static/c7/
Disallow /static/plokker/
Disallow /static/u1/
Disallow /static/u2/
Disallow /static/u3/
Disallow /static/u4/
Disallow /static/u5/
Disallow /static/u6/

Other Records

Field Value
crawl-delay 42
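
The groups above can be exercised with Python's standard `urllib.robotparser`. The sketch below uses a shortened excerpt, rewritten in ordinary robots.txt syntax from the tables above (one blocked agent, two of the search-engine agents, and the `*` group with only a few of its paths). It is not the full file, and other parsers may match user agents differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt reconstructed from the groups listed above (not the full file).
EXCERPT = """\
User-agent: wget
Disallow: /

User-agent: googlebot
User-agent: bingbot
Disallow: /admin/
Disallow: /search/
Crawl-delay: 2

User-agent: *
Disallow: /admin/
Disallow: /resize/
Disallow: /search/
Crawl-delay: 42
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())

# wget is blocked from the whole site.
print(parser.can_fetch("wget", "https://www.arte.tv/fr/"))           # False
# The googlebot group does not list /resize/, so that path is allowed for it.
print(parser.can_fetch("Googlebot", "https://www.arte.tv/resize/x"))  # True
# An agent with no group of its own falls back to the * group.
print(parser.can_fetch("SomeBot", "https://www.arte.tv/resize/x"))    # False

print(parser.crawl_delay("Googlebot"))  # 2
print(parser.crawl_delay("SomeBot"))    # 42
```

`SomeBot` is a made-up agent name used to show the fallback to the `*` group. Crawl-delay is a non-standard field that some crawlers ignore, so the values here show what the file requests rather than what every crawler will honor.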

Other Records

Field Value
sitemap https://www.arte.tv/sitemap.xml

Comments

  • This file is now maintained in GitHub
  • Version: v6.308.6164
  • Bad bots
  • IA_archiver
  • Zaphod bots
  • Good bots
  • Ugly bots and manically depressed robots deserve a long crawl delay