arte-tv.com
robots.txt

Robots Exclusion Standard data for arte-tv.com

Resource Scan

Scan Details

Site Domain arte-tv.com
Base Domain arte-tv.com
Scan Status Ok
Last Scan 2024-06-21T15:11:26+00:00
Next Scan 2024-06-28T15:11:26+00:00

Last Scan

Scanned 2024-06-21T15:11:26+00:00
URL https://arte-tv.com/robots.txt
Redirect https://www.arte.tv/robots.txt
Redirect Domain www.arte.tv
Redirect Base arte.tv
Domain IPs 212.95.72.42
Redirect IPs 104.68.72.117, 2a02:26f0:e200:286::1b8c, 2a02:26f0:e200:2b6::1b8c
Response IP 184.24.26.170
Found Yes
Hash 531ce9554a79b331cd9810ec0dc83c1b3e7c2ec7cf4485f382fa8dbaf14d9e00
SimHash a697519286f1
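
The report does not say how the Hash value is computed. Its 64 hexadecimal characters are consistent with a SHA-256 digest, so the sketch below assumes (without confirmation from the source) that it is a SHA-256 over the raw response body. The function names `body_digest` and `has_changed` and the sample bodies are hypothetical; the sketch only illustrates how such a digest could flag a changed robots.txt between two scans.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a response body (assumed hash scheme)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return body_digest(body) != previous_digest


# Hypothetical bodies standing in for two consecutive scans.
first_scan = b"User-agent: *\nDisallow: /admin/\n"
second_scan = b"User-agent: *\nDisallow: /admin/\nDisallow: /search/\n"

stored = body_digest(first_scan)
print(len(stored))                       # 64 hex characters, like the Hash field
print(has_changed(stored, first_scan))   # False
print(has_changed(stored, second_scan))  # True
```

An exact digest changes on any byte difference, which is presumably why the report also carries a SimHash field for approximate comparison; that algorithm is not described here and is not reproduced.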

Groups

dotbot

Rule Path
Disallow /

fasterfox

Rule Path
Disallow /

wisebot

Rule Path
Disallow /

converacrawler

Rule Path
Disallow /

scrubby

Rule Path
Disallow /

robozilla

Rule Path
Disallow /

nutch

Rule Path
Disallow /

psbot

Rule Path
Disallow /

asterias

Rule Path
Disallow /

cscrawler

Rule Path
Disallow /

larbin

Rule Path
Disallow /

sproose

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

reloado.com searchengine

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

fiddler

Rule Path
Disallow /

cloudtv

Rule Path
Disallow /

buck

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

twitterbot
facebookexternalhit

Rule Path
Disallow /admin/
Disallow /api/mami/
Disallow /cgi-bin/
Disallow /club-preprod/
Disallow /digital-preprod/
Disallow /hbbtvv2-preprod/
Disallow /magazine-dev/
Disallow /magazine-test/
Disallow /player-preprod/
Disallow /question-preprod/
Disallow /replay-preprod/
Disallow /resize-preprod/
Disallow /resize/
Disallow /search/
Disallow /search?q=
Disallow /sitespreprod/
Disallow /static-preprod/
Disallow /static/Tsunami/
Disallow /static/c0/
Disallow /static/c1/
Disallow /static/c2/
Disallow /static/c3/
Disallow /static/c4/
Disallow /static/c5/
Disallow /static/c6/
Disallow /static/c7/
Disallow /static/plokker/
Disallow /static/u1/
Disallow /static/u2/
Disallow /static/u3/
Disallow /static/u4/
Disallow /static/u5/
Disallow /static/u6/

Other Records

Field Value
crawl-delay 42

googlebot
bingbot
yandex
qwantify
duckduckbot

Rule Path
Disallow /admin/
Disallow /cgi-bin/
Disallow /club-preprod/
Disallow /digital-preprod/
Disallow /hbbtvv2-preprod/
Disallow /magazine-dev/
Disallow /magazine-test/
Disallow /player-preprod/
Disallow /question-preprod/
Disallow /replay-preprod/
Disallow /resize-preprod/
Disallow /search/
Disallow /search?q=
Disallow /sitespreprod/
Disallow /static-preprod/
Disallow /static/Tsunami/
Disallow /static/c0/
Disallow /static/c1/
Disallow /static/c2/
Disallow /static/c3/
Disallow /static/c4/
Disallow /static/c5/
Disallow /static/c6/
Disallow /static/c7/
Disallow /static/plokker/
Disallow /static/u1/
Disallow /static/u2/
Disallow /static/u3/
Disallow /static/u4/
Disallow /static/u5/
Disallow /static/u6/

Other Records

Field Value
crawl-delay 2

*

Rule Path
Disallow /admin/
Disallow /api/mami/
Disallow /cgi-bin/
Disallow /club-preprod/
Disallow /digital-preprod/
Disallow /hbbtvv2-preprod/
Disallow /magazine-dev/
Disallow /magazine-test/
Disallow /player-preprod/
Disallow /question-preprod/
Disallow /replay-preprod/
Disallow /resize-preprod/
Disallow /resize/
Disallow /search/
Disallow /search?q=
Disallow /sitespreprod/
Disallow /static-preprod/
Disallow /static/Tsunami/
Disallow /static/c0/
Disallow /static/c1/
Disallow /static/c2/
Disallow /static/c3/
Disallow /static/c4/
Disallow /static/c5/
Disallow /static/c6/
Disallow /static/c7/
Disallow /static/plokker/
Disallow /static/u1/
Disallow /static/u2/
Disallow /static/u3/
Disallow /static/u4/
Disallow /static/u5/
Disallow /static/u6/

Other Records

Field Value
crawl-delay 42
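
The groups above can be exercised with Python's standard `urllib.robotparser`. The sketch below is a minimal illustration, not the site's actual file: `ROBOTS_EXCERPT` is a hypothetical excerpt reconstructed from a few of the listed rules, the layout of the real file is not shown in this report, and the user-agent strings and URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt rebuilt from the groups in this report; the real
# file's ordering and formatting are assumptions.
ROBOTS_EXCERPT = """\
User-agent: wget
Disallow: /

User-agent: googlebot
User-agent: bingbot
Disallow: /admin/
Disallow: /search/
Crawl-delay: 2

User-agent: *
Disallow: /admin/
Disallow: /api/mami/
Disallow: /resize/
Disallow: /search/
Crawl-delay: 42

Sitemap: https://www.arte.tv/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_EXCERPT)

# A fully blocked agent, a named "good bot", and an agent that falls
# through to the catch-all "*" group.
print(rules.can_fetch("Wget/1.21", "https://www.arte.tv/fr/"))            # False
print(rules.can_fetch("Googlebot", "https://www.arte.tv/resize/a.jpg"))   # True
print(rules.can_fetch("SomeBot", "https://www.arte.tv/resize/a.jpg"))     # False
print(rules.crawl_delay("Googlebot"))                                     # 2
print(rules.crawl_delay("SomeBot"))                                       # 42
```

The standard-library parser matches agent names by substring and uses the first group that applies, so its answers may differ from those of a particular crawler's own matcher.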

Other Records

Field Value
sitemap https://www.arte.tv/sitemap.xml
sitemap https://www.arte.tv/sitemap.xml
sitemap https://www.arte.tv/sitemap.xml

Comments

  • This file is now maintained in Github
  • Version: v6.287.5907
  • Bad bots
  • IA_archiver
  • Zaphod bots
  • Good bots
  • Ugly bots and manically depressed robots deserve a long crawl delay