irisa.fr
robots.txt

Robots Exclusion Standard data for irisa.fr

Resource Scan

Scan Details

Site Domain irisa.fr
Base Domain irisa.fr
Scan Status Ok
Last Scan 2025-07-15T22:25:14+00:00
Next Scan 2025-08-14T22:25:14+00:00

Last Scan

Scanned 2025-07-15T22:25:14+00:00
URL https://www.irisa.fr/robots.txt
Domain IPs 131.254.254.107
Response IP 131.254.254.107
Found Yes
Hash 41dd179860c2c43760087d8fc74f22272bb8f1da286562eef8eb33d03ff0295d
SimHash 358f4bd1c9e4

Groups

*

Rule Path
Disallow */base_view
Disallow */bibliography_exportForm
Disallow */default_error_message
Disallow */file_view
Disallow */image_view
Disallow */link_view
Disallow */login_form
Disallow */raweb_view
Disallow */search_form
Disallow */sendto_form
Disallow */switchLanguage
Disallow /atelier/MRTG/
Disallow /autresimages
Disallow /bibli/
Disallow /dea/
Disallow /LIS/demo-area/
Disallow /paris/web/includes/patTemplate/patTemplate/InputFilter/.logs/
Disallow /paris/Biblio/www/Category/.tpl/
Disallow /paris/bibadmin/images/.thumbs/
Disallow /paris/Gamma/IMG/pdf/.debug/
Disallow /paris/Padicotm/download/Ressources/.link/
Disallow /sarima/
Disallow /theses99/
Disallow /vr4i/
Disallow /bunraku/
Disallow /temics/
Disallow /actualites-bis.html
Disallow /historique.html
Disallow /planSite.html
Disallow /actualites.html
Disallow /index.html
Disallow /talkVardi1.html
Disallow /contact.html
Disallow /invit.html
Disallow /talkVardi.html
Disallow /evenement.html
Disallow /localisation.html
Disallow /tutelles.html
Disallow /format-echange-document.html
Disallow /mentionsLeg.html
Disallow /googlebca65901b9014f86.html
Disallow /partenaires.html
Disallow /user
Disallow /doc/
Disallow /DOCS/
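
Several of the paths above begin with `*`, so a plain prefix comparison is not enough to tell whether a given URL is covered. The following is a minimal sketch of how such patterns could be evaluated, assuming the wildcard semantics described in RFC 9309 (`*` matches any sequence of characters, a trailing `$` anchors the end of the path). The helper names and the sample request paths are hypothetical, and only a handful of the rules listed above are reproduced.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end;
    everything else is matched literally, starting at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any Disallow pattern matches the URL path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)

# A few rules copied from the '*' group; the rest are omitted for brevity.
SAMPLE_RULES = ["*/login_form", "/bibli/", "/user", "/index.html"]

# Hypothetical request paths, chosen only for illustration.
for candidate in ("/some/section/login_form", "/bibli/report.pdf",
                  "/users/foo", "/bibliography", "/research/"):
    print(candidate, is_disallowed(candidate, SAMPLE_RULES))
```

Because the group contains only Disallow rules, the sketch does not need the longest-match precedence that applies when Allow and Disallow rules compete.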

ocelli

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20

ai2bot
ai2bot-dolma
amazonbot
amazonbot/0.1
anthropic-ai
applebot
applebot-extended
brightbot 1.0
bytespider
ccbot
chatgpt-user
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
crawlspace
diffbot
duckassistbot
facebookbot
friendlycrawler
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
oai-searchbot
omgili
omgilibot
pangubot
perplexitybot
petalbot
scrapy
semrushbot-ocob
semrushbot-swa
semrushbot*
sidetrade indexer bot
timpibot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Disallow /
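
To see how a crawler would pick its group from this report, the sketch below feeds a small fragment to Python's standard `urllib.robotparser`. The fragment is a hypothetical reconstruction, not the actual file: it keeps two of the listed agents, assumes the `ocelli` group is written with an empty `Disallow:` line, and copies two rules from the `*` group. The `crawl-delay` record is left out because the scan lists it outside any group. `ExampleBot` and the request paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of part of the file (layout is an assumption).
FRAGMENT = """\
User-agent: GPTBot
User-agent: ClaudeBot
Disallow: /

User-agent: ocelli
Disallow:

User-agent: *
Disallow: /bibli/
Disallow: /user
"""

def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(FRAGMENT)
BASE = "https://www.irisa.fr"

# A named group takes precedence over '*'; unknown agents fall back to '*'.
for agent in ("GPTBot", "ocelli", "ExampleBot"):
    for path in ("/", "/bibli/index"):
        print(agent, path, parser.can_fetch(agent, BASE + path))
```

Note that `urllib.robotparser` compares paths by prefix only, so it would not interpret the `*/...` wildcard rules of the `*` group; those are deliberately omitted from the fragment.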

Comments

  • File instructing automatic indexing mechanisms that respect this
  • convention not to crawl certain parts of this server
  • DL, 30/10/96