mediaclip.ina.fr
robots.txt

Robots Exclusion Standard data for mediaclip.ina.fr

Resource Scan

Scan Details

Site Domain mediaclip.ina.fr
Base Domain ina.fr
Scan Status Ok
Last Scan 2025-05-25T17:21:59+00:00
Next Scan 2025-06-24T17:21:59+00:00

Last Scan

Scanned 2025-05-25T17:21:59+00:00
URL https://mediaclip.ina.fr/robots.txt
Domain IPs 194.127.250.17
Response IP 194.127.250.17
Found Yes
Hash 5af1e52f49be2678ebef88dbed962e9df7d782c4b096b5f861649d0559b1ae3f
SimHash 773ec909c2e6

Groups

ai2bot
ai2bot-dolma
aihitbot
amazonbot
anthropic-ai
applebot
applebot-extended
brightbot 1.0
bytespider
ccbot
chatgpt-user
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
diffbot
duckassistbot
facebookbot
factset_spyderbot
firecrawlagent
friendlycrawler
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
imgproxy
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalagent
meta-externalfetcher
meta-externalfetcher
novaact
oai-searchbot
omgili
omgilibot
operator
pangubot
perplexity-user
perplexitybot
petalbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
tiktokspider
timpibot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Disallow /
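
The effect of this group can be checked with Python's standard urllib.robotparser. The sketch below is illustrative only: the robots.txt text is an abbreviated reconstruction (three of the listed agents plus one rule from the * group), not the file as served, and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction (assumption): only three of the listed
# agents are included, plus one rule from the "*" group.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /

User-agent: *
Disallow: /checkout/
"""


def is_allowed(agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("gptbot", "claudebot", "examplebot"):
        print(agent, is_allowed(agent, "https://mediaclip.ina.fr/"))
```

An agent named in the group is refused every path, while an unlisted agent falls through to the * group and is only refused the paths listed there.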

*

Rule Path
Disallow /?
Disallow /*.php$
Disallow /checkout/
Disallow /app/
Disallow /lib/
Disallow /catalogsearch/
Disallow /search/
Disallow /*?PHPSESSID=
Disallow /pkginfo/
Disallow /report/
Disallow /var/
Disallow /catalog/
Disallow /customer/
Disallow /sendfriend/
Disallow /review/
Disallow /*SID%3D
Disallow /store/
Disallow /stores/
Disallow /*result/?q=
Disallow /*customer/
Disallow /*inacatalog/product/download/product/
Disallow /*stores/store/redirect/
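
Several of these paths use the * wildcard and the $ end anchor, which urllib.robotparser compares literally rather than expanding. The sketch below shows one way to evaluate such patterns by translating them to regular expressions. rule_to_regex and is_disallowed are hypothetical names, only a subset of the rules is included, and because this group holds only Disallow rules, no Allow/Disallow precedence logic is needed here.

```python
import re

# Subset of the "*" group's Disallow paths, copied from the table above.
DISALLOW = [
    "/?",
    "/*.php$",
    "/checkout/",
    "/*?PHPSESSID=",
    "/*SID%3D",
    "/*result/?q=",
    "/*customer/",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the
    pattern to the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow pattern matches from the start of `path`."""
    return any(rule_to_regex(p).match(path) for p in DISALLOW)


if __name__ == "__main__":
    for path in ("/index.php", "/index.php?x=1", "/fr/customer/account", "/video/123"):
        print(path, is_disallowed(path))
```

Note that /*.php$ blocks /index.php but not /index.php?x=1, because the $ anchor requires the path to end in .php.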

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://mediaclip.ina.fr/sitemaps/sitemap.xml
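
The crawl-delay and sitemap records can be read with the same standard-library parser. This is a sketch under an assumption: the report does not say which group the crawl-delay record belongs to, so the reconstruction below places it in the * group. read_records is a hypothetical helper name.

```python
from typing import List, Optional, Tuple
from urllib.robotparser import RobotFileParser

# Reconstruction (assumption): crawl-delay is placed in the "*" group;
# the scan report does not state which group carries it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 10

Sitemap: https://mediaclip.ina.fr/sitemaps/sitemap.xml
"""


def read_records(agent: str) -> Tuple[Optional[int], Optional[List[str]]]:
    """Return (crawl delay in seconds, sitemap URLs) for `agent`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    # crawl_delay() needs Python 3.6+, site_maps() needs Python 3.8+.
    return parser.crawl_delay(agent), parser.site_maps()


if __name__ == "__main__":
    delay, sitemaps = read_records("examplebot")
    print("delay:", delay)
    print("sitemaps:", sitemaps)
```

A polite crawler would wait at least the returned number of seconds between requests, for example with time.sleep(delay).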

Comments

  • robots.txt
  • AI bots
  • SEO