ricardo.ch
robots.txt

Robots Exclusion Standard data for ricardo.ch

Resource Scan

Scan Details

Site Domain ricardo.ch
Base Domain ricardo.ch
Scan Status Ok
Last Scan 2024-11-11T14:02:11+00:00
Next Scan 2024-11-18T14:02:11+00:00

Last Scan

Scanned 2024-11-11T14:02:11+00:00
URL https://ricardo.ch/robots.txt
Redirect https://www.ricardo.ch/robots.txt
Redirect Domain www.ricardo.ch
Redirect Base ricardo.ch
Domain IPs 104.18.42.46, 172.64.145.210
Redirect IPs 104.18.42.46, 172.64.145.210
Response IP 104.18.42.46
Found Yes
Hash 291cfb812b7f022031a749f750c5541e4c85ef1a1bae21cca71776caac6cb030
SimHash 355ccb4cabfa

Groups

*

Rule Path
Disallow /en/*
Disallow /en$
Disallow /de/login
Disallow /fr/login
Disallow /it/login
Disallow /de/regist
Disallow /fr/regist
Disallow /it/regist
Disallow /*/bulk/landing
Disallow /*/my-ricardo
Disallow /*/bookkeeping
Disallow /de/profile
Disallow /fr/profile
Disallow /it/profile
Disallow /de/list
Disallow /fr/list
Disallow /it/list
Disallow /*/s/
Disallow /*/b/*?
Disallow /*/c/*/*/
Disallow /*/c/*/
Allow /*/c/*/$
Allow /*/c/*?
Allow /*/c/o/*/
Disallow /en/c/*/$
Disallow /en/c/*?
Disallow /en/c/o/*/
Disallow /*/shop/*/offers/
Disallow /*/shop/*/offers?
Disallow /*/shop/*/offers*?page=*&
Allow /*/shop/*/offers$
Allow /*/shop/*/offers/$
Allow /*/shop/*/offers?page=
Allow /*/shop/*/offers/?page=
Disallow /*/shop/*/ratings*?
Disallow /en/shop/*/offers$
Disallow /en/shop/*/offers/$
Disallow /en/shop/*/offers?page=
Disallow /en/shop/*/offers/?page=
Disallow /api/browser-statistics/
Disallow /api/frontend/search-autocomplete
Disallow /api/frontend/notifications
Disallow /api/frontend/categories/category-bar
Disallow /api/mfa/categories
Allow /api/mfa/categories/*/promo-offers
Disallow /api/mfa/notifications
Disallow /api/mfa/user/save-seller-id
Disallow /px/pv
Disallow /api/listing-form
Disallow /marketplace-spa/api/questions
Disallow /marketplace-spa/api/fee
Disallow /marketplace-spa/api/pdp
Allow /marketplace-spa/api/pdp/youtube
Disallow /dataservice/
Disallow /pages/
Disallow /ajax/
Disallow /viewitem.aspx
Disallow /*feed.xml
Disallow /online-shop/
Disallow /shop/
Disallow /ratings/
Disallow /pages/*/fr.php
Disallow /auk/
Disallow /acheter/
Disallow /kaufen/
Disallow /zubehoer/
Disallow */v/
Disallow */w/
Disallow /cdn-cgi/
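
The group above mixes overlapping Allow and Disallow patterns that use the `*` wildcard and the `$` end anchor. The following is a minimal sketch of how a crawler might evaluate such rules, assuming the "longest matching pattern wins, Allow wins ties" precedence described in RFC 9309. The function names and the RULES subset are illustrative and are not part of the scan data; individual crawlers may resolve conflicts differently.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (kind, pattern) tuples, kind being "allow" or
    "disallow". The longest matching pattern wins; on equal length,
    Allow wins. With no matching rule, the path is allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        # Pattern.match() anchors at the start of the path.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow

# Illustrative subset of the "*" group listed above.
RULES = [
    ("disallow", "/en/*"),
    ("disallow", "/*/c/*/*/"),
    ("disallow", "/*/c/*/"),
    ("allow", "/*/c/*/$"),
    ("allow", "/*/c/*?"),
    ("disallow", "/en/c/*/$"),
]

for path in ("/de/c/foo/", "/de/c/foo/bar/", "/en/c/foo/", "/de/a/item"):
    print(path, is_allowed(RULES, path))
```

Under these assumptions, a single-level category path such as `/de/c/foo/` is allowed because `Allow /*/c/*/$` is longer than `Disallow /*/c/*/`, while the same path under `/en/` stays blocked by the more specific `Disallow /en/c/*/$`. The paths used here are hypothetical examples, not URLs taken from the site.
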

psbot
amazonbot
yandex
petalbot
mail.ru_bot
megaindex
baiduspider
yisouspider
bytespider
sogou web spider
sogou inst spider
proximic
admantx
ahrefs
oncrawl
seekport crawler
semrushbot
blexbot
mj12bot
zoombot
linkbot
chatgpt-user
dotbot

Rule Path
Disallow /
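
Several groups in this file apply to specific crawlers, so a client first has to decide which group governs it. The sketch below shows one possible selection strategy: a case-insensitive substring match of the group token against the crawler's user-agent string, preferring the longest matching token and falling back to `*`. This strategy, the function name, and the sample user-agent strings are assumptions for illustration; real crawlers may match product tokens differently.

```python
def select_group(user_agent: str, groups: dict[str, str]) -> str:
    """Return the rule-set label that applies to `user_agent`.

    `groups` maps a lowercase user-agent token (or "*") to a label
    describing that group's rules. The longest token contained in the
    user-agent string wins; "*" is used when nothing else matches.
    """
    ua = user_agent.lower()
    matches = [token for token in groups if token != "*" and token in ua]
    if not matches:
        return groups["*"]
    return groups[max(matches, key=len)]

# Hypothetical, shortened view of the groups in this report.
GROUPS = {
    "*": "default rules",
    "semrushbot": "Disallow /",
    "yandex": "Disallow /",
    "adsbot-google": "AdsBot rules",
    "adsbot-google-mobile": "AdsBot rules",
}

print(select_group("SemrushBot/7", GROUPS))    # Disallow /
print(select_group("ExampleBot/1.0", GROUPS))  # default rules
```
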

adsbot-google-mobile
adsbot-google

Rule Path
Disallow /en/
Disallow /en$
Disallow /de/login
Disallow /fr/login
Disallow /it/login
Disallow /de/regist
Disallow /fr/regist
Disallow /it/regist
Disallow /*/bulk/landing
Disallow /*/my-ricardo
Disallow /*/bookkeeping
Disallow /de/profile
Disallow /fr/profile
Disallow /it/profile
Disallow /de/list
Disallow /fr/list
Disallow /it/list
Disallow /*/s$
Disallow /*/s/$
Disallow /*/s/*?
Disallow /*/b/*?
Allow /*/s/?listing_type=money_guard$
Disallow /*/shop/*/
Disallow /*?*sort=
Disallow /*?*ar=
Disallow /*?*qcn=
Disallow /*?*ignore_dominant=
Disallow /*?*nextOffset
Disallow /*?*next_offset
Disallow /*?*category=
Disallow /cdn-cgi/
Disallow /api/browser-statistics/
Disallow /api/frontend/search-autocomplete
Disallow /api/frontend/notifications
Disallow /api/frontend/categories
Allow /api/mfa/categories/*/promo-offers
Disallow /px/pv
Disallow /marketplace-spa/api/
Disallow /api/mfa/
Disallow /assets/search/
Disallow /api/listing-form
Disallow /static-assets/marketplace-spa/prod/_next/static/chunks/pages/_app-

criteobot/0.1

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 0.2
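
The `crawl-delay` value is commonly read as the minimum number of seconds between successive requests, which would line up with the file's comment about limiting the Criteo crawler to 5 requests per second. The directive is not part of RFC 9309 and its interpretation varies by crawler, so the conversion below is only a sketch under that assumption; the function name is illustrative.

```python
def max_requests_per_second(crawl_delay_seconds: float) -> float:
    """Convert a crawl-delay (seconds between requests) to a request rate."""
    if crawl_delay_seconds <= 0:
        raise ValueError("crawl-delay must be positive")
    return 1.0 / crawl_delay_seconds

rate = max_requests_per_second(0.2)
print(f"{rate:.1f} requests per second")  # 5.0 requests per second
```
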

grapeshot

Rule Path
Disallow (empty path; nothing is disallowed)

Comments

  • robots.txt for https://www.ricardo.ch/
  • English pages until release
  • Don't try to index the login page, you will get a Cloudflare challenge
  • After login pages
  • SRP
  • BSRP
  • CSRP
  • Sellers
  • Supporting files
  • Legacy
  • Disallow cloudflare /cdn-cgi/ endpoint.
  • See https://developers.cloudflare.com/fundamentals/get-started/reference/cdn-cgi-endpoint/
  • Disallow commercial bots we don't like
  • User agent names for Google AdsBot can be found here: https://support.google.com/webmasters/answer/1061943?hl=en
  • Limit criteo crawler to 5 req/sec max as it generates imaginary-wrapper traffic peaks
  • Allow Oracle Data Cloud Crawler

Warnings

  • 1 invalid line.