youscribe.com
robots.txt

Robots Exclusion Standard data for youscribe.com

Resource Scan

Scan Details

Site Domain youscribe.com
Base Domain youscribe.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a server error.
Last Scan 2024-10-06T21:08:16+00:00
Next Scan 2024-12-05T21:08:16+00:00

Last Successful Scan

Scanned 2024-07-16T21:06:40+00:00
URL https://youscribe.com/robots.txt
Redirect https://www.youscribe.com/robots.txt
Redirect Domain www.youscribe.com
Redirect Base youscribe.com
Domain IPs 35.186.237.217
Redirect IPs 35.186.237.217
Response IP 35.186.237.217
Found Yes
Hash 0d32bb4888ca60ba00ed9504251dc00437bd8fd47b62d47a39819f91a165c63a
SimHash e6d3691cefd7
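
The Hash value above is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. Whether the scanner actually uses SHA-256, and over exactly which bytes, is an assumption here; the report does not say. Under that assumption, a fingerprint of the same shape could be computed as in this minimal sketch (`content_hash` is a hypothetical helper name):

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 digest of a robots.txt body as lowercase hex.

    Assumption: the scanner hashes the raw response body. The report
    does not document the algorithm or any normalisation step.
    """
    return hashlib.sha256(body).hexdigest()


# Two fetches with byte-identical bodies produce the same digest,
# so a changed digest signals that the file content changed.
first = content_hash(b"User-agent: *\nDisallow: /Account/\n")
second = content_hash(b"User-agent: *\nDisallow: /Account/\n")
unchanged = first == second
```

Comparing digests between scans is a cheap way to detect that a file changed without storing or diffing the full text.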

Groups

mediapartners-google

Rule Path
Disallow /search?*
Disallow /search/*
Disallow /Product/Download/
Disallow /Account/
Disallow /Cart
Disallow /Cart/
Disallow /Product/DownloadFile/

*

Rule Path
Disallow /*/publications/*excluded_language_id*
Disallow /*/publications/*excluded_theme_id*
Disallow /*/publications/*not_in_theme_id*
Disallow /Account/
Disallow /Account/AddToLibrary/
Disallow /Account/LoadReplies/
Disallow /BnFService
Disallow /BookReader/
Disallow /BookReader/Index*
Disallow /BookReader/CanSearch*
Disallow /BookReader/Embed*
Disallow /BookReader/EmbedJs*
Disallow /BookReader/EmbedPreview*
Disallow /BookReader/FontCss*
Disallow /BookReader/HtmlPage*
Disallow /BookReader/IframeEmbed*
Disallow /BookReader/Info*
Disallow /BookReader/MainTemplate*
Disallow /BookReader/Page*
Disallow /BookReader/Print*
Disallow /BookReader/RatioImage*
Disallow /BookReader/Search*
Disallow /BookReader/AudioReader*
Disallow /Cart
Disallow /Cart/
Disallow /Catalog/Certify*
Disallow /Catalog/LoadProducts*
Disallow /Catalog/Store*access_type%3D*
Disallow /Catalog/Store*category_id%3D*
Disallow /Catalog/Store*is_free%3D*
Disallow /Catalog/Store*is_french_domain%3D*
Disallow /Catalog/Store*language_id%3D*
Disallow /Catalog/Store*sort%3D*
Disallow /Catalog/Store*tag_id%3D*
Disallow /Catalog/Store*theme_id%3D*
Disallow /catalogue/*access_type%3D*
Disallow /catalogue/*category_id%3D*
Disallow /catalogue/*is_free%3D*
Disallow /catalogue/*is_french_domain%3D*
Disallow /catalogue/*language_id%3D*
Disallow /catalogue/*price_group%3D*
Disallow /catalogue/*sort%3D*
Disallow /catalogue/*tag_id%3D*
Disallow /catalogue/*theme_id%3D*
Disallow /DataCapture/
Disallow /Error
Disallow /Error/
Disallow /Facebook/
Disallow /Feedback/
Disallow /FlashReader/Page
Disallow /Image/
Disallow /page/*access_type%3D*
Disallow /page/*category_id%3D*
Disallow /page/*is_free%3D*
Disallow /page/*is_french_domain%3D*
Disallow /page/*language_id%3D*
Disallow /page/*price_group%3D*
Disallow /page/*sort%3D*
Disallow /page/*tag_id%3D*
Disallow /page/*theme_id%3D*
Disallow /page/products/
Disallow /Premium/LeadSharingProcess/
Disallow /Premium/Payment/
Disallow /Product/Certify/
Disallow /Product/Download/
Disallow /Product/DownloadFile/
Disallow /Product/LoadCommentReplies*
Disallow /Product/LoadLikeProducts/
Disallow /Product/LoadRelatedProducts/
Disallow /Product/PublishComplete/
Disallow /Product/Report/
Disallow /Product/ReportInfraction/
Disallow /Product/Share/
Disallow /ProductPage/LoadCommentReplies*
Disallow /ProductPage/ReplyComment/
Disallow /Public/GetLibraryProducts/
Disallow /Search
Disallow /Search/
Disallow /Stat/ViewProduct/
Disallow /Static/ConcoursPubliezEtGagnez
Disallow /Static/Js/
Disallow /Static/MeteoJob/
Disallow /Static/Quiz/*
Disallow /Static/RedirectAction
Disallow /Static/Tracking
Disallow /Subscription/*
Disallow /tag/*access_type%3D*
Disallow /tag/*category_id%3D*
Disallow /tag/*is_free%3D*
Disallow /tag/*is_french_domain%3D*
Disallow /tag/*language_id%3D*
Disallow /tag/*page%3D*
Disallow /tag/*sort%3D*
Disallow /tag/*tag_id%3D*
Disallow /tag/*theme_id%3D*
Disallow /*?ni=1
Disallow /communaute/
Disallow /communaute/*
Disallow /plan/communaute/
Disallow /plan/communaute/*
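
Many of the rules above rely on the `*` wildcard. As a rough illustration of how a crawler might evaluate them, the sketch below follows the matching described in RFC 9309: a rule matches from the start of the path, `*` stands for any run of characters, and a trailing `$` anchors the end. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the sketch ignores percent-encoding normalisation and Allow rules.

```python
import re


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    end of the path. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches the start of path."""
    # re.match anchors at the beginning, which gives prefix semantics.
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A few rules copied from the groups listed above.
SAMPLE_RULES = ["/search?*", "/Cart", "/*?ni=1", "/catalogue/*sort%3D*"]
```

With these sample rules, `is_disallowed("/fr/ebooks?ni=1", SAMPLE_RULES)` is true and `is_disallowed("/catalogue/ebooks", SAMPLE_RULES)` is false. Because matching is by prefix, `/Cart` already covers `/Cart/`, and a trailing `*` does not change what a rule matches.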

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /
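
The per-agent groups above each block the whole site with `Disallow /`, while other crawlers fall through to the `*` group. The sketch below checks that behaviour with Python's standard `urllib.robotparser`. The excerpt is reconstructed in robots.txt syntax from this listing (the raw file is not reproduced here), and `ExampleBot` is a made-up agent name. The standard-library parser treats rule paths as plain prefixes, so the excerpt uses only wildcard-free rules.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a hand-written excerpt mirroring two groups from the listing.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /Account/
Disallow: /Cart

User-agent: wget
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

# wget has its own group, so every path is off limits for it.
wget_allowed = parser.can_fetch(
    "Wget/1.21", "https://www.youscribe.com/catalogue/"
)

# An agent without its own group is governed by the '*' group.
other_allowed = parser.can_fetch(
    "ExampleBot", "https://www.youscribe.com/catalogue/"
)
other_account_allowed = parser.can_fetch(
    "ExampleBot", "https://www.youscribe.com/Account/Login"
)
```

Here `wget_allowed` and `other_account_allowed` come out false, and `other_allowed` comes out true.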

Other Records

Field Value
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_audiobooks.xml
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_bd.xml
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_documents.xml
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_documents_scolaires.xml
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_ebooks.xml
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_partitions.xml
sitemap https://www.youscribe.com/sitemaps/ys_fr_catalog_presse.xml
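
The sitemap records above sit outside any user-agent group. A minimal sketch of pulling them out of raw robots.txt text follows; `sitemap_urls` is a hypothetical helper, and the sample input is reconstructed from two of the rows above rather than copied from the live file.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the values of all 'Sitemap' fields, in file order."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split on the first ':' only, since URLs contain ':' themselves.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Assumption: sample text rebuilt from the listing above.
SAMPLE = """\
User-agent: *
Disallow: /Account/
Sitemap: https://www.youscribe.com/sitemaps/ys_fr_catalog_bd.xml
sitemap: https://www.youscribe.com/sitemaps/ys_fr_catalog_ebooks.xml
"""

found = sitemap_urls(SAMPLE)
```

The field name is compared case-insensitively, so `Sitemap:` and `sitemap:` are both picked up.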

Comments

  • Sorry, wget in its recursive mode is a frequent problem. Please read the man page and use it properly; there is a --wait option you can use to set the delay between hits, for instance.
  • The 'grub' distributed client has been *very* poorly behaved.
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/
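
The comments above object to clients that hit the server many times per second and point to wget's `--wait` option as the remedy. The same idea, pausing between consecutive requests, can be sketched in a few lines. `fetch_politely` is a hypothetical helper; the `fetch` callable and the one-second default are assumptions, not values taken from the file.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in turn, pausing between consecutive requests.

    `fetch` is supplied by the caller (for example a wrapper around an
    HTTP client). `sleep` is injectable so the pacing can be tested
    without actually waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

A caller would pass its own HTTP function as `fetch` and should still check the robots.txt rules before requesting each URL.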