clubxv.com
robots.txt

Robots Exclusion Standard data for clubxv.com

Resource Scan

Scan Details

Site Domain clubxv.com
Base Domain clubxv.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-09-25T14:56:43+00:00
Next Scan 2024-10-25T14:56:43+00:00

Last Successful Scan

Scanned 2024-08-27T05:55:46+00:00
URL https://clubxv.com/robots.txt
Domain IPs 84.18.212.98
Response IP 84.18.212.98
Found Yes
Hash 5a9054577e51dfb017a93c0c0af35db31656e6f12cf6d70864185d281227d3aa
SimHash e9740d1d3cee
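
The Hash field above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming SHA-256, a scanner could detect a changed robots.txt between scans roughly as sketched below. The function names and the sample body are hypothetical and are not taken from the report.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the report's Hash field is a SHA-256 digest; the report
    itself does not say which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return content_fingerprint(body) != previous_hash.lower()


# Hypothetical sample body, not the real file served by the site.
sample = b"User-agent: ahrefsbot\nDisallow: /\n"
digest = content_fingerprint(sample)
print(len(digest), has_changed(digest, sample))
```

An exact digest only says whether the bytes differ at all. The separate SimHash field presumably serves as a similarity fingerprint for near-duplicate detection, which this sketch does not attempt to reproduce.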

Groups

barkrowler

Rule Path
Disallow /

domaincrawler/3.0

Rule Path
Disallow /

spbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

linguee

Rule Path
Disallow /

linkdexbot/2.0

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /
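
The groups above can be checked programmatically. The following is a minimal sketch, assuming the original file held one `User-agent` line and one `Disallow: /` rule per group, as the listing suggests. It rebuilds such a file from the group names and queries it with Python's standard `urllib.robotparser`. The helper names and the `Googlebot` query are illustrative and do not come from the report.

```python
from urllib.robotparser import RobotFileParser

# Group names copied from the scan report; each carries a single "Disallow: /".
BLOCKED_AGENTS = [
    "barkrowler", "domaincrawler/3.0", "spbot", "ahrefsbot", "alphaseobot",
    "bubing", "cliqzbot", "dotbot", "linguee", "linkdexbot/2.0", "mj12bot",
    "semrushbot", "seznambot", "turnitinbot", "screaming frog seo spider",
]


def build_robots_txt(agents):
    """Rebuild a robots.txt body with one blank-line-separated group per agent."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in agents]
    return "\n".join(groups)


def is_allowed(user_agent, url, robots_txt):
    """Return True if the parsed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(BLOCKED_AGENTS)
for agent in ("AhrefsBot", "MJ12bot", "Googlebot"):
    print(agent, is_allowed(agent, "https://clubxv.com/", robots_txt))
```

Because the file has no `User-agent: *` group, any crawler that is not listed remains allowed. Note also that the standard-library parser compares only the part of the queried agent name before the first slash, so versioned group names such as `domaincrawler/3.0` may not match the way a real crawler would interpret them.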

Comments

  • ABOUT THIS FILE
  • -------------------------------------------------------------------
  • Version: 0.2.2
  • Updated: 27/06/2019
  • Source: https://gitlab.com/beepmode/robotctl
  • This robots.txt file contains robots that have been observed to hit
  • servers hard _and_ which serve little to no purpose. Most of these
  • bots appear to respect robots.txt files. Bots that appear to ignore
  • robots.txt files are listed separately.
  • Note that this file doesn't aim to be an exhaustive list of nasty
  • bots. Rather, it is based on bots I am seeing on my own servers.
  • If you don't mind a robots.txt file that is over 3,500 lines long
  • then you could use this file instead:
  • https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker/blob/master/robots.txt/robots.txt
  • NAUGHTY BOTS THAT MAY IGNORE THIS FILE
  • -------------------------------------------------------------------
  • The following are bots that appear to ignore robots.txt files. You
  • may want to block these via a ModSec rule instead.
  • Bot: Barkrowler (http://www.exensa.com/crawl)
  • Description: Vague bot from a vague company that doesn't provide
  • robots.txt info.
  • Bot: DomainCrawler (http://www.domaincrawler.com)
  • Description: Useless, outdated bot that appears to disregard
  • robots.txt files.
  • Bot: Spbot (http://openLinkprofiler.org/bot)
  • Description: Yet another useless SEO bot. No longer provides
  • robots.txt information.
  • BOTS THAT WILL RESPECT THIS FILE
  • -------------------------------------------------------------------
  • Bot: AhrefsBot (https://ahrefs.com/robot)
  • Description: Yet another useless SEO bot.
  • Bot: AlphaBot/3.2; (http://alphaseobot.com/bot.html)
  • Description: Yet another useless SEO bot.
  • Bot: BubiNG (http://law.di.unimi.it/BUbiNG.html)
  • Description: vague, open source bot.
  • Bot: Cliqzbot (https://cliqz.com/en/cliqzbot)
  • Description: Proprietary bot by small company that lets you
  • perform searches in your www browser (as opposed to using the
  • Yellow Pages?).
  • Bot: Dotbot (http://www.opensiteexplorer.org/dotbot)
  • Description: Useless SEO bot that can hit websites hard.
  • Bot: Linguee
  • Description: Bot used for a proprietary translation app. It's
  • not clear why the bot lives and it has been caught hitting
  • websites hard.
  • Bot: linkdexbot (http://www.linkdex.com/bots/)
  • Description: Yet another useless SEO bot.
  • Bot: MJ12bot (http://mj12bot.com)
  • Description: Bot that wants to understand and paint a map of the
  • internet.
  • Bot: SemrushBot (https://www.semrush.com/bot/)
  • Description: Yet another useless SEO bot.
  • Bot: SeznamBot (http://napoveda.seznam.cz/en/seznambot-intro/)
  • Description: Czech search engine. Nothing wrong with that but the
  • bot can trigger a huge number of hits.
  • Bot: TurnitinBot (https://turnitin.com/robot/crawlerinfo.html)
  • Description: proprietary, commercial anti-plagiarism bot.
  • Bot: Screaming Frog SEO Spider (https://www.screamingfrog.co.uk/seo-spider/faq/)
  • Description: Yet another SEO bot.