superxv.tv
robots.txt

Robots Exclusion Standard data for superxv.tv

Resource Scan

Scan Details

Site Domain superxv.tv
Base Domain superxv.tv
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-09-24T19:17:50+00:00
Next Scan 2024-10-24T19:17:50+00:00

Last Successful Scan

Scanned 2024-08-26T18:40:19+00:00
URL https://superxv.tv/robots.txt
Domain IPs 84.18.212.98
Response IP 84.18.212.98
Found Yes
Hash 5a9054577e51dfb017a93c0c0af35db31656e6f12cf6d70864185d281227d3aa
SimHash e9740d1d3cee

Groups

barkrowler

Rule Path
Disallow /

domaincrawler/3.0

Rule Path
Disallow /

spbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

linguee

Rule Path
Disallow /

linkdexbot/2.0

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /
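
The groups above each pair one user-agent token with a single `Disallow /` rule, and no catch-all (`*`) group is listed. As a rough illustration of how a crawler might evaluate such rules, the sketch below feeds a reconstructed subset of the groups to Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text and the `is_allowed` helper are assumptions made for this example; the text is rebuilt from the table, not copied from the site's file.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from three of the groups listed above; this is an
# illustrative subset, not the site's actual robots.txt.
ROBOTS_TXT = """\
User-agent: barkrowler
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: mj12bot
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules in ROBOTS_TXT permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A listed bot is denied everywhere; an unlisted one is not restricted.
    print(is_allowed("AhrefsBot", "https://superxv.tv/"))
    print(is_allowed("Googlebot", "https://superxv.tv/"))
```

Matching in this parser is case-insensitive on the product token, so `AhrefsBot` matches the `ahrefsbot` group. Other parsers may treat tokens differently, so results for a given crawler are not guaranteed to agree with this sketch.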

Comments

  • ABOUT THIS FILE
  • -------------------------------------------------------------------
  • Version: 0.2.2
  • Updated: 27/06/2019
  • Source: https://gitlab.com/beepmode/robotctl
  • This robots.txt file contains robots that have been observed to hit
  • servers hard _and_ which serve little to no purpose. Most of these
  • bots appear to respect robots.txt files. Bots that appear to ignore
  • robots.txt files are listed separately.
  • Note that this file doesn't aim to be an exhaustive list of nasty
  • bots. Rather, it is based on bots I am seeing on my own servers.
  • If you don't mind a robots.txt file that is over 3,500 lines long
  • then you could use this file instead:
  • https://github.com/mitchellkrogza/apache-ultimate-bad-bot-blocker/blob/master/robots.txt/robots.txt
  • NAUGHTY BOTS THAT MAY IGNORE THIS FILE
  • -------------------------------------------------------------------
  • The following are bots that appear to ignore robots.txt files. You
  • may want to block these via a ModSec rule instead.
  • Bot: Barkrowler (http://www.exensa.com/crawl)
  • Description: Vague bot from a vague company that doesn't provide
  • robots.txt info.
  • Bot: DomainCrawler (http://www.domaincrawler.com)
  • Description: Useless, outdated bot that appears to disregard
  • robots.txt files.
  • Bot: Spbot (http://openLinkprofiler.org/bot)
  • Description: Yet another useless SEO bot. No longer provides
  • robots.txt information.
  • BOTS THAT WILL RESPECT THIS FILE
  • -------------------------------------------------------------------
  • Bot: AhrefsBot (https://ahrefs.com/robot)
  • Description: Yet another useless SEO bot.
  • Bot: AlphaBot/3.2; (http://alphaseobot.com/bot.html)
  • Description: Yet another useless SEO bot.
  • Bot: BubiNG (http://law.di.unimi.it/BUbiNG.html)
  • Description: vague, open source bot.
  • Bot: Cliqzbot (https://cliqz.com/en/cliqzbot)
  • Description: Proprietary bot by small company that lets you
  • perform searches in your www browser (as opposed to using the
  • Yellow Pages?).
  • Bot: Dotbot (http://www.opensiteexplorer.org/dotbot)
  • Description: Useless SEO bot that can hit websites hard.
  • Bot: Linguee
  • Description: Bot used for a proprietary translation app. It's
  • not clear why the bot lives and it has been caught hitting
  • websites hard.
  • Bot: linkdexbot (http://www.linkdex.com/bots/)
  • Description: Yet another useless SEO bot.
  • Bot: MJ12bot (http://mj12bot.com)
  • Description: Bot that wants to understand and paint a map of the
  • internet.
  • Bot: SemrushBot (https://www.semrush.com/bot/)
  • Description: Yet another useless SEO bot.
  • Bot: SeznamBot (http://napoveda.seznam.cz/en/seznambot-intro/)
  • Description: Czech search engine. Nothing wrong with that but the
  • bot can trigger a huge number of hits.
  • Bot: TurnitinBot (https://turnitin.com/robot/crawlerinfo.html)
  • Description: proprietary, commercial anti-plagiarism bot.
  • Bot: Screaming Frog SEO Spider (https://www.screamingfrog.co.uk/seo-spider/faq/)
  • Description: Yet another SEO bot.
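
The comments above suggest blocking the bots that appear to ignore robots.txt with a ModSec rule. The sketch below is not a ModSec rule; it only illustrates the same idea (rejecting requests by User-Agent on the server side) in Python. The `should_block` helper and the `IGNORING_BOTS` tuple are hypothetical names introduced for this example, and the bot names are taken from the "may ignore this file" list.

```python
import re

# Names from the "NAUGHTY BOTS THAT MAY IGNORE THIS FILE" list above.
IGNORING_BOTS = ("barkrowler", "domaincrawler", "spbot")

# One case-insensitive pattern that matches any of the names as a substring.
_PATTERN = re.compile("|".join(map(re.escape, IGNORING_BOTS)), re.IGNORECASE)


def should_block(user_agent_header: str) -> bool:
    """Return True if the User-Agent header mentions one of IGNORING_BOTS."""
    return bool(_PATTERN.search(user_agent_header or ""))


if __name__ == "__main__":
    print(should_block("Mozilla/5.0 (compatible; Barkrowler/0.9)"))
    print(should_block("Mozilla/5.0 (X11; Linux x86_64)"))
```

Substring matching is deliberately loose here, so it can also catch unrelated agents whose names happen to contain one of these strings; a stricter pattern may be preferable in practice.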