checkwebsitetools.com
robots.txt

Robots Exclusion Standard data for checkwebsitetools.com

Resource Scan

Scan Details

Site Domain checkwebsitetools.com
Base Domain checkwebsitetools.com
Scan Status Ok
Last Scan 2024-11-16T17:31:41+00:00
Next Scan 2024-11-23T17:31:41+00:00
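
The two timestamps above are exactly seven days apart, which suggests a weekly rescan. The report does not state the interval, so the value below is an inference from the data. A minimal sketch of that arithmetic, where `RESCAN_INTERVAL` is a hypothetical name:

```python
from datetime import datetime, timedelta

# Assumption: the weekly interval is inferred from the report's timestamps,
# it is not documented by the scanning service.
RESCAN_INTERVAL = timedelta(days=7)

last_scan = datetime.fromisoformat("2024-11-16T17:31:41+00:00")
next_scan = last_scan + RESCAN_INTERVAL

print(next_scan.isoformat())
```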

Last Scan

Scanned 2024-11-16T17:31:41+00:00
URL https://checkwebsitetools.com/robots.txt
Domain IPs 162.241.252.221
Response IP 162.241.252.221
Found Yes
Hash 9b0856128f5994ab3e3ea44b8fd07e825d3fee63a62b9019251358cd4a22c1b2
SimHash 7a5c496b85a0
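
The Hash value is 64 hexadecimal digits long, which matches the length of a SHA-256 digest. The report does not name the algorithm, so treating it as SHA-256 is an assumption. Under that assumption, a content fingerprint for a fetched robots.txt body could be sketched as follows (`content_hash` is a hypothetical helper, and the sample bytes are illustrative, not the site's actual file):

```python
import hashlib

def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a response body.

    Assumption: the report's Hash field is SHA-256; the source does not say so.
    """
    return hashlib.sha256(body).hexdigest()

# Illustrative body only; the real file contents are not shown in this report.
sample = b"User-agent: *\nAllow: /\n"
digest = content_hash(sample)
print(len(digest), digest)
```

Comparing the digest from one scan with the digest from the next is a simple way to detect whether the file changed between scans.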

Groups

adbeat_bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

alphaseobot-sa

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

coccocbot-web

Rule Path
Disallow /

deusu

Rule Path
Disallow /

domainstatsbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

extlinksbot

Rule Path
Disallow /

getintent crawler

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

hyscore

Rule Path
Disallow /

httrack

Rule Path
Disallow /

ias_crawler

Rule Path
Disallow /

jamesbot

Rule Path
Disallow /

leikibot

Rule Path
Disallow /

mail.ru

Rule Path
Disallow /

mappy

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mojeekbot

Rule Path
Disallow /

netseer

Rule Path
Disallow /

obot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

safednsbot

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

seeker

Rule Path
Disallow /

seekportbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

surdotlybot

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

veoozbot

Rule Path
Disallow /

virusdie crawler

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

yandex

Rule Path
Disallow /

zoombot

Rule Path
Disallow /

*

Rule Path
Allow /
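
The groups above block each named crawler from the whole site and allow everything else. As a minimal sketch, assuming Python's standard `urllib.robotparser` and a three-group excerpt reconstructed from this listing (the raw file text is not shown in the report, and the page path is hypothetical), the effect can be checked like this:

```python
from urllib.robotparser import RobotFileParser

# Excerpt rebuilt from the Groups listing; the full file has many more groups.
EXCERPT = """\
User-agent: gptbot
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: *
Allow: /
"""

def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(EXCERPT)
url = "https://checkwebsitetools.com/some/page"  # hypothetical path

for agent in ("GPTBot", "AhrefsBot/7.0", "SomeOtherBot"):
    print(agent, parser.can_fetch(agent, url))
```

Python's parser compares user-agent tokens case-insensitively, so the lowercase group names in the listing still match crawlers that identify as `GPTBot` or `AhrefsBot`. Other parsers may match differently.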

Comments

  • hxxps://www.adbeat.com/operation_policy
  • hxxps://ahrefs.com/robot
  • hxxp://alphaseobot.com/bot.html
  • hxxps://archive.org/details/archive.org_bot
  • hxxp://webmeup-crawler.com/
  • hxxp://law.di.unimi.it/BUbiNG.html
  • hxxp://commoncrawl.org/big-picture/frequently-asked-questions/
  • User-agent: Clickagy Intelligence Bot
  • Disallow: /
  • User-agent: Clickagy Intelligence Bot v2
  • Disallow: /
  • hxxps://cliqz.com/en/cliqzbot
  • hxxp://help.coccoc.com/en/search-engine
  • hxxps://deusu.de/robot.html
  • hxxp://domainstats.io/our-bot
  • hxxp://moz.com/products
  • hxxp://www.opensiteexplorer.org/dotbot
  • hxxps://www.exalead.com/search/webmasterguide
  • hxxps://extlinks.com/Bot.html
  • hxxps://getintent.com/bot.html
  • hxxp://www.grapeshot.com/crawler/
  • User-agent: grapeshot
  • Disallow: /
  • hxxps://hyscore.io/crawler/
  • hxxps://integralads.com/site-indexing-policy/
  • hxxps://cognitiveseo.com/bot.html
  • hxxps://help.mail.ru/webmaster/indexing/robots.txt/rules/user-agent
  • hxxp://mappydata.net/
  • hxxp://mj12bot.com/
  • hxxps://www.mojeek.com/bot.html
  • hxxps://www.netseer.com/crawler/
  • hxxp://filterdb.iss.net/crawler/
  • hxxps://webmaster.petalsearch.com/site/petalbot
  • hxxps://www.comscore.com/proximic-spider
  • User-agent: proximic
  • Disallow: /
  • hxxps://www.safedns.com/en/searchbot/
  • hxxp://www.scoutjet.com/
  • hxxp://lookseek.com/seeker/
  • hxxps://bot.seekport.com
  • hxxps://www.seokicks.de/robot.html
  • To block SEMrushBot from crawling your site for web graph of links, add:
  • To remove SEMrushBot from crawling your site for different SEO and technical issues, add:
  • hxxps://napoveda.seznam.cz/en/full-text-search/crawling-control/
  • hxxps://siteexplorer.info/Backlink-Checker-Spider/
  • hxxps://www.similartech.com/smtbot
  • hxxp://openlinkprofiler.org/bot
  • hxxp://sur.ly/bot.html
  • hxxp://www.trendiction.com/en/publisher/bot
  • hxxps://turnitin.com/robot/crawlerinfo.html
  • hxxp://www.veooz.com/veoozbot.html
  • User-agent: WF search
  • Disallow: /
  • hxxp://www.wotbox.com/bot/
  • hxxps://help.naver.com/support/contents/contents.nhn?serviceNo=19634&categoryNo=19668
  • User-agent: Yeti
  • Disallow: /
  • hxxps://suite.seozoom.it/bot.html