Robots Exclusion Standard data for swtch.co.uk

Resource Scan

Scan Details

Site Domain swtch.co.uk
Base Domain swtch.co.uk
Scan Status Ok
Last Scan 2024-09-17T05:58:23+00:00
Next Scan 2024-10-17T05:58:23+00:00

Last Scan

Scanned 2024-09-17T05:58:23+00:00
URL https://swtch.co.uk/robots.txt
Redirect https://www.swtch.co.uk/robots.txt
Redirect Domain www.swtch.co.uk
Redirect Base swtch.co.uk
Domain IPs 51.89.222.68
Redirect IPs 104.16.150.108, 104.16.151.108, 2606:4700::6810:966c, 2606:4700::6810:976c
Response IP 104.16.150.108
Found Yes
Hash dbd3c4ad36a3d145307f716479d10a8a4d101912d00d001b4c99e9373c00297a
SimHash f2305149eef5

Groups

*

Rule Path
Disallow /wp-admin/
Disallow /*add-to-cart%3D*
Disallow /cart/
Disallow /checkout/
Disallow /my-account/
Allow /wp-admin/admin-ajax.php
Allow /sitemap.xml
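
The rules of the `*` group interact: `/wp-admin/` is disallowed, yet `/wp-admin/admin-ajax.php` is allowed. The following is a minimal sketch of how a crawler following RFC 9309 might resolve that, assuming the longest matching pattern wins and an Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and the comparison is done on the literal path text as shown in this report (no percent-decoding), which is a simplification.

```python
import re

# Rules of the "*" group, copied from the table above.
RULES = [
    ("disallow", "/wp-admin/"),
    ("disallow", "/*add-to-cart%3D*"),
    ("disallow", "/cart/"),
    ("disallow", "/checkout/"),
    ("disallow", "/my-account/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("allow", "/sitemap.xml"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    Assumption: longest matching pattern wins; on equal length, allow wins.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for p in ["/wp-admin/", "/wp-admin/admin-ajax.php", "/cart/", "/products/lamp"]:
        print(p, is_allowed(p))
```

Under these assumptions the longer Allow pattern overrides the shorter Disallow for `admin-ajax.php`, while the rest of `/wp-admin/` stays blocked. Parsers that apply rules in file order instead may reach a different verdict for that path.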

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

fast

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

mail.ru_bot

Rule Path Comment
Disallow / blocks access to the entire site

wesee

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

maxpointcrawler

Rule Path
Disallow /

curious george

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

trovitbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

vegi bot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.com

Rule Path
Disallow /

voltron

Rule Path
Disallow /

proximic

Rule Path
Disallow /
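
Each named group above blocks one crawler from the whole site, while every other agent falls back to the `*` group. As a rough illustration, the sketch below feeds a small excerpt to Python's standard `urllib.robotparser`. The excerpt is reconstructed from this report and is not the site's actual file; `ROBOTS_EXCERPT` and `may_fetch` are hypothetical names. The standard-library parser does not interpret `*` wildcards in paths, so only plain prefix rules are included.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (assumption): three of the groups listed in the report.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /cart/
Disallow: /checkout/

User-agent: wget
Disallow: /

User-agent: ahrefsbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def may_fetch(agent, path):
    """Ask the parser whether `agent` may fetch `path` on the redirect host."""
    return parser.can_fetch(agent, "https://www.swtch.co.uk" + path)


if __name__ == "__main__":
    print(may_fetch("Wget/1.21.4", "/"))        # named group applies
    print(may_fetch("ExampleBrowser", "/"))     # falls back to "*"
    print(may_fetch("ExampleBrowser", "/cart/"))
```

An agent with its own group is governed by that group alone, so `wget` does not also inherit the `*` rules; it is simply denied everything.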

Comments

  • Copied from wikipedia robots.txt
  • Crawlers that are kind enough to obey, but which we'd rather not have unless they're feeding search engines.
  • Some bots are known to be trouble, particularly those designed to copy entire sites. Please obey robots.txt.
  • Misbehaving: requests much too fast:
  • Sorry, wget in its recursive mode is a frequent problem. Please read the man page and use it properly; there is a --wait option you can use to set the delay between hits, for instance.
  • The 'grub' distributed client has been *very* poorly behaved.
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/
  • https://www.grapeshot.com/crawler/
  • Local advertising crawler
  • SEO - http://www.analyticsseo.com/the-analytics-seo-crawler-curious-george/
  • https://www.brandwatch.com/magpie-crawler/
  • http://www.trovit.com/bot.html
  • https://megaindex.com/crawler

Warnings

  • 4 invalid lines.