lms.org.uk
robots.txt

Robots Exclusion Standard data for lms.org.uk

Resource Scan

Scan Details

Site Domain lms.org.uk
Base Domain lms.org.uk
Scan Status Ok
Last Scan 2025-08-06T15:16:50+00:00
Next Scan 2025-09-05T15:16:50+00:00

Last Scan

Scanned 2025-08-06T15:16:50+00:00
URL https://lms.org.uk/robots.txt
Domain IPs 104.26.8.162, 104.26.9.162, 172.67.70.53, 2606:4700:20::681a:8a2, 2606:4700:20::681a:9a2, 2606:4700:20::ac43:4635
Response IP 104.26.9.162
Found Yes
Hash 0a76c8905569d66894b6c3b2c9624fa24da824c8a0d6c70095ccec3c2a85f2b7
SimHash 309c5b01c2d8

Groups

*

Rule Path
Disallow /chairmansblog3/wp-login.php*
Disallow /chairmansblog3/*.php*
Disallow /sites/default/files/*
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /mass-listings/*
Disallow /mass-listings-notanymore
Disallow /mass-listings-notanymore*
Disallow /mass-listings-notanymore/*
Disallow /sites/default/files/resource_documents/Trad_Cust_canonical_notes.pdf
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /tradcust/canonical_notes
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F
Disallow /?q=mass-listings*
Disallow /mass-listings?*
Disallow /now/bugoff
Disallow /photo-files/*
Disallow /find-a-mass/*
Disallow /news-and-events/*
Disallow /resources/shop/*
Disallow /autodiscover/*

Other Records

Field Value
crawl-delay 1760
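The crawl-delay record above asks crawlers to wait 1760 seconds (about 29 minutes) between requests; it is a non-standard extension, but a crawler that chooses to honor it would simply sleep between fetches. A minimal sketch (the example URL is a placeholder, not from the scan):

```python
import time
import urllib.request

CRAWL_DELAY = 1760  # seconds, from the crawl-delay record above


def polite_fetch(urls, delay=CRAWL_DELAY):
    """Fetch each URL in order, sleeping `delay` seconds between requests."""
    pages = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay)  # wait out the crawl delay before the next request
        with urllib.request.urlopen(url) as resp:
            pages.append(resp.read())
    return pages
```

In practice a crawler would read the delay from the parsed robots.txt rather than hard-coding it, and apply it per host.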

baiduspider

Rule Path
Disallow /

curl

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

obot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow

curl

Rule Path
Disallow

mj12bot

Rule Path
Disallow

ahrefsbot

Rule Path
Disallow

obot

Rule Path
Disallow

semrushbot

Rule Path
Disallow

yandexbot

Rule Path
Disallow

the knowledge ai

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /private/
Disallow /tmp/
Disallow /feed
Disallow /rss
Disallow /feeds/
Disallow /photo-files/
Disallow /cache/
Disallow /taxonomy/term/*/all/*
Disallow /church/*?page=*
Allow /product/daily-missal-1962
Allow /product/ordinary-prayers-traditional-latin-mass
Allow /catalog/missals
Allow /mp3-chant-downloads
Allow /about

Other Records

Field Value
crawl-delay 1760

ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot
applebot-extended
bedrockbot
brightbot 1.0
bytespider
ccbot
chatgpt-user
claude-searchbot
claude-user
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
diffbot
duckassistbot
echoboxbot
facebookbot
facebookexternalhit
factset_spyderbot
firecrawlagent
friendlycrawler
google-cloudvertexbot
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalfetcher
mistralai-user/1.0
mycentralaiscraperbot
novaact
oai-searchbot
omgili
omgilibot
operator
pangubot
panscient
panscient.com
perplexity-user
perplexitybot
petalbot
phindbot
poseidon research crawler
qualifiedbot
quillbot
quillbot.com
sbintuitionsbot
scrapy
semrushbot
semrushbot-ba
semrushbot-ct
semrushbot-ocob
semrushbot-si
semrushbot-swa
sidetrade indexer bot
tiktokspider
timpibot
velenpublicwebcrawler
webzio-extended
wpbot
yandexadditional
yandexadditionalbot
youbot

Rule Path
Disallow /

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • Directories
  • Disallow: /mass-listings/
  • Disallow: /mass-listings
  • Disallow: /mass-listings*
  • Files
  • added by gjcopp 21-07-21 remove when document is published
  • Paths (clean URLs)
  • added by gjcopp 21-07-21 remove when document is published
  • Paths (no clean URLs)
  • spiderslap
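The rules recorded above can be checked programmatically with Python's stdlib parser, `urllib.robotparser`. A minimal sketch, using a few rules copied from the scan (the `*` group's `/admin/` disallow and crawl-delay, and the blanket disallow for the `curl` agent):

```python
from urllib.robotparser import RobotFileParser

# A small excerpt of the robots.txt rules shown in the scan above.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /search/
Crawl-delay: 1760

User-agent: curl
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://lms.org.uk/admin/"))    # False: disallowed path
print(rp.can_fetch("*", "https://lms.org.uk/about"))     # True: no rule matches
print(rp.can_fetch("curl", "https://lms.org.uk/about"))  # False: curl is blocked outright
print(rp.crawl_delay("*"))                               # 1760
```

Note that `urllib.robotparser` implements the original robots exclusion convention; it does not expand `*` wildcards inside rule paths (such as `/mass-listings/*` above), which some crawlers support as an extension.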