thehareinn.co.uk
robots.txt

Robots Exclusion Standard data for thehareinn.co.uk

Resource Scan

Scan Details

Site Domain	thehareinn.co.uk
Base Domain	thehareinn.co.uk
Scan Status	Ok
Last Scan	2025-10-17T03:31:32+00:00
Next Scan	2025-11-16T03:31:32+00:00
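
The two timestamps above imply a fixed rescan interval. A minimal check, assuming nothing beyond the values shown in the table, is sketched here:

```python
from datetime import datetime, timedelta

# Timestamps copied from the Scan Details table.
last_scan = datetime.fromisoformat("2025-10-17T03:31:32+00:00")
next_scan = datetime.fromisoformat("2025-11-16T03:31:32+00:00")

# Difference between the scheduled next scan and the last scan.
interval = next_scan - last_scan
print(interval.days)  # 30
```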

Last Scan

Scanned	2025-10-17T03:31:32+00:00
URL	https://thehareinn.co.uk/robots.txt
Domain IPs	104.21.66.37, 172.67.155.218, 2606:4700:3034::6815:4225, 2606:4700:3034::ac43:9bda
Response IP	104.21.66.37
Found	Yes
Hash	aa319beae8b470865639211975490a978cfe7809d01c2b0162deedb9106e3d13
SimHash	44310353c495
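
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal digits are consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response body. The `fingerprint` helper and the sample body are hypothetical and will not reproduce the reported value.

```python
import hashlib

def fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

# Hash value copied from the Last Scan table.
reported = "aa319beae8b470865639211975490a978cfe7809d01c2b0162deedb9106e3d13"

# Hypothetical body, used only to show the shape of the digest.
sample = b"User-agent: *\nAllow: /\n"
digest = fingerprint(sample)

# Both values are 64 hex digits, i.e. 256 bits.
print(len(digest), len(reported))
```

Comparing the digest of a freshly fetched body against the stored value would be one way to detect that the file changed between scans.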

Groups

*

Rule	Path
Allow	/

amazonbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

meta-externalagent

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/cgi-bin/
Disallow	/tmp/
Allow	/amp
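
This report lists two separate `*` groups. RFC 9309 says groups that name the same user agent are to be combined, and that the rule with the longest matching path wins, with allow preferred on a tie. The sketch below applies that to the four `*` rules shown in this report. The `is_allowed` helper is hypothetical, and it handles plain path prefixes only, not the `*` and `$` wildcards.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (kind, path prefix)

# Rules from both `*` groups in this report, combined into one list.
STAR_RULES: List[Rule] = [
    ("allow", "/"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/tmp/"),
    ("allow", "/amp"),
]

def is_allowed(path: str, rules: List[Rule]) -> bool:
    """Longest matching prefix wins; ties go to allow; no match means allowed."""
    best_len = -1
    allowed = True
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and kind == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed

print(is_allowed("/cgi-bin/test", STAR_RULES))  # False
print(is_allowed("/amp/menu", STAR_RULES))      # True
print(is_allowed("/menus", STAR_RULES))         # True
```

Not every parser combines duplicate groups this way, so a given crawler's behaviour may differ from this sketch.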

googlebot-mobile

Rule	Path
Allow	/

googlebot

Rule	Path
Allow	/

googlebot-image

Rule	Path
Allow	/

googlebot-news

Rule	Path
Allow	/

googlebot-video

Rule	Path
Allow	/

bingbot

Rule	Path
Allow	/

slurp

Rule	Path
Allow	/

duckduckbot

Rule	Path
Allow	/

msnbot

Rule	Path
Allow	/

yahoo pipes 1.0

Rule	Path
Allow	/

yahoo! slurp

Rule	Path
Allow	/

baiduspider

Rule	Path
Allow	/

baiduspider-news

Rule	Path
Allow	/

baiduspider-image

Rule	Path
Allow	/

yandexbot

Rule	Path
Allow	/

yandeximages

Rule	Path
Allow	/

yandexnews

Rule	Path
Allow	/

yandexwebmaster

Rule	Path
Allow	/

yandexpagechecker

Rule	Path
Allow	/

zyborg

Rule	Path
Allow	/

exabot

Rule	Path
Allow	/

facebot

Rule	Path
Allow	/

ia_archiver

Rule	Path
Allow	/

archive.org_bot

Rule	Path
Allow	/

architextspider

Rule	Path
Allow	/

feedfetcher-google

Rule	Path
Allow	/

linkedinbot

Rule	Path
Allow	/
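
To see how groups like the ones above resolve for a given crawler, the sketch below feeds a small subset of them to Python's standard `urllib.robotparser`. The `SAMPLE` text is a hand-written reconstruction from this report, not the site's actual file, and the user-agent string `SomeOtherBot` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hand-written subset of the groups listed in this report.
SAMPLE = """\
User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: googlebot
Allow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Named groups take precedence; unlisted agents fall back to the `*` group.
for agent in ("GPTBot", "ClaudeBot", "Googlebot", "SomeOtherBot"):
    print(agent, parser.can_fetch(agent, "https://thehareinn.co.uk/"))
```

Under these assumptions the two AI crawlers are refused and the other two agents are allowed. Note that `urllib.robotparser` keeps only the first `*` group it sees, so the sample uses a single one.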

Other Records

Field	Value
sitemap	https://thehareinn.co.uk/sitemap.xml

Comments

As a condition of accessing this website, you agree to abide by the following
content signals:
(a) If a content-signal = yes, you may collect content for the corresponding
use.
(b) If a content-signal = no, you may not collect content for the
corresponding use.
(c) If the website operator does not include a content signal for a
corresponding use, the website operator neither grants nor restricts
permission via content signal with respect to the corresponding use.
The content signals and their meanings are:
search: building a search index and providing search results (e.g., returning
hyperlinks and short excerpts from your website's contents). Search does not
include providing AI-generated search summaries.
ai-input: inputting content into one or more AI models (e.g., retrieval
augmented generation, grounding, or other real-time taking of content for
generative AI search answers).
ai-train: training or fine-tuning AI models.
ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
BEGIN Cloudflare Managed content
END Cloudflare Managed Content
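
The comments above define three signals (`search`, `ai-input`, `ai-train`) and three outcomes: yes, no, or not stated. The report does not show the site's actual content-signal lines, so the sketch below assumes a comma-separated `name=yes|no` value and uses a hypothetical example string. The `parse_content_signal` and `may_collect` helpers are illustrative names, not part of any library.

```python
from typing import Dict, Optional

# Signal names taken from the comments in this report.
KNOWN_SIGNALS = ("search", "ai-input", "ai-train")

def parse_content_signal(value: str) -> Dict[str, bool]:
    """Parse an assumed 'name=yes, name=no' value into a dict of booleans."""
    signals: Dict[str, bool] = {}
    for item in value.split(","):
        name, sep, flag = item.strip().partition("=")
        name, flag = name.strip().lower(), flag.strip().lower()
        # Ignore unknown names and anything other than yes/no.
        if sep and name in KNOWN_SIGNALS and flag in ("yes", "no"):
            signals[name] = flag == "yes"
    return signals

def may_collect(signals: Dict[str, bool], use: str) -> Optional[bool]:
    """True for yes, False for no, None when the operator states no signal."""
    return signals.get(use)

# Hypothetical value; the site's real signals are not shown in this report.
example = parse_content_signal("search=yes, ai-train=no")
print(may_collect(example, "search"))    # True
print(may_collect(example, "ai-train"))  # False
print(may_collect(example, "ai-input"))  # None
```

Returning `None` for a missing signal mirrors clause (c): the operator neither grants nor restricts permission for that use.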

Warnings

2 invalid lines.
`content-signal` is not a known field.
