techdonut.co.uk
robots.txt

Robots Exclusion Standard data for techdonut.co.uk

Resource Scan

Scan Details

Site Domain	techdonut.co.uk
Base Domain	techdonut.co.uk
Scan Status	Ok
Last Scan	2025-09-27T23:10:00+00:00
Next Scan	2025-10-04T23:10:00+00:00

Last Scan

Scanned	2025-09-27T23:10:00+00:00
URL	https://techdonut.co.uk/robots.txt
Redirect	https://www.techdonut.co.uk/robots.txt
Redirect Domain	www.techdonut.co.uk
Redirect Base	techdonut.co.uk
Domain IPs	172.66.40.144, 172.66.43.112, 2606:4700:3108::ac42:2890, 2606:4700:3108::ac42:2b70
Redirect IPs	172.66.40.144, 172.66.43.112, 2606:4700:3108::ac42:2890, 2606:4700:3108::ac42:2b70
Response IP	172.66.43.112
Found	Yes
Hash	04e4fe22ffb227f104253b7c061166b73cb4ba9e7b37310af3cb86979e8b50a8
SimHash	3896bd41c560

Groups

*

Rule	Path
Allow	/core/*.css$
Allow	/core/*.css?
Allow	/core/*.js$
Allow	/core/*.js?
Allow	/core/*.gif
Allow	/core/*.jpg
Allow	/core/*.jpeg
Allow	/core/*.png
Allow	/core/*.svg
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/profiles/*.svg
Allow	/sitemap.xml
Disallow	/core/
Disallow	/profiles/
Disallow	/README.txt
Disallow	/web.config
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips/
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register/
Disallow	/user/password/
Disallow	/user/login/
Disallow	/user/logout/
Disallow	/index.php/admin/
Disallow	/index.php/comment/reply/
Disallow	/index.php/filter/tips/
Disallow	/index.php/node/add/
Disallow	/index.php/search/
Disallow	/index.php/user/password/
Disallow	/index.php/user/register/
Disallow	/index.php/user/login/
Disallow	/index.php/user/logout/
Disallow	/37353961/
Disallow	/taxonomy/term/
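
The `*` group above mixes broad Disallow rules (for example `/core/`) with narrower Allow rules that use the `*` wildcard and the `$` end anchor. As a minimal sketch of how such rules can be evaluated, the following assumes the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The `RULES` list is only a subset of the table, and `pattern_to_regex` and `is_allowed` are hypothetical helper names, not part of any scanner or library.

```python
import re

# Subset of the "*" group above: (directive, path pattern).
RULES = [
    ("allow", "/core/*.js$"),
    ("allow", "/core/*.js?"),
    ("allow", "/core/*.css?"),
    ("disallow", "/core/"),
    ("disallow", "/admin/"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters, a trailing "$" anchors the end of
    the URL path, and everything else (including "?") is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules=RULES):
    """Return True if path may be crawled under the given rules."""
    best = None  # (pattern length, is_allow) of the strongest match so far
    for directive, pattern in rules:
        # re.match anchors at the start, so patterns act as prefixes.
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            # Longer pattern wins; on equal length, allow (True) wins.
            if best is None or candidate > best:
                best = candidate
    # A path that matches no rule is crawlable by default.
    return True if best is None else best[1]
```

Under these assumptions, `is_allowed("/core/misc/drupal.js")` is `True` because `/core/*.js$` is longer than `/core/`, while `is_allowed("/core/install.php")` is `False` because only `/core/` matches. Real crawlers may differ in details such as percent-encoding and case handling.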

chatgpt-user

Rule	Path
Allow	/

google-extended

Rule	Path
Allow	/

claudebot

Rule	Path
Allow	/

perplexitybot

Rule	Path
Allow	/

applebot

Rule	Path
Allow	/

oai-searchbot

Rule	Path
Allow	/

firecrawlagent

Rule	Path
Allow	/

andibot

Rule	Path
Allow	/

exabot

Rule	Path
Allow	/

phindbot

Rule	Path
Allow	/

youbot

Rule	Path
Allow	/

gptbot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

facebookexternalhit

Rule	Path
Disallow	/

sogou

Rule	Path
Disallow	/

yandex

Rule	Path
Disallow	/

googlebot

Rule	Path
Allow	/

bingbot

Rule	Path
Allow	/

*

Rule	Path
Allow	/
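
Each group above applies only to crawlers whose user-agent token matches the group name, with `*` as the fallback for everyone else. A rough sketch of that selection step follows. It assumes a simple case-insensitive substring match against the full User-Agent header and prefers the longest matching token. `GROUPS` copies only a few of the groups listed above, and `select_group` is a hypothetical helper. Real parsers may match the product token more strictly.

```python
# A few user-agent tokens and their blanket policy, copied from the groups above.
GROUPS = {
    "chatgpt-user": "allow",
    "claudebot": "allow",
    "gptbot": "disallow",
    "ccbot": "disallow",
    "googlebot": "allow",
}

def select_group(user_agent, groups=GROUPS):
    """Return the group token that applies to user_agent, or "*" if none does."""
    ua = user_agent.lower()
    # Collect every group token that appears in the header (assumed matching rule).
    matches = [token for token in groups if token in ua]
    # Prefer the most specific (longest) token; fall back to the "*" group.
    return max(matches, key=len) if matches else "*"
```

For example, a header containing `GPTBot/1.2` selects the `gptbot` group (Disallow `/`), while an unlisted crawler falls through to `*`.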

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
DFP Codes
Others
========================================
Allow trusted AI user-agents
========================================
========================================
Block known AI training crawlers & scrapers
========================================
========================================
Allow traditional search indexing
========================================
========================================
Default: Allow all other user agents
========================================
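
The comments above note that robots.txt is only honored at the root of the host. A small sketch of deriving that root location from any page URL, using Python's standard `urllib.parse`, is shown here. The function name `robots_url` is an assumption made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://example.com/site/page.html"))
# prints http://example.com/robots.txt
```

A file at `http://example.com/site/robots.txt` is never consulted under this scheme, which matches the "Ignored" example in the comments.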
