docs.joomla.org
robots.txt

Robots Exclusion Standard data for docs.joomla.org

Resource Scan

Scan Details

Site Domain	docs.joomla.org
Base Domain	joomla.org
Scan Status	Ok
Last Scan	2025-07-21T04:01:52+00:00
Next Scan	2025-08-04T04:01:52+00:00
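The two timestamps above are exactly 14 days apart. A minimal sketch of that arithmetic follows, assuming the scanner uses a fixed two-week interval (the report does not state this; `SCAN_INTERVAL` and `next_scan` are hypothetical names used only for illustration):

```python
from datetime import datetime, timedelta

# Assumption: a fixed 14-day rescan interval, inferred from the two
# timestamps in the report, not documented by the scanner itself.
SCAN_INTERVAL = timedelta(days=14)

def next_scan(last_scan_iso: str) -> str:
    """Return the ISO 8601 timestamp one interval after the last scan."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + SCAN_INTERVAL).isoformat()

print(next_scan("2025-07-21T04:01:52+00:00"))
```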

Last Scan

Scanned	2025-07-21T04:01:52+00:00
URL	https://docs.joomla.org/robots.txt
Domain IPs	104.26.14.15, 104.26.15.15, 172.67.74.86, 2606:4700:20::681a:e0f, 2606:4700:20::681a:f0f, 2606:4700:20::ac43:4a56
Response IP	172.67.74.86
Found	Yes
Hash	efad072451ab1b85941009d16f1e88cc2bfe6e99fd027221e740ff9679d31a8c
SimHash	76f84b05c7c4
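The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming SHA-256 over the raw response body, a change-detection check could be sketched as follows (`body_digest` and `has_changed` are hypothetical helper names, not part of the scanner):

```python
import hashlib

def body_digest(body: bytes) -> str:
    """Hex digest of a fetched robots.txt body (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()

def has_changed(body: bytes, previous_digest: str) -> bool:
    """True when the freshly fetched body no longer matches the stored digest."""
    return body_digest(body) != previous_digest.lower()

# Illustrative input only; this is not the real docs.joomla.org file.
sample = b"User-agent: *\nAllow: /\n"
stored = body_digest(sample)
print(has_changed(sample, stored))
print(has_changed(sample + b"# edited\n", stored))
```

Whether the scanner hashes the raw bytes or a normalized form of the file is not stated, so a locally computed digest may not reproduce the reported value.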

Groups

ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot
applebot-extended
brightbot 1.0
bytespider
ccbot
chatgpt-user
claude-searchbot
claude-user
claude-web
claudebot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
diffbot
duckassistbot
facebookbot
factset_spyderbot
firecrawlagent
friendlycrawler
google-cloudvertexbot
google-extended
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
isscyberriskcrawler
kangaroo bot
meta-externalagent
meta-externalagent
meta-externalfetcher
meta-externalfetcher
mistralai-user/1.0
novaact
oai-searchbot
omgili
omgilibot
operator
pangubot
perplexity-user
perplexitybot
petalbot
phindbot
qualifiedbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
tiktokspider
timpibot
velenpublicwebcrawler
webzio-extended
wpbot
youbot

Rule	Path
Disallow	/

*

Rule	Path
Allow	/
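To see how these two groups are evaluated, the sketch below feeds an abbreviated reconstruction of the rules to Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is an assumption: it includes only two of the listed user agents and is not the verbatim file served by docs.joomla.org.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated, reconstructed rules: two named agents share a
# "Disallow: /" group, and every other agent falls through to "*".
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: ccbot
Disallow: /

User-agent: *
Allow: /
"""

def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)
url = "https://docs.joomla.org/"

# A listed agent matches the Disallow group; an unlisted one uses "*".
print(parser.can_fetch("GPTBot", url))
print(parser.can_fetch("SomeOtherBot", url))
```

Agent matching in `urllib.robotparser` is case-insensitive, so `GPTBot` matches the lowercase `gptbot` entry. Individual crawlers may apply their own matching rules.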

Comments

Block any non-specified AI crawlers (e.g., new
or unknown bots) from using content for training
AI models, while allowing the website to be
indexed and accessed by bots. These directives
are still experimental and may not be supported
by all AI crawlers.

Warnings

`content-usage` is not a known field.
`disallowaitraining` is not a known field.
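Warnings like these can come from a checker that compares each field name against a list of recognized fields. The sketch below is one way to do that, not the scanner's actual logic: `KNOWN_FIELDS` is an assumed set, and the sample lines use placeholder values because the report does not show the values present in the real file.

```python
# Assumption: the fields treated as "known"; the scanner's real list
# is not given in the report.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}

def unknown_fields(text: str) -> list[str]:
    """Return unrecognized field names, lowercased, in order of first use."""
    seen: list[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen

# Hypothetical sample; field values are placeholders.
SAMPLE = """\
User-agent: *
Allow: /
Content-Usage: placeholder-value
DisallowAITraining: placeholder-value
"""

for name in unknown_fields(SAMPLE):
    print(f"`{name}` is not a known field.")
```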
