europe.study
robots.txt

Robots Exclusion Standard data for europe.study

Resource Scan

Scan Details

Site Domain	europe.study
Base Domain	europe.study
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2025-06-10T11:49:05+00:00
Next Scan	2025-08-09T11:49:05+00:00

Last Successful Scan

Scanned	2025-04-12T11:43:51+00:00
URL	https://europe.study/robots.txt
Redirect	https://www.europe.study/robots.txt
Redirect Domain	www.europe.study
Redirect Base	europe.study
Domain IPs	104.26.4.28, 104.26.5.28, 172.67.69.58, 2606:4700:20::681a:41c, 2606:4700:20::681a:51c, 2606:4700:20::ac43:453a
Redirect IPs	104.26.4.28, 104.26.5.28, 172.67.69.58, 2606:4700:20::681a:41c, 2606:4700:20::681a:51c, 2606:4700:20::ac43:453a
Response IP	104.26.4.28
Found	Yes
Hash	ac73756b66e418b82b051fda5f437ab55a3d498ee8d2ad48638eb012ca7517fe
SimHash	711f0958c9c0

Groups

*

Rule	Path
Disallow	/administrator/
Disallow	/api/
Disallow	/bin/
Disallow	/cache/
Disallow	/cli/
Disallow	/components/
Disallow	/includes/
Disallow	/installation/
Disallow	/language/
Disallow	/layouts/
Disallow	/libraries/
Disallow	/logs/
Disallow	/modules/
Disallow	/plugins/
Disallow	/tmp/
Disallow	/cdn-cgi/

anthropic-ai

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

awariorssbot

Rule	Path
Disallow	/

awariosmartbot

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

diffbot

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

friendlycrawler

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

google-cloudvertexbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

imagesiftbot

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

meta-externalagent
meta-externalagent

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

news-please

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

oai-searchbot

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

peer39_crawler
peer39_crawler/1.0

Rule	Path
Disallow	/

quora-bot

Rule	Path
Disallow	/

scrapy

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

youbot

Rule	Path
Disallow	/
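
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; the excerpt is reconstructed from a few of the rules listed in this scan (it is not the full file), and the agent name `ExampleCrawler` and the path `/courses/` are hypothetical values chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Excerpt rebuilt from the groups listed above; NOT the complete robots.txt.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /administrator/
Disallow: /api/
Disallow: /cdn-cgi/

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the excerpt's rules."""
    return parser.can_fetch(agent, "https://www.europe.study" + path)


# "ExampleCrawler" and "/courses/" are hypothetical illustration values.
print(allowed("ExampleCrawler", "/courses/"))
print(allowed("ExampleCrawler", "/administrator/index.php"))
print(allowed("GPTBot", "/courses/"))
```

An agent with no group of its own falls back to the `*` group, so only the listed folders are blocked for it, while a named group such as `gptbot` with `Disallow: /` blocks every path for that agent.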

Other Records

Field	Value
sitemap	https://www.europe.study/sitemap.xml

Comments

If the Joomla site is installed within a folder, e.g. www.example.com/joomla/, then the robots.txt file MUST be moved to the site root, e.g. www.example.com/robots.txt, AND the joomla folder name MUST be prefixed to all of the paths.
For example, the Disallow rule for the /administrator/ folder MUST be changed to read:
Disallow: /joomla/administrator/
For more information about the robots.txt standard, see:
https://www.robotstxt.org/orig.html
/cdn-cgi is added by Cloudflare, so it should be in the Disallow list
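
The folder-prefix instruction in the comments can be sketched as a small text transformation. This is an illustrative sketch only: the helper name `prefix_rules` is hypothetical and is not part of Joomla or of any robots.txt tooling. It rewrites only `Allow` and `Disallow` lines whose path starts with `/` and leaves every other line untouched.

```python
def prefix_rules(robots_txt: str, folder: str) -> str:
    """Prefix `folder` to every Allow/Disallow path in a robots.txt body.

    Hypothetical helper for illustration; other lines pass through unchanged.
    """
    prefix = "/" + folder.strip("/")
    out = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        path = value.strip()
        is_rule = field.strip().lower() in ("allow", "disallow")
        if sep and is_rule and path.startswith("/"):
            out.append(f"{field.strip()}: {prefix}{path}")
        else:
            out.append(line)
    return "\n".join(out)


original = "User-agent: *\nDisallow: /administrator/\nDisallow: /tmp/"
print(prefix_rules(original, "joomla"))
```

The rewritten file would still need to be placed at the site root, as the comments require; the sketch only handles the path prefixing.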
