europe.study
robots.txt

Robots Exclusion Standard data for europe.study

Resource Scan

Scan Details

Site Domain europe.study
Base Domain europe.study
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-06-10T11:49:05+00:00
Next Scan 2025-08-09T11:49:05+00:00

Last Successful Scan

Scanned 2025-04-12T11:43:51+00:00
URL https://europe.study/robots.txt
Redirect https://www.europe.study/robots.txt
Redirect Domain www.europe.study
Redirect Base europe.study
Domain IPs 104.26.4.28, 104.26.5.28, 172.67.69.58, 2606:4700:20::681a:41c, 2606:4700:20::681a:51c, 2606:4700:20::ac43:453a
Redirect IPs 104.26.4.28, 104.26.5.28, 172.67.69.58, 2606:4700:20::681a:41c, 2606:4700:20::681a:51c, 2606:4700:20::ac43:453a
Response IP 104.26.4.28
Found Yes
Hash ac73756b66e418b82b051fda5f437ab55a3d498ee8d2ad48638eb012ca7517fe
SimHash 711f0958c9c0

Groups

*

Rule Path
Disallow /administrator/
Disallow /api/
Disallow /bin/
Disallow /cache/
Disallow /cli/
Disallow /components/
Disallow /includes/
Disallow /installation/
Disallow /language/
Disallow /layouts/
Disallow /libraries/
Disallow /logs/
Disallow /modules/
Disallow /plugins/
Disallow /tmp/
Disallow /cdn-cgi/

anthropic-ai

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

google-cloudvertexbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

meta-externalagent
meta-externalagent

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

news-please

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

peer39_crawler
peer39_crawler/1.0

Rule Path
Disallow /

quora-bot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

youbot

Rule Path
Disallow /
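
To illustrate how the groups above are evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is an abbreviated reconstruction of a few of the listed groups, not the full file, and `ExampleBot` is a hypothetical agent name chosen because it has no group of its own.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of a few groups from the listing above.
# This is an assumption for illustration, not the complete robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /api/
Disallow: /cdn-cgi/

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string rather than fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://www.europe.study"

# A named agent with "Disallow /" is blocked from the whole site.
print(parser.can_fetch("GPTBot", BASE + "/"))

# "ExampleBot" (hypothetical) has no group, so it falls back to "*".
print(parser.can_fetch("ExampleBot", BASE + "/"))
print(parser.can_fetch("ExampleBot", BASE + "/administrator/"))
```

Agents with their own group follow only that group, so the path rules under `*` apply only to crawlers that are not named elsewhere in the file.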

Other Records

Field Value
sitemap https://www.europe.study/sitemap.xml

Comments

  • If the Joomla site is installed within a folder, e.g. www.example.com/joomla/, then the robots.txt file MUST be moved to the site root, e.g. www.example.com/robots.txt, AND the joomla folder name MUST be prefixed to all of the paths.
  • For example, the Disallow rule for the /administrator/ folder MUST be changed to read: Disallow: /joomla/administrator/
  • For more information about the robots.txt standard, see: https://www.robotstxt.org/orig.html
  • /cdn-cgi is added by Cloudflare, so it should be in the list of Disallow rules
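
The folder-prefix instruction in the comments can be sketched as follows. This is an illustrative helper, not part of Joomla; the function name `prefix_disallow_paths` and the folder name `joomla` are assumptions taken from the example in the comment.

```python
def prefix_disallow_paths(paths, folder):
    """Return Disallow lines with the install folder prefixed to each path.

    `paths` are root-relative paths such as "/administrator/";
    `folder` is the subfolder the site is installed in (hypothetical here).
    """
    prefix = "/" + folder.strip("/")
    return [f"Disallow: {prefix}{path}" for path in paths]


# A few of the default paths from the "*" group.
default_paths = ["/administrator/", "/api/", "/cache/"]

for line in prefix_disallow_paths(default_paths, "joomla"):
    print(line)
```

The generated lines belong in the robots.txt at the site root, since crawlers only request the file from there.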