elfri.be
robots.txt

Robots Exclusion Standard data for elfri.be

Resource Scan

Scan Details

Site Domain elfri.be
Base Domain elfri.be
Scan Status Ok
Last Scan 2025-03-25T14:19:57+00:00
Next Scan 2025-04-24T14:19:57+00:00

Last Scan

Scanned 2025-03-25T14:19:57+00:00
URL https://elfri.be/robots.txt
Domain IPs 194.165.51.7
Response IP 194.165.51.7
Found Yes
Hash 0e4fa33563f1d9d566f85ad804ce0f36d7b02af9a4c16b147298876d29ec5dd8
SimHash 211f35914354
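
A content hash like the one above is typically used to tell whether the file changed between scans. The sketch below assumes SHA-256, which is only a guess based on the 64-hex-digit length; the report does not name the algorithm, and the function names are hypothetical.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return fingerprint(body) != previous_hash
```

An exact hash only detects that something changed; a similarity hash such as the SimHash value is the usual way to judge how much.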

Groups

*

Rule Path
Disallow /administrator/
Disallow /bin/
Disallow /cache/
Disallow /cli/
Disallow /components/
Disallow /includes/
Disallow /installation/
Disallow /language/
Disallow /layouts/
Disallow /libraries/
Disallow /logs/
Disallow /modules/
Disallow /plugins/
Disallow /tmp/
Disallow /taxonomy/
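
The rules in the `*` group are path prefixes: any URL path that starts with a listed value is off limits to crawlers without a more specific group. As a minimal sketch, Python's standard `urllib.robotparser` can evaluate a few of these rules; the agent name `examplebot` and the helper `is_allowed` are hypothetical, and only a subset of the rules is reproduced.

```python
from urllib.robotparser import RobotFileParser

# A subset of the "*" group from the scan, written as robots.txt lines.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /administrator/",
    "Disallow: /cache/",
    "Disallow: /tmp/",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def is_allowed(path, agent="examplebot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, path)
```

Under these rules `is_allowed("/administrator/index.php")` is refused, while a path outside the listed folders, such as `/index.php`, is permitted.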

baiduspider

Rule Path
Disallow /

boitho.com-dc

Rule Path
Disallow /

busiverse

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mirago-test-robot (http://www.miragorobot.com)

Rule Path
Disallow /

msnbot

Rule Path
Disallow /*.gif$
Disallow /*.jpeg$
Disallow /
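
The msnbot rules use two pattern characters: `*` stands for any run of characters and a trailing `$` anchors the match at the end of the path. Since the group also contains `Disallow /`, every path is blocked for msnbot regardless of the two image rules. The sketch below shows one way to interpret such patterns by translating them to regular expressions; the function names are hypothetical and this is not the matcher of any particular crawler.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and let each "*" match any sequence of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if the path is covered by the pattern."""
    return rule_to_regex(pattern).match(path) is not None
```

With this reading, `/*.gif$` covers `/images/logo.gif` but not `/images/logo.gif?x=1`, because the `$` requires the path to end in `.gif`.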

psbot

Rule Path
Disallow /

sirketce

Rule Path
Disallow /

seekbot

Rule Path
Disallow /

semanticdiscovery

Rule Path
Disallow /

sogou

Rule Path
Disallow /

soso

Rule Path
Disallow /

sosoimagespider

Rule Path
Disallow /

tineye

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

voilabot

Rule Path
Disallow /

webalta

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

zermelo

Rule Path
Disallow /
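
A crawler reads only one group: the one naming its own product token, or the `*` group when no named group matches. The sketch below illustrates that selection with a handful of the groups above. It is a simplification (exact token match, plain prefix rules), and `GROUPS`, `select_group`, and `is_blocked` are hypothetical names.

```python
# A few groups from the scan, mapped to their Disallow paths.
GROUPS = {
    "*": ["/administrator/", "/tmp/"],
    "baiduspider": ["/"],
    "googlebot-image": ["/"],
    "mj12bot": ["/"],
}


def select_group(user_agent, groups=GROUPS):
    """Pick the group for a user agent, falling back to the "*" group."""
    # Keep only the product token, e.g. "MJ12bot/v1.4.8" -> "mj12bot".
    token = user_agent.split("/")[0].strip().lower()
    return token if token in groups else "*"


def is_blocked(user_agent, path, groups=GROUPS):
    """Return True if any Disallow prefix in the selected group covers the path."""
    rules = groups[select_group(user_agent, groups)]
    return any(path.startswith(rule) for rule in rules)
```

A named bot such as `Googlebot-Image/1.0` is blocked from every path, whereas an unlisted agent is only kept out of the folders in the `*` group.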

Comments

  • If the Joomla site is installed within a folder, e.g. www.example.com/joomla/, the robots.txt file MUST be moved to the site root, e.g. www.example.com/robots.txt, AND the joomla folder name MUST be prefixed to each disallowed path. For example, the Disallow rule for the /administrator/ folder MUST be changed to read Disallow: /joomla/administrator/
  • For more information about the robots.txt standard, see: http://www.robotstxt.org/orig.html
  • For syntax checking, see: http://tool.motoricerca.info/robots-checker.phtml
  • Disallow Baidu Bot (Japanese)
  • Disallow Boitho dc Bot (Norway)
  • Disallow Busiverse Bot (Turkey Sirketce/Busiverse)
  • Disallow CazoodleBot - from University of Illinois
  • Disallow Exabot Bot - Exalead
  • Disallow Google Image Bot
  • Disallow heritrix Bot - from Yell.Com
  • Disallow IRLbot - IRL Texas A&M research bot
  • Disallow Jyxobot - Czech Webcrawler for Jyxo
  • Disallow Majestic12.co.uk
  • Disallow Mirago.com
  • Disallow MSN from seeing gifs and jpgs
  • Disallow NimbleCrawler (http://www.webmasterworld.com/forum93/858.htm)
  • Disallow psbot spidering of images and hub
  • Disallow Sirketce Bot (Turkey Sirketce/Busiverse)
  • Disallow Seekbot - http://www.seekport.co.uk/seekbot/
  • Disallow semanticdiscovery - from Southern Utah University (Computer Science Dept.)
  • Disallow Sogou - Chinese Search Engine
  • Disallow SoSo - Chinese Search Engine
  • Disallow SoSoImageSpider - Chinese picture Search Engine
  • Disallow TinEye - Image trawler Search Engine
  • Disallow TurnITin - "This robot collects content from the Internet for the sole purpose of helping educational institutions prevent plagiarism"
  • Disallow Twiceler - Cuill (also Barred IPs on firewall)
  • Disallow Voilabot Bot - France Telecom
  • Disallow WebAlta Bot - Russian
  • Disallow Yahoo Image Bot
  • Disallow YodaoBot - Chinese Search Engine
  • Disallow zermelo - Bot du Jour from Amazon - may need to block IP range
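
The first comment above says that, for a Joomla site installed in a subfolder, each disallowed path needs the folder name prefixed. A minimal sketch of that rewrite follows; `prefix_rules` is a hypothetical helper and `joomla` is the example folder name from the comment.

```python
def prefix_rules(paths, folder):
    """Prefix a folder name to each disallowed path and format it as a rule line."""
    folder = "/" + folder.strip("/")
    return [f"Disallow: {folder}{path}" for path in paths]


# Example: the /administrator/ rule becomes "Disallow: /joomla/administrator/".
rules = prefix_rules(["/administrator/", "/tmp/"], "joomla")
```

The resulting lines belong in the robots.txt at the site root, not inside the Joomla folder.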

Warnings

  • `useragent` is not a known field.
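
A warning like this arises when a line's field name is not one the parser recognizes; `useragent` is most likely a misspelling of `User-agent`. The sketch below shows how such a check could work. The `KNOWN_FIELDS` set is an assumption, not the scanner's actual list, and `unknown_fields` is a hypothetical name.

```python
# Field names treated as valid in this sketch (an assumed list).
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_fields(text):
    """Return unrecognized field names in robots.txt text, in order of appearance."""
    found = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace before inspecting the line.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found
```

A line with an unknown field is usually ignored by crawlers, so any rules that were meant to follow it may end up attached to the preceding group instead.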