all4home.com.gr
robots.txt

Robots Exclusion Standard data for all4home.com.gr

Resource Scan

Scan Details

Site Domain all4home.com.gr
Base Domain all4home.com.gr
Scan Status Ok
Last Scan 2025-05-09T21:28:26+00:00
Next Scan 2025-06-08T21:28:26+00:00

Last Scan

Scanned 2025-05-09T21:28:26+00:00
URL https://all4home.com.gr/robots.txt
Domain IPs 37.27.130.227
Response IP 37.27.130.227
Found Yes
Hash 24bc7598024dd8c19cb9d831e9462aee436cd3e370c4417d44d74531e3ba49d6
SimHash aa97189acd65

Groups

*

Rule Path
Disallow /*?page=$
Disallow /*%26page%3D$
Disallow /*?sort
Disallow /*%26sort
Disallow /*?order=
Disallow /*%26order%3D
Disallow /*?limit
Disallow /*%26limit
Disallow /*?filter
Disallow /*%26filter
Disallow /*?route=account%2F
Disallow /*?route=affiliate%2F
Disallow /*?route=checkout%2F
Disallow /*?route=product%2Fsearch
Disallow /*?route=checkout%2Fcart
Disallow /admin
Disallow /account
Disallow /cart
Disallow /checkout
Disallow /login
Disallow /wishlist
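
The paths above use two special characters: `*` stands for any run of characters and a trailing `$` anchors the pattern to the end of the URL. The following is a minimal sketch of how such patterns could be evaluated against a URL's path and query string. The function names `pattern_to_regex` and `is_disallowed` are hypothetical, the pattern list is a small subset of the rules above, and the sketch assumes plain string comparison with no percent-encoding normalization and no Allow rules or longest-match precedence.

```python
import re
from typing import Iterable, Pattern


def pattern_to_regex(pattern: str) -> Pattern[str]:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the match at the end.
    Without '$' the pattern is treated as a prefix match.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then rejoin the pieces with '.*'.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + (r"\Z" if anchored else ""))


def is_disallowed(path_and_query: str, patterns: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches the given path."""
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)


# Subset of the Disallow paths listed in the report (assumption: literal match).
SAMPLE_PATTERNS = ["/*?page=$", "/*?sort", "/admin"]

if __name__ == "__main__":
    for url_path in ["/category?page=", "/category?page=2",
                     "/category?sort=price", "/products/chair"]:
        print(url_path, is_disallowed(url_path, SAMPLE_PATTERNS))
```

Under this reading, `/*?page=$` matches only URLs that end exactly in `?page=`, so a URL such as `/category?page=2` is not covered by that rule.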

ahrefsbot
alexibot
surveybot
rogerbot
exabot
baiduspider
becomebot
cherrypicker
cherry picker
control
crescent internet toolpak
disco pump
doc
download ninja
emailsiphon
fetch
grub
grub-client
htdig
httrack
internetseer.com
k2spider
larbin
libwww
linko
lnspiderguy
mail.ru
mfc foundation class library
microsoft
microsoft.url.control
mj12bot
mozilla/4.0 (compatible; msie 4.0; windows nt)
mozilla/4.0 (compatible; msie 4.0; windows 95)
mozilla/4.0 (compatible; msie 4.0; windows 98)
msiecrawler
netprospector
npbot
offline explorer
oodlebot/1.0
perman surfer
psbot
rpt-httpclient
semrushbot
shopwiki
sitecheck.internetseer.com
sitesnagger
sogou web spider
szukacz
teleport
teleportpro
telesoft
ultraseek
ubicrawler
vspider
webcopier
webemailextractor
webreaper
webster pro
webstripper
webzip
wget
xenu
xenu’s
xenu’s link sleuth 1.1c
xovibot
yacybot
yodaobot
zao
zealbot
zyborg
sosospider
ruby
sentibot

Rule Path
Disallow /
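
The two groups above apply to different crawlers: the named user agents get `Disallow /`, while everything else falls back to the `*` group. Below is a rough sketch of that selection step, assuming a case-insensitive exact match on the crawler's product token. The names `GROUPS` and `select_rules` are hypothetical, and only a handful of the user agents and paths from the report are included.

```python
from typing import List, Set, Tuple

# (user-agent tokens, Disallow paths); a small subset of the report's data.
Group = Tuple[Set[str], List[str]]

GROUPS: List[Group] = [
    ({"*"}, ["/admin", "/account", "/cart", "/checkout", "/login", "/wishlist"]),
    ({"ahrefsbot", "semrushbot", "mj12bot", "wget", "httrack"}, ["/"]),
]


def select_rules(product_token: str, groups: List[Group]) -> List[str]:
    """Pick the Disallow list that applies to a crawler.

    A group naming the crawler explicitly wins; otherwise the
    wildcard group '*' is used; otherwise nothing is disallowed.
    """
    token = product_token.lower()
    for agents, rules in groups:
        if token in agents:
            return rules
    for agents, rules in groups:
        if "*" in agents:
            return rules
    return []


if __name__ == "__main__":
    print(select_rules("AhrefsBot", GROUPS))   # named group
    print(select_rules("Googlebot", GROUPS))   # falls back to '*'
```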

Other Records

Field Value
sitemap https://all4home.com.gr/index.php?route=extension/feed/google_sitemap

Comments

  • Multi - All4home robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • https://www.robotstxt.org/robotstxt.html
  • Blocking bad link checker robots
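
The comments above note that robots.txt is honored only at the root of the host. As an illustration, the sketch below derives the expected robots.txt location from any page URL using Python's standard `urllib.parse`; the helper name `robots_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host of page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://example.com/site/page.html"))
    print(robots_url("https://all4home.com.gr/index.php?route=product/search"))
```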

Warnings

  • 1 invalid line.