dsa.gr
robots.txt

Robots Exclusion Standard data for dsa.gr

Resource Scan

Scan Details

Site Domain dsa.gr
Base Domain dsa.gr
Scan Status Ok
Last Scan 2024-06-04T10:46:51+00:00
Next Scan 2024-07-04T10:46:51+00:00

Last Scan

Scanned 2024-06-04T10:46:51+00:00
URL https://dsa.gr/robots.txt
Redirect https://www.dsa.gr/robots.txt
Redirect Domain www.dsa.gr
Redirect Base dsa.gr
Domain IPs 104.21.47.216, 172.67.172.207, 2606:4700:3031::6815:2fd8, 2606:4700:3031::ac43:accf
Redirect IPs 104.21.47.216, 172.67.172.207, 2606:4700:3031::6815:2fd8, 2606:4700:3031::ac43:accf
Response IP 104.21.47.216
Found Yes
Hash 334c8b2be6179f949b80c25ee6611a93469b584daf7e81bbaec1aa483252e517
SimHash 32167c08c570
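The Hash value is 64 hexadecimal characters, which is the length of a SHA-256 digest. Assuming the scanner hashes the raw response body (the report does not document its hash function, so this is an assumption), the fingerprint could be reproduced with Python's standard library:

```python
import hashlib

def robots_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the scan's 64-hex-char Hash field is SHA-256 of the
    fetched file; the report itself does not name the algorithm.
    """
    return hashlib.sha256(body).hexdigest()

# Any change to the file produces a different digest, which is how
# a scanner can detect that robots.txt changed between scans.
digest = robots_fingerprint(b"User-agent: *\nDisallow: /admin/\n")
```

To compare against the recorded hash, one would fetch https://www.dsa.gr/robots.txt and hash the exact bytes of the response body.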

Groups

*

Rule Path
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /manage-dsa/
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /logout/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/user-activation/
Disallow /user/user-activation-resend-email/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=logout%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
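The `?q=` rules are the percent-encoded counterparts of the clean-URL rules above, matching Drupal sites running without clean URLs; `%2F` is the percent-encoding of `/`. A quick check with Python's standard library:

```python
from urllib.parse import unquote

# Two of the encoded rules from the group above; %2F decodes to "/",
# so each ?q= rule mirrors a clean-URL rule earlier in the group.
encoded_rules = ["/?q=admin%2F", "/?q=user%2Flogin%2F"]
decoded = [unquote(rule) for rule in encoded_rules]
```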

Other Records

Field Value
crawl-delay 10

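Taken together with the per-bot groups below, these rules can be evaluated with Python's stdlib `urllib.robotparser`. A sketch against a reduced copy of the rules shown in this report (not the live file, which sits at https://www.dsa.gr/robots.txt):

```python
from urllib.robotparser import RobotFileParser

# Reduced copy of the dsa.gr rules: the wildcard group with its
# crawl-delay, plus one of the fully blocked bots.
RULES = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /user/login/

User-agent: ahrefsbot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(RULES.splitlines())

admin_ok = rp.can_fetch("*", "https://www.dsa.gr/admin/")    # blocked for everyone
about_ok = rp.can_fetch("*", "https://www.dsa.gr/about")     # not matched, allowed
ahrefs_ok = rp.can_fetch("ahrefsbot", "https://www.dsa.gr/") # bot-specific full block
delay = rp.crawl_delay("*")                                  # from the wildcard group
```

`crawl_delay` is advisory: the parser only reports the value, and honoring the 10-second pause between requests is left to the crawler.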
businessseek

Rule Path
Disallow /

blekkobot

Rule Path
Disallow /

safednsbot

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

bubing

Rule Path
Disallow /

daumoa

Rule Path
Disallow /

mojeekbot

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

seebot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

wegtam crawler

Rule Path
Disallow /

seoengworldbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

nutch-1.5

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

obot

Rule Path
Disallow /

genieo

Rule Path
Disallow /

facebot/1.0

Rule Path
Disallow /

thunderstone

Rule Path
Disallow /

nutch-1.7

Rule Path
Disallow /

grapeshotcrawler

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

yeti

Rule Path
Disallow /

wada.vn

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

w3c-checklink

Rule Path
Disallow (empty value: no paths blocked)

super-goo

Rule Path
Disallow /

ncbot

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

seostats

Rule Path
Disallow /

acoon.de

Rule Path
Disallow /

wscheck.com

Rule Path
Disallow /

checks.panopta.com

Rule Path
Disallow /

pagesinventory

Rule Path
Disallow /

aboundex

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

healthbot

Rule Path
Disallow /

findestars

Rule Path
Disallow /

myonid

Rule Path
Disallow /

peekyou

Rule Path
Disallow /

aihit

Rule Path
Disallow /

pipl

Rule Path
Disallow /

rapleaf

Rule Path
Disallow /

snitch

Rule Path
Disallow /

spock

Rule Path
Disallow /

tweepz

Rule Path
Disallow /

wink

Rule Path
Disallow /

yasni

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yoname

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

yourtraces

Rule Path
Disallow /

zoominfo

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

huaweisymantecspider

Rule Path
Disallow /

pagepeeker

Rule Path
Disallow /

pagespeed/1.1 fetcher

Rule Path
Disallow /

psbot

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

betabot

Rule Path
Disallow /

hn.kd.ny.adsl

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
  • blocking people search engines
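The Used/Ignored example in the comments can be made concrete: a crawler derives the robots.txt location from a page's scheme and host only, discarding the path, which is why a copy at /site/robots.txt is ignored. A small sketch (the helper name is illustrative, not from the report):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the only location where a crawler looks for robots.txt:
    the root of the page's scheme://host, regardless of the path."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# A robots.txt placed anywhere below the root has no effect.
root = robots_url("http://example.com/site/page.html")
```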

Warnings

  • 2 invalid lines.