doks-innovation.com
robots.txt

Robots Exclusion Standard data for doks-innovation.com

Resource Scan

Scan Details

Site Domain doks-innovation.com
Base Domain doks-innovation.com
Scan Status Ok
Last Scan 2025-12-25T23:22:48+00:00
Next Scan 2026-01-24T23:22:48+00:00

Last Scan

Scanned 2025-12-25T23:22:48+00:00
URL https://doks-innovation.com/robots.txt
Domain IPs 81.209.248.207
Response IP 81.209.248.207
Found Yes
Hash aa2f5ec0ba67e8b52f12b11e39aaa448bb359efa96f3a9a4c77b5e9c85a04d52
SimHash e1dcd9015dc8

Groups

addsearchbot
ai2bot
ai2bot-dolma
aihitbot
amazonbot
andibot
anthropic-ai
applebot-extended
bedrockbot
bigsur.ai
brightbot 1.0
bytespider
chatgpt agent
claudebot
cloudvertexbot
cohere-ai
cohere-training-data-crawler
cotoyogi
crawlspace
datenbank crawler
devin
diffbot
echobot bot
echoboxbot
facebookbot
facebookexternalhit
factset_spyderbot
firecrawlagent
friendlycrawler
google-cloudvertexbot
google-extended
googleagent-mariner
googleother
googleother-image
googleother-video
gptbot
iaskspider/2.0
icc-crawler
imagesiftbot
img2dataset
imgproxy
isscyberriskcrawler
kangaroo bot
linerbot
meta-externalagent
meta-externalfetcher
mycentralaiscraperbot
netestate imprint crawler
novaact
omgili
omgilibot
openai
operator
pangubot
panscient
panscient.com
perplexity-user
perplexitybot
petalbot
phindbot
poseidon research crawler
qualifiedbot
quillbot
quillbot.com
sbintuitionsbot
scrapy
semrushbot-ocob
semrushbot-swa
sidetrade indexer bot
tiktokspider
timpibot
velenpublicwebcrawler
webzio-extended
wpbot
yak
yandexadditional
yandexadditionalbot
youbot

Rule Path
Disallow /

ahrefsbot
applebot
bingbot
bravebot
ccbot
chatgpt-user
claude-searchbot
claude-user
claude-web
duckassistbot
duckduckbot
ecosia
gemini-deep-research
googlebot
ia_archiver
kagibot
mistralai-user
mistralai-user/1.0
mj12bot
oai-searchbot
qwantbot
qwantify
rytebot
semrushbot
sistrix
startpagebot

Rule Path
Disallow /*.0
Disallow /*.1
Disallow /*.2
Disallow /*.3
Disallow /*.4
Disallow /*.5
Disallow /*.6
Disallow /*.7
Disallow /*.7z
Disallow /*.8
Disallow /*.9
Disallow /*.app
Disallow /*.application
Disallow /*.backup
Disallow /*.bak
Disallow /*.bin
Disallow /*.bz2
Disallow /*.cfg
Disallow /*.cgi
Disallow /*.conf
Disallow /*.config
Disallow /*.crt
Disallow /*.csr
Disallow /*.css
Disallow /*.csv
Disallow /*.dat
Disallow /*.db
Disallow /*.dev
Disallow /*.disabled
Disallow /*.dist
Disallow /*.doc
Disallow /*.docx
Disallow /*.env
Disallow /*.example
Disallow /*.exe
Disallow /*.feed
Disallow /*.gz
Disallow /*.ics
Disallow /*.ini
Disallow /*.js
Disallow /*.json
Disallow /*.kdbx
Disallow /*.key
Disallow /*.local
Disallow /*.lock
Disallow /*.log
Disallow /*.md
Disallow /*.mjs
Disallow /*.mp4
Disallow /*.new
Disallow /*.numbers
Disallow /*.odp
Disallow /*.ods
Disallow /*.odt
Disallow /*.old
Disallow /*.orig
Disallow /*.original
Disallow /*.pages
Disallow /*.pem
Disallow /*.php7
Disallow /*.pl
Disallow /*.ppt
Disallow /*.pptx
Disallow /*.prod
Disallow /*.production
Disallow /*.properties
Disallow /*.psd
Disallow /*.py
Disallow /*.rar
Disallow /*.rb
Disallow /*.rtf
Disallow /*.save
Disallow /*.sh
Disallow /*.sql
Disallow /*.sqlite
Disallow /*.sqlite3
Disallow /*.staging
Disallow /*.temp
Disallow /*.testing
Disallow /*.tgz
Disallow /*.tmp
Disallow /*.tsv
Disallow /*.txt
Disallow /*.vcf
Disallow /*.woff
Disallow /*.woff2
Disallow /*.xls
Disallow /*.xlsx
Disallow /*.xz
Disallow /*.yaml
Disallow /*.yml
Disallow /*.zip
Disallow /.env
Disallow /.env.local
Disallow /.env.production
Disallow /.bzr/
Disallow /.git/
Disallow /.hg/
Disallow /.pki/
Disallow /.ssh/
Disallow /.svn/
Disallow /3rdparty/
Disallow /admin/
Disallow /administrator/
Disallow /assets/
Disallow /backups/
Disallow /bin/
Disallow /cache/
Disallow /cfg/
Disallow /cgi-bin/
Disallow /classes/
Disallow /conf/
Disallow /config/
Disallow /core/
Disallow /dist/
Disallow /docs/
Disallow /export/
Disallow /extensions/
Disallow /fonts/
Disallow /git/
Disallow /includes/
Disallow /install/
Disallow /installer/
Disallow /js/
Disallow /layouts/
Disallow /lib/
Disallow /libraries/
Disallow /log/
Disallow /logs/
Disallow /maintenance/
Disallow /modules/
Disallow /node_modules/
Disallow /plugins/
Disallow /scripts/
Disallow /settings/
Disallow /setup/
Disallow /skins/
Disallow /src/
Disallow /templates/
Disallow /templates_c/
Disallow /themes/
Disallow /tmp/
Disallow /update/
Disallow /updater/
Disallow /var/
Disallow /vendor/
Disallow /wp-admin/
Disallow /wp-includes/

Other Records

Field Value
crawl-delay 20
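The `/*.ext` rules above rely on Google-style wildcard matching, where `*` matches any character sequence and `$` anchors the end of the path (Python's stdlib `urllib.robotparser` treats these patterns literally, so it cannot evaluate them). A minimal sketch of that matching logic, with an illustrative rule subset taken from the list above:

```python
import re


def pattern_to_regex(pattern: str) -> str:
    """Translate a robots.txt path pattern to a regex.

    Per the Google robots.txt spec (RFC 9309), '*' matches any
    character sequence and '$' anchors the end of the path;
    everything else is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "$":
            parts.append("$")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def is_disallowed(path: str, disallow_rules: list[str]) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(re.match(pattern_to_regex(p), path) for p in disallow_rules)


# Illustrative subset of the rules in this report.
rules = ["/*.env", "/*.sql", "/wp-admin/"]
print(is_disallowed("/config/.env", rules))  # True: '*' spans 'config/'
print(is_disallowed("/index.html", rules))   # False: no rule matches
```

Note that matching is prefix-based: without a trailing `$`, a rule like `/*.env` also blocks paths such as `/settings.env.backup`, which is usually the intended behavior for these extension filters.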

*

Rule Path
Disallow /

Comments

  • Dynamically generated robots.txt for improved security and control of bot traffic
  • Recommendation for serving a robots.txt with individual content:
  • 1. Create an empty robots.txt in the domain's DocumentRoot
  • 2. Copy the contents of this dynamic robots.txt into it and adjust as needed
  • Block 1: bots in block 1 are forbidden to crawl
  • Block 2: bots may crawl under these conditions
  • Block 3: bots not named in blocks 1 + 2 are forbidden to crawl
  • Insert the URL of your sitemap.xml here and remove the hash at the start of the line
  • Sitemap: https://beispiel.de/sitemap.xml
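The three-block resolution described in the comments can be checked with Python's stdlib `urllib.robotparser`. This is a sketch against a minimal reconstruction of the layout, not the full file: the agent names come from the groups in this report, but the paths are illustrative (the stdlib parser does not understand the `/*.ext` wildcard rules, so only plain prefix rules are used here).

```python
from urllib.robotparser import RobotFileParser

# Minimal reconstruction of the three-block layout described above:
# block 1 = fully disallowed bots, block 2 = conditionally allowed
# bots, block 3 = catch-all disallow for everyone else.
robots_txt = """\
User-agent: gptbot
Disallow: /

User-agent: googlebot
Crawl-delay: 20
Disallow: /wp-admin/

User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("gptbot", "https://doks-innovation.com/"))     # False (block 1)
print(rp.can_fetch("googlebot", "https://doks-innovation.com/"))  # True  (block 2)
print(rp.can_fetch("somebot", "https://doks-innovation.com/"))    # False (block 3)
print(rp.crawl_delay("googlebot"))                                # 20
```

The catch-all `User-agent: *` group is what makes the file deny-by-default: any crawler not matched by an earlier group falls through to it, which is why block 3 needs no explicit bot list.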