trojaner-info.de
robots.txt

Robots Exclusion Standard data for trojaner-info.de

Resource Scan

Scan Details

Site Domain	trojaner-info.de
Base Domain	trojaner-info.de
Scan Status	Ok
Last Scan	2024-10-04T14:25:17+00:00
Next Scan	2024-10-11T14:25:17+00:00

Last Scan

Scanned	2024-10-04T14:25:17+00:00
URL	https://www.trojaner-info.de/robots.txt
Domain IPs	213.203.219.151
Response IP	213.203.219.151
Found	Yes
Hash	82221b2fa03af8ae1a4b0654c49192fbdd46d94ebb405bd8b85d0f56effca9bd
SimHash	8338517b7ea1
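
The Hash field above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. Assuming SHA-256 over the raw response body, a change check between two scans could be sketched as follows. The function names `body_digest` and `has_changed` are hypothetical and are not part of the scanning service.

```python
import hashlib

# Hash recorded by the scan on 2024-10-04 (copied from the table above).
RECORDED_HASH = "82221b2fa03af8ae1a4b0654c49192fbdd46d94ebb405bd8b85d0f56effca9bd"


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256. The report
    only shows a 64-hex-digit value and does not state the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str = RECORDED_HASH) -> bool:
    """True if the freshly fetched body no longer matches the stored hash."""
    return body_digest(body) != previous_hash.lower()
```

An exact hash only signals that something changed. The shorter SimHash value is presumably meant for detecting near-duplicates, which this sketch does not attempt.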

Groups

k2spider

Rule	Path
Disallow	/

npbot

Rule	Path
Disallow	/

webreaper

Rule	Path
Disallow	/

msiecrawler

Rule	Path
Disallow	/

mlbot

Rule	Path
Disallow	/

pixray-seeker

Rule	Path
Disallow	/

vegi bot

Rule	Path
Disallow	/

flamingo_searchengine

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

semrushbot-sa

Rule	Path
Disallow	/

linkdexbot

Rule	Path
Disallow	/

seoscanners

Rule	Path
Disallow	/

hybridbot

Rule	Path
Disallow	/

proximic

Rule	Path
Disallow	/

sophora

Rule	Path
Disallow	/

qwantify

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/
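
Each named group above carries a single `Disallow /` rule, so those crawlers are excluded from the whole site, while every other crawler falls through to the `*` group. A minimal sketch of that group selection, assuming case-insensitive exact matching on the crawler's product token (as RFC 9309 describes); the names `BLOCKED_AGENTS` and `select_group` are hypothetical:

```python
# Group names as listed in the scan; each has the single rule "Disallow /".
BLOCKED_AGENTS = {
    "k2spider", "npbot", "webreaper", "msiecrawler", "mlbot",
    "pixray-seeker", "vegi bot", "flamingo_searchengine", "ahrefsbot",
    "semrushbot", "semrushbot-sa", "linkdexbot", "seoscanners",
    "hybridbot", "proximic", "sophora", "qwantify", "blexbot",
}


def select_group(product_token: str) -> str:
    """Return the group name that applies to a crawler's product token.

    Assumption: matching is case-insensitive and exact on the token.
    Real crawlers may match more loosely (for example by substring).
    """
    token = product_token.strip().lower()
    return token if token in BLOCKED_AGENTS else "*"


def is_fully_blocked(product_token: str) -> bool:
    """True if the crawler has its own group, i.e. is barred from '/'."""
    return select_group(product_token) != "*"
```

For example, `select_group("AhrefsBot")` yields `"ahrefsbot"`, whereas a token that is not listed, such as `"examplebot"`, yields `"*"`.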

*

Rule	Path
Disallow	/check/
Disallow	/contao/
Disallow	/system/
Disallow	/templates/
Disallow	/vendor/
Disallow	/share/index.php
Disallow	/build.xml
Disallow	/composer.json
Disallow	/composer.lock
Disallow	/README.md
Disallow	/daten/
Allow	/system/cron/cron.txt
Allow	/system/modules/*/assets/
Allow	/system/modules/*/html/
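
The `*` group mixes `Disallow` rules with more specific `Allow` exceptions under `/system/`. The sketch below shows how such a rule set could be evaluated, assuming the longest-match precedence of RFC 9309 (the longest matching pattern wins, and `Allow` wins a tie) and the usual meaning of `*` as "any sequence of characters". Individual crawlers may interpret the file differently, and the helper names are hypothetical.

```python
import re

# Rules of the "*" group, copied from the table above.
RULES = [
    ("disallow", "/check/"),
    ("disallow", "/contao/"),
    ("disallow", "/system/"),
    ("disallow", "/templates/"),
    ("disallow", "/vendor/"),
    ("disallow", "/share/index.php"),
    ("disallow", "/build.xml"),
    ("disallow", "/composer.json"),
    ("disallow", "/composer.lock"),
    ("disallow", "/README.md"),
    ("disallow", "/daten/"),
    ("allow", "/system/cron/cron.txt"),
    ("allow", "/system/modules/*/assets/"),
    ("allow", "/system/modules/*/html/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any character sequence; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; paths with no match are allowed."""
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start of the path (prefix matching).
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under these assumptions, `/system/cron/cron.txt` and `/system/modules/core/assets/app.css` are allowed even though `/system/` is disallowed, because the matching `Allow` patterns are longer. The module name `core` is only an illustrative path.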

Other Records

Field	Value
crawl-delay	20
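
`crawl-delay` is a non-standard directive that some crawlers interpret as the minimum number of seconds between requests. The scan lists it under "Other Records" and does not say which group it belongs to. A minimal sketch of a client that honors a 20-second delay, assuming that interpretation; the class `PoliteFetcher` is hypothetical:

```python
import time

CRAWL_DELAY_SECONDS = 20  # value of the crawl-delay record above


class PoliteFetcher:
    """Tracks request times so that requests are spaced by a fixed delay.

    Assumption: crawl-delay means "seconds between consecutive requests".
    The clock is injectable so the behavior can be checked without sleeping.
    """

    def __init__(self, delay=CRAWL_DELAY_SECONDS, clock=time.monotonic):
        self.delay = delay
        self.clock = clock
        self.last_request = None

    def seconds_to_wait(self) -> float:
        """Seconds that must still pass before the next request."""
        if self.last_request is None:
            return 0.0
        elapsed = self.clock() - self.last_request
        return max(0.0, self.delay - elapsed)

    def mark_request(self) -> None:
        """Record that a request was just sent."""
        self.last_request = self.clock()
```

A crawler loop would call `time.sleep(fetcher.seconds_to_wait())` before each request and `fetcher.mark_request()` right after sending it.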

Comments

robots.txt for http://www.trojaner-info.de
Doesn't follow robots.txt anyway, but...
Hits many times per second, not acceptable
http://www.nameprotect.com/botinfo.html
A capture bot, downloads gazillions of pages with no public benefit
http://www.webreaper.net/
