trojaner-info.de
robots.txt

Robots Exclusion Standard data for trojaner-info.de

Resource Scan

Scan Details

Site Domain trojaner-info.de
Base Domain trojaner-info.de
Scan Status Ok
Last Scan 2024-10-04T14:25:17+00:00
Next Scan 2024-10-11T14:25:17+00:00

Last Scan

Scanned 2024-10-04T14:25:17+00:00
URL https://www.trojaner-info.de/robots.txt
Domain IPs 213.203.219.151
Response IP 213.203.219.151
Found Yes
Hash 82221b2fa03af8ae1a4b0654c49192fbdd46d94ebb405bd8b85d0f56effca9bd
SimHash 8338517b7ea1
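
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The scan does not say which hash function it uses, so the following is only a sketch under the assumption that it is SHA-256 over the raw response body. The function name body_digest is hypothetical.

```python
import hashlib

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()
```

Comparing such a digest between two scans is a cheap way to detect that the file changed at all. A similarity hash such as the SimHash field serves a different purpose and is not reproduced here.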

Groups

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

mlbot

Rule Path
Disallow /

pixray-seeker

Rule Path
Disallow /

vegi bot

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

linkdexbot

Rule Path
Disallow /

seoscanners

Rule Path
Disallow /

hybridbot

Rule Path
Disallow /

proximic

Rule Path
Disallow /

sophora

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

*

Rule Path
Disallow /check/
Disallow /contao/
Disallow /system/
Disallow /templates/
Disallow /vendor/
Disallow /share/index.php
Disallow /build.xml
Disallow /composer.json
Disallow /composer.lock
Disallow /README.md
Disallow /daten/
Allow /system/cron/cron.txt
Allow /system/modules/*/assets/
Allow /system/modules/*/html/
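
How the groups above combine can be sketched in a few lines. The example assumes RFC 9309 semantics: a crawler uses the group named after it if one exists and the "*" group otherwise, the longest matching path pattern wins, and Allow wins a tie. Parsers that apply the first matching rule instead would treat /system/cron/cron.txt as blocked, because Disallow /system/ is listed before it. The names FULLY_BLOCKED, STAR_RULES, pattern_to_regex, is_allowed and can_fetch are hypothetical, and the exact case-insensitive agent comparison is a simplification.

```python
import re

# Agents whose group consists of a single "Disallow /" (from the list above).
FULLY_BLOCKED = {
    "k2spider", "npbot", "webreaper", "msiecrawler", "mlbot", "pixray-seeker",
    "vegi bot", "flamingo_searchengine", "ahrefsbot", "semrushbot",
    "semrushbot-sa", "linkdexbot", "seoscanners", "hybridbot", "proximic",
    "sophora", "qwantify", "blexbot",
}

# Rules of the "*" group as (directive, path pattern) pairs.
STAR_RULES = [
    ("disallow", "/check/"), ("disallow", "/contao/"), ("disallow", "/system/"),
    ("disallow", "/templates/"), ("disallow", "/vendor/"),
    ("disallow", "/share/index.php"), ("disallow", "/build.xml"),
    ("disallow", "/composer.json"), ("disallow", "/composer.lock"),
    ("disallow", "/README.md"), ("disallow", "/daten/"),
    ("allow", "/system/cron/cron.txt"),
    ("allow", "/system/modules/*/assets/"),
    ("allow", "/system/modules/*/html/"),
]

def pattern_to_regex(pattern):
    """Turn a path pattern ('*' wildcard, optional trailing '$') into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path, rules=STAR_RULES):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

def can_fetch(agent, path):
    """Apply the agent's own group if it has one, otherwise the '*' group."""
    if agent.lower() in FULLY_BLOCKED:
        return False  # the group's only rule is "Disallow /"
    return is_allowed(path)
```

Under these assumptions can_fetch("AhrefsBot", "/") is False, while an unlisted crawler may fetch /system/cron/cron.txt and /system/modules/core/assets/a.css but not /system/config/x.php.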

Other Records

Field Value
crawl-delay 20
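
crawl-delay is not part of RFC 9309, and crawlers that honor it conventionally read the value as seconds between requests. A minimal sketch of a client respecting it, assuming that interpretation, could look like this. The name throttled is hypothetical, and the sleep parameter exists only so the pause can be replaced in tests.

```python
import time

CRAWL_DELAY_SECONDS = 20  # from the crawl-delay record above; unit assumed to be seconds

def throttled(paths, delay=CRAWL_DELAY_SECONDS, sleep=time.sleep):
    """Yield each path, pausing `delay` seconds between consecutive items."""
    for index, path in enumerate(paths):
        if index > 0:
            sleep(delay)  # no pause before the very first request
        yield path
```

A crawler would iterate over throttled(paths) and issue one request per yielded path.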

Comments

  • robots.txt for http://www.trojaner-info.de
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/