aktuellenachrichten.eu
robots.txt

Robots Exclusion Standard data for aktuellenachrichten.eu

Resource Scan

Scan Details

Site Domain	aktuellenachrichten.eu
Base Domain	aktuellenachrichten.eu
Scan Status	Ok
Last Scan	2024-11-11T20:46:54+00:00
Next Scan	2024-11-18T20:46:54+00:00

Last Scan

Scanned	2024-11-11T20:46:54+00:00
URL	https://aktuellenachrichten.eu/robots.txt
Domain IPs	104.21.79.73, 172.67.169.57, 2606:4700:3033::6815:4f49, 2606:4700:3036::ac43:a939
Response IP	172.67.169.57
Found	Yes
Hash	634ca4accc18f11bf7e5cb29fa09181398afab91066b8225a6741f5f06af70fd
SimHash	cb425542e91b
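
The `Hash` value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch, assuming SHA-256 over the raw response body, of how a scanner might detect that robots.txt changed between two scans (`content_fingerprint` and `has_changed` are hypothetical names, not part of any scanner API):

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a fetched robots.txt body.

    Assumption: the report's Hash field is SHA-256; the source does not say so.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hex: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a newly fetched body."""
    return content_fingerprint(body) != previous_hex.lower()


if __name__ == "__main__":
    old = content_fingerprint(b"User-agent: *\nAllow: /\n")
    print(has_changed(old, b"User-agent: *\nAllow: /\n"))     # same bytes
    print(has_changed(old, b"User-agent: *\nDisallow: /\n"))  # edited bytes
```

An exact digest only answers "identical or not". The separate `SimHash` field is a similarity fingerprint and is not reproduced by this sketch.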

Groups

*
ninjabot

Rule	Path
Allow	/

mediapartners-google*

Rule	Path
Allow	/

googlebot-image

Rule	Path
Allow	/site/uploads/

adsbot-google

Rule	Path
Allow	/

googlebot-mobile

Rule	Path
Allow	/

mj12bot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

alexibot

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

xenu’s

Rule	Path
Disallow	/

xenu’s link sleuth 1.1c

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

nextgensearchbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

archive.org_bot

Rule	Path
Disallow	/

archive.org bot

Rule	Path
Disallow	/

linkwalker

Rule	Path
Disallow	/

gigablast spider

Rule	Path
Disallow	/

ia_archiver-web.archive.org

Rule	Path
Disallow	/

picscout

Rule	Path
Disallow	/

blexbot crawler

Rule	Path
Disallow	/

tineye

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

sistrix crawler

Rule	Path
Disallow	/

uptimerobot/2.0

Rule	Path
Disallow	/

ezooms robot

Rule	Path
Disallow	/

netestate ne crawler (+http://www.website-datenbank.de/)

Rule	Path
Disallow	/

wiseguys robot

Rule	Path
Disallow	/

turnitin robot

Rule	Path
Disallow	/

heritrix

Rule	Path
Disallow	/

pimonster

Rule	Path
Disallow	/

pimonster

Rule	Path
Disallow	/

pi-monster

Rule	Path
Disallow	/

eccp/1.0 (search@eniro.com)

Rule	Path
Disallow	/

psbot

Rule	Path
Disallow	/

youdaobot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

naverbot
yeti

Rule	Path
Disallow	/

zbot

Rule	Path
Disallow	/

vagabondo

Rule	Path
Disallow	/

linkwalker

Rule	Path
Disallow	/

simplepie

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

pixray-seeker

Rule	Path
Disallow	/

boardreader

Rule	Path
Disallow	/

quantify

Rule	Path
Disallow	/

plukkie

Rule	Path
Disallow	/

cuam

Rule	Path
Disallow

megaindex.ru

Rule	Path
Disallow

megaindex.com

Rule	Path
Disallow

megaindex.ru/2.0

Rule	Path
Disallow

megaindex.ru

Rule	Path
Disallow

Other Records

Field	Value
sitemap	https://madlime.com/sitemap.xml

Comments

Sitemap Files
Block NextGenSearchBot
Block ia-archiver from crawling site
Block archive.org_bot from crawling site
Block Archive.org Bot from crawling site
Block LinkWalker from crawling site
Block GigaBlast Spider from crawling site
Block ia_archiver-web.archive.org_bot from crawling site
Block PicScout Crawler from crawling site
Block BLEXBot Crawler from crawling site
Block TinEye from crawling site
Block SEOkicks
Block BlexBot
Block SISTRIX
Block Uptime robot
Block Ezooms Robot
Block netEstate NE Crawler (+http://www.website-datenbank.de/)
Block WiseGuys Robot
Block Turnitin Robot
Block Heritrix
Block pricepi
Block Eniro
Block Psbot
Block Youdao
BLEXBot
Block NaverBot
Block ZBot
Block Vagabondo
Block LinkWalker
Block SimplePie
Block Wget
Block Pixray-Seeker
Block BoardReader
Block Quantify
Block Plukkie
Block Cuam
https://megaindex.com/crawler

Warnings

2 invalid lines.
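
The groups listed above can be sanity-checked with Python's standard `urllib.robotparser`. The sketch below rebuilds only a handful of the groups by hand: the `SAMPLE` text is an illustrative reconstruction from the tables in this report, not the site's actual file, and `build_parser` and `is_allowed` are hypothetical helpers. `urllib.robotparser` matches agent names by substring and applies the first rule whose path prefix matches, so other crawlers may interpret the same file differently.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt reconstructed from the Groups tables (assumption:
# the real file spells these directives in the conventional way).
SAMPLE = """\
User-agent: *
User-agent: ninjabot
Allow: /

User-agent: googlebot-image
Allow: /site/uploads/

User-agent: mj12bot
Disallow: /

User-agent: wget
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str = "/") -> bool:
    """Ask the parser whether `agent` may fetch `path` on the scanned site."""
    return parser.can_fetch(agent, "https://aktuellenachrichten.eu" + path)


if __name__ == "__main__":
    parser = build_parser(SAMPLE)
    for agent in ("ninjabot", "SomeBrowser", "MJ12bot/v1.4.8", "Wget/1.21"):
        print(agent, is_allowed(parser, agent))
    # A group that contains only Allow rules does not block other paths.
    print("googlebot-image", is_allowed(parser, "googlebot-image", "/news/"))
```

In this excerpt the `*` and `ninjabot` agents share one group, so both fall under the same `Allow /` rule, while `mj12bot` and `wget` are refused every path.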
