jungewelt.de
robots.txt

Robots Exclusion Standard data for jungewelt.de

Resource Scan

Scan Details

Site Domain jungewelt.de
Base Domain jungewelt.de
Scan Status Ok
Last Scan 2025-04-26T06:14:16+00:00
Next Scan 2025-05-26T06:14:16+00:00

Last Scan

Scanned 2025-04-26T06:14:16+00:00
URL https://jungewelt.de/robots.txt
Domain IPs 212.222.128.119
Response IP 212.222.128.119
Found Yes
Hash ebf34b4d8e5a5334cd41e9a658ac6441cc3c69ac19d63e3a082a78b7e6fb4d54
SimHash 2800fee2cfd2

Groups

*

Rule Path
Disallow /artikel/print.php
Disallow /comment.php
Disallow /leserbrief/
Disallow /user/
Disallow /video/
Disallow /bannercount.php
Disallow /bannercountsky.php
Disallow /artikel/mail.php

yahoo-newscrawler

Rule Path
Disallow (empty path; nothing is disallowed)

fast-webcrawler

Rule Path
Disallow /

freefind

Rule Path
Disallow /

excalibur internet spider

Rule Path
Disallow /

wget

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 1
sitemap https://www.jungewelt.de/google-sitemap/index.xml
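The groups and records listed above can be checked against sample URLs. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstruction from this report, not the original file, so its exact layout is an assumption. The crawl-delay record is left out because the report does not say which group it belongs to. The agent name `examplebot` and the helper `allowed` are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report; the real file's layout may differ (assumption).
# crawl-delay is omitted because the report does not tie it to a specific group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /artikel/print.php
Disallow: /comment.php
Disallow: /leserbrief/
Disallow: /user/
Disallow: /video/
Disallow: /bannercount.php
Disallow: /bannercountsky.php
Disallow: /artikel/mail.php

User-agent: yahoo-newscrawler
Disallow:

User-agent: fast-webcrawler
Disallow: /

User-agent: freefind
Disallow: /

User-agent: excalibur internet spider
Disallow: /

User-agent: wget
Disallow: /

Sitemap: https://www.jungewelt.de/google-sitemap/index.xml
"""

BASE = "https://jungewelt.de"

# Parse the reconstructed rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` under the rules above?"""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "examplebot" stands in for any crawler without its own group.
    for agent in ("examplebot", "yahoo-newscrawler", "wget"):
        for path in ("/", "/user/"):
            print(f"{agent:20} {path:8} {allowed(agent, path)}")
```

Under this reconstruction, a crawler without its own group falls back to the `*` rules, `yahoo-newscrawler` is exempt from them because of its empty Disallow, and the agents listed with `Disallow /` are refused everywhere. Other parsers may resolve edge cases differently from `urllib.robotparser`.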

Comments

  • All robots
  • Special treatment for Yahoo
  • Lousy bots out