jungewelt.de
robots.txt

Robots Exclusion Standard data for jungewelt.de

Resource Scan

Scan Details

Site Domain	jungewelt.de
Base Domain	jungewelt.de
Scan Status	Ok
Last Scan	2025-04-26T06:14:16+00:00
Next Scan	2025-05-26T06:14:16+00:00

Last Scan

Scanned	2025-04-26T06:14:16+00:00
URL	https://jungewelt.de/robots.txt
Domain IPs	212.222.128.119
Response IP	212.222.128.119
Found	Yes
Hash	ebf34b4d8e5a5334cd41e9a658ac6441cc3c69ac19d63e3a082a78b7e6fb4d54
SimHash	2800fee2cfd2

Groups

*

Rule	Path
Disallow	/artikel/print.php
Disallow	/comment.php
Disallow	/leserbrief/
Disallow	/user/
Disallow	/video/
Disallow	/bannercount.php
Disallow	/bannercountsky.php
Disallow	/artikel/mail.php

yahoo-newscrawler

Rule	Path
Disallow	(empty)

fast-webcrawler

Rule	Path
Disallow	/

freefind

Rule	Path
Disallow	/

excalibur internet spider

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/
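
The groups listed above can be checked against concrete URLs. The following is a minimal sketch that rebuilds a robots.txt from the scanned rule tables and queries it with Python's standard `urllib.robotparser`. The reconstructed text is an assumption: the scan does not show the original group order or layout, the crawl-delay and sitemap records are left out because the scan does not say which group they belong to, and the agent name `SomeBot` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the scanned groups; the order and spacing of
# the real file are not shown in the scan report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /artikel/print.php
Disallow: /comment.php
Disallow: /leserbrief/
Disallow: /user/
Disallow: /video/
Disallow: /bannercount.php
Disallow: /bannercountsky.php
Disallow: /artikel/mail.php

User-agent: yahoo-newscrawler
Disallow:

User-agent: fast-webcrawler
Disallow: /

User-agent: freefind
Disallow: /

User-agent: excalibur internet spider
Disallow: /

User-agent: wget
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://jungewelt.de"

# "SomeBot" is a hypothetical agent that falls through to the "*" group.
for agent, path in [
    ("SomeBot", "/video/"),
    ("SomeBot", "/artikel/"),
    ("yahoo-newscrawler", "/video/"),
    ("wget", "/"),
]:
    verdict = "allowed" if parser.can_fetch(agent, BASE + path) else "blocked"
    print(f"{agent:20} {path:12} {verdict}")
```

Under this reconstruction, an empty `Disallow` value in the yahoo-newscrawler group places no restriction on that crawler, while `Disallow: /` blocks the four named crawlers from the whole site.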

Other Records

Field	Value
crawl-delay	1

Other Records

Field	Value
sitemap	https://www.jungewelt.de/google-sitemap/index.xml

Comments

Alle robots (All robots)
Extrawurst fuer Yahoo (Special treatment for Yahoo)
Miese Bots raus (Lousy bots out)