taz-bremen.de
robots.txt

Robots Exclusion Standard data for taz-bremen.de

Resource Scan

Scan Details

Site Domain	taz-bremen.de
Base Domain	taz-bremen.de
Scan Status	Ok
Last Scan	2024-11-09T08:24:10+00:00
Next Scan	2024-11-16T08:24:10+00:00

Last Scan

Scanned	2024-11-09T08:24:10+00:00
URL	https://taz-bremen.de/robots.txt
Redirect	https://taz.de/robots.txt
Redirect Domain	taz.de
Redirect Base	taz.de
Domain IPs	193.104.220.23
Redirect IPs	193.104.220.23, 2001:67c:13c::7a2:de
Response IP	193.104.220.23
Found	Yes
Hash	d90390ae043e9c37c8217e9188ee651329867f568c2dec81c3eb87ddc5305f48
SimHash	70200d58c99d

Groups

slurp

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	60

*

Rule	Path
Disallow	/openads
Disallow	/48f6543196cbbdefb88f247a0a8e4375

gptbot

Rule	Path
Disallow	/
Disallow	/
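
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is an assumption: it is reconstructed from the groups in this report, not copied from the original file, whose exact layout and comments are not shown here. The `allowed` helper and the `ExampleBot` agent name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the groups listed in this report.
# The real file's ordering, casing and comments may differ.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 60

User-agent: *
Disallow: /openads
Disallow: /48f6543196cbbdefb88f247a0a8e4375

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` on taz.de?"""
    return parser.can_fetch(agent, "https://taz.de" + path)


if __name__ == "__main__":
    for agent, path in [
        ("GPTBot", "/"),
        ("Slurp", "/openads"),
        ("ExampleBot", "/openads/banner"),
        ("ExampleBot", "/"),
    ]:
        print(agent, path, allowed(agent, path))
    print("Slurp crawl-delay:", parser.crawl_delay("Slurp"))
```

With this parser, a crawler that matches a named group is evaluated against that group only, so `Slurp` is not bound by the `*` rules. This agrees with the report's "All paths allowed" for that group.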

Other Records

Field	Value
sitemap	https://taz.de/sitemap-google-news.xml
sitemap	https://taz.de/sitemap-index.xml

Comments

as per https://platform.openai.com/docs/gptbot
Legal notice: taz.de expressly reserves the right to use its content for commercial text and data mining (§ 44 b UrhG).
The use of robots or other automated means to access taz.de or collect or mine data without the express permission of taz.de is strictly prohibited.
taz.de may, in its discretion, permit certain automated access to certain taz.de pages.
If you would like to apply for permission to crawl taz.de, collect or use data, please email lizenzen@taz.de.

Warnings

`useragent` is not a known field.
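
A warning of this kind can be reproduced with a simple field check. The sketch below is illustrative only. `KNOWN_FIELDS` is an assumed list, since the scanner's actual list is not given in this report. The `unknown_fields` function and the `sample` input are hypothetical, and the sample is not the original file.

```python
# Assumption: a typical set of recognized robots.txt field names.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt: str) -> list[str]:
    """Return distinct unrecognized field names in order of appearance."""
    found: list[str] = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments first
        if ":" not in line:
            continue  # blank or malformed line, nothing to check
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical input with a misspelled "User-agent" field.
sample = """\
# as per https://platform.openai.com/docs/gptbot
Useragent: GPTBot
Disallow: /
"""

if __name__ == "__main__":
    for name in unknown_fields(sample):
        print(f"`{name}` is not a known field.")
```

Comments are stripped before the colon test so that URLs inside comment lines are not mistaken for field names.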
