tagesschau.de
robots.txt

Robots Exclusion Standard data for tagesschau.de

Resource Scan

Scan Details

Site Domain tagesschau.de
Base Domain tagesschau.de
Scan Status Ok
Last Scan 2024-06-03T10:21:56+00:00
Next Scan 2024-06-17T10:21:56+00:00

Last Scan

Scanned 2024-06-03T10:21:56+00:00
URL https://tagesschau.de/robots.txt
Redirect https://www.tagesschau.de:443/robots.txt
Redirect Domain www.tagesschau.de
Redirect Base tagesschau.de
Domain IPs 2600:1901:0:1b60::, 34.110.152.241
Redirect IPs 23.205.189.169, 2a02:26f0:d200:3a9::1ff2, 2a02:26f0:d200:3ad::1ff2
Response IP 23.54.136.223
Found Yes
Hash 1cb56f163c2a1cc6c9795e7101fc8e253bca8c6078e906789c65cf73616d3ecf
SimHash 895ed874c3d2

Groups

*

Rule Path
Disallow /ardimport/
Disallow /*~player.html
Disallow /suche2.html?*

webzip

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

webwhacker

Rule Path
Disallow /

websauger

Rule Path
Disallow /

webcapture

Rule Path
Disallow /

teleport

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

net attache

Rule Path
Disallow /

httrack

Rule Path
Disallow /
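
The groups above can be checked programmatically. The following is a minimal sketch, not the scanner's own method: it rebuilds a robots.txt text from the listed groups (the exact formatting, casing, and ordering of the real file are assumptions) and queries it with Python's standard urllib.robotparser. The agent name "examplebot" and the paths "/inland/" and "/ardimport/x" are hypothetical. Note that urllib.robotparser does plain prefix matching and does not interpret "*" wildcards inside paths, so the rules "/*~player.html" and "/suche2.html?*" are not meaningfully evaluated by this sketch.

```python
from urllib.robotparser import RobotFileParser

# Groups as listed in the scan; the reconstructed file layout is an assumption.
GROUPS = {
    "*": ["/ardimport/", "/*~player.html", "/suche2.html?*"],
    "webzip": ["/"],
    "ahrefsbot": ["/"],
    "webwhacker": ["/"],
    "websauger": ["/"],
    "webcapture": ["/"],
    "teleport": ["/"],
    "sitesnagger": ["/"],
    "net attache": ["/"],
    "httrack": ["/"],
}


def render_robots_txt(groups):
    """Render the group mapping as robots.txt lines, one blank line per group."""
    lines = []
    for agent, paths in groups.items():
        lines.append(f"User-agent: {agent}")
        lines.extend(f"Disallow: {path}" for path in paths)
        lines.append("")  # blank line separates groups
    return lines


def build_parser(groups):
    """Parse the rendered rules with the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(render_robots_txt(groups))
    return parser


parser = build_parser(GROUPS)
base = "https://www.tagesschau.de"

# Hypothetical crawler falling under the "*" group.
print(parser.can_fetch("examplebot", base + "/inland/"))
print(parser.can_fetch("examplebot", base + "/ardimport/x"))
# Agent with its own group that disallows everything.
print(parser.can_fetch("httrack", base + "/inland/"))
```

A crawler that needs the wildcard rules honored would have to use a matcher that implements "*" pattern matching, since the standard-library parser only compares path prefixes.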