Robots Exclusion Standard data for zak.de

Resource Scan

Scan Details

Site Domain zak.de
Base Domain zak.de
Scan Status Ok
Last Scan 2024-06-29T18:24:36+00:00
Next Scan 2024-07-06T18:24:36+00:00

Last Scan

Scanned 2024-06-29T18:24:36+00:00
URL https://zak.de/robots.txt
Redirect https://www.zak.de/robots.txt
Redirect Domain www.zak.de
Redirect Base zak.de
Domain IPs 54.36.43.53
Redirect IPs 54.36.43.53
Response IP 54.36.43.53
Found Yes
Hash dddab7938bdc4ad3ae19d45a8a73481b728cc1b961271d9bf8a00e37847ef9dd
SimHash 33695d50c70c

Groups

*

Rule Path
Disallow /User
Disallow /Dateien
Disallow /Nachrichten/Suche
Disallow /ScriptResource
Disallow /WebResource
Disallow /Verlag/Digital-Smile-Abo
Disallow /Verlag/Digital-Zusatz-Abo-zu-Print-Einzelausgabe
Disallow /Nachrichten/Die-digitalen-Abomodelle-unserer-Zeitung-im-Ueberblick-70804.html
Disallow /Abo/Abo-bestellen-Detailansicht.html
Disallow /Verlag/NoAbo
Disallow /Api/Custom/Subscription
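
To illustrate how a crawler might apply the Disallow rules listed above, here is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT string is a reconstruction from the table, not a copy of the live file, and the user agent name ExampleBot and the helper is_allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in the table above. The exact layout
# and ordering of the live file are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /User
Disallow: /Dateien
Disallow: /Nachrichten/Suche
Disallow: /ScriptResource
Disallow: /WebResource
Disallow: /Verlag/Digital-Smile-Abo
Disallow: /Verlag/Digital-Zusatz-Abo-zu-Print-Einzelausgabe
Disallow: /Nachrichten/Die-digitalen-Abomodelle-unserer-Zeitung-im-Ueberblick-70804.html
Disallow: /Abo/Abo-bestellen-Detailansicht.html
Disallow: /Verlag/NoAbo
Disallow: /Api/Custom/Subscription
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from memory, no network access


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, "https://www.zak.de" + path)


if __name__ == "__main__":
    for path in ("/", "/Nachrichten/", "/Nachrichten/Suche?q=test", "/User/Login"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Rules are matched as path prefixes in this parser, so Disallow /User also covers any path that merely begins with /User.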

Other Records

Field Value
crawl-delay 2
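
A crawler that honors the crawl-delay value could read it and space out its requests accordingly. The sketch below assumes the directive sits inside the "*" group (the report lists it separately, so its position in the file is unknown) and that the value is in seconds. The agent name ExampleBot and the helper seconds_to_wait are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: crawl-delay belongs to the "*" group in the live file.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns None when no delay applies to the agent.
DELAY = parser.crawl_delay("ExampleBot") or 0


def seconds_to_wait(last_request: float, now: float, delay: float = DELAY) -> float:
    """Return how long to pause before the next request, never negative."""
    return max(0.0, delay - (now - last_request))


if __name__ == "__main__":
    print(DELAY)
    print(seconds_to_wait(last_request=10.0, now=10.5))
```

Crawl-delay is not part of the Robots Exclusion Protocol as standardized, so some crawlers ignore it.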

Comments

  • Robots.txt for crawler
  • Disallow Crawler
  • Crawler often creates invalid script/webresource resource request
  • Max crawler Time per page in sec
  • Sitemap
  • Sitemap: https://www.zak.de/Sitemap_Index.xml.gz
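
Because the Sitemap URL appears under Comments, the line is presumably commented out in the file, in which case a parser would not report it. The sketch below contrasts the two cases with urllib.robotparser (site_maps() requires Python 3.8 or later). The helper sitemaps_in is hypothetical, and the commented-out form is an assumption about the live file.

```python
from urllib.robotparser import RobotFileParser

SITEMAP_URL = "https://www.zak.de/Sitemap_Index.xml.gz"


def sitemaps_in(robots_txt: str) -> list:
    """Return the Sitemap URLs a parser sees in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap record was found.
    return parser.site_maps() or []


# Assumed form in the live file: the directive is hidden behind a "#".
commented = "# Sitemap\n# Sitemap: " + SITEMAP_URL + "\n"
# Same directive as an active record.
active = "Sitemap: " + SITEMAP_URL + "\n"

if __name__ == "__main__":
    print(sitemaps_in(commented))
    print(sitemaps_in(active))
```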