ditzen.de
robots.txt

Robots Exclusion Standard data for ditzen.de

Resource Scan

Scan Details

Site Domain ditzen.de
Base Domain ditzen.de
Scan Status Ok
Last Scan 2024-06-28T07:22:07+00:00
Next Scan 2024-07-05T07:22:07+00:00

Last Scan

Scanned 2024-06-28T07:22:07+00:00
URL http://ditzen.de/robots.txt
Redirect https://www.nordsee-zeitung.de/robots.txt
Redirect Domain www.nordsee-zeitung.de
Redirect Base nordsee-zeitung.de
Domain IPs 89.31.143.90
Redirect IPs 217.182.184.202
Response IP 217.182.184.202
Found Yes
Hash a0df0043e8e7acdd1bebd926487c07157d39ebd7bd3872451afead4128977c0f
SimHash 3364551049af

Groups

*

Rule Path
Disallow /User
Disallow /Dateien
Disallow /Nachrichten/Suche
Disallow /*?bPrint=true*
Disallow /Nachrichten/Archiv-Liste_*.html
Disallow /*/Zusammengefasst-*.html
Disallow /Zusammengefasst-*.html
Disallow /ScriptResource
Disallow /WebResource
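
Several of the paths above use `*` wildcards, which Python's standard `urllib.robotparser` does not interpret. The following is a minimal sketch of how such patterns could be checked against a URL path, assuming the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and a rule otherwise matches as a prefix. The names `DISALLOW`, `pattern_to_regex` and `is_disallowed` are illustrative and do not come from the scan.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/User",
    "/Dateien",
    "/Nachrichten/Suche",
    "/*?bPrint=true*",
    "/Nachrichten/Archiv-Liste_*.html",
    "/*/Zusammengefasst-*.html",
    "/Zusammengefasst-*.html",
    "/ScriptResource",
    "/WebResource",
]


def pattern_to_regex(pattern):
    """Translate one robots.txt path pattern into a compiled regex."""
    # A trailing "$" means the pattern must match the end of the path.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where "*" stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path):
    """Return True if any Disallow pattern matches the start of the path."""
    return any(rule.match(path) for rule in RULES)


if __name__ == "__main__":
    for candidate in ["/User/login", "/Lokales/Zusammengefasst-1.html", "/Nachrichten/"]:
        print(candidate, is_disallowed(candidate))
```

The sketch only covers Disallow rules, since the group lists no Allow rules; a full matcher would also need longest-match precedence between Allow and Disallow.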

backlink-check.de

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

extractorpro

Rule Path
Disallow /

fasterfox

Rule Path
Disallow /

linkextractorpro

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

openbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

searchpreview

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seodat

Rule Path
Disallow /

seoengbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

true_robot

Rule Path
Disallow /

url control

Rule Path
Disallow /

url_spider_pro

Rule Path
Disallow /

xovi

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

um-ic

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
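
Every named group above carries the single rule `Disallow /`, so the practical question for a crawler is only whether its user agent falls into one of these groups or into the `*` group. The sketch below illustrates that lookup. It assumes a simplified, case-insensitive substring match on the user-agent string, which is looser than the product-token matching real crawlers are expected to perform; `BLOCKED_AGENTS` and `group_for` are hypothetical names.

```python
# Group names copied from the scan above; each one has "Disallow /".
BLOCKED_AGENTS = (
    "backlink-check.de", "backlinkcrawler", "extractorpro", "fasterfox",
    "linkextractorpro", "linkwalker", "mj12bot", "openbot", "rogerbot",
    "searchpreview", "semrushbot", "seodat", "seoengbot", "seokicks-robot",
    "true_robot", "url control", "url_spider_pro", "xovi", "gptbot",
    "chatgpt-user", "ccbot", "um-ic", "google-extended",
)


def group_for(user_agent):
    """Return the group name that applies to a user-agent string.

    Simplification: a case-insensitive substring test. Falls back to the
    catch-all "*" group when no named group matches.
    """
    lowered = user_agent.lower()
    for name in BLOCKED_AGENTS:
        if name in lowered:
            return name
    return "*"


if __name__ == "__main__":
    print(group_for("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(group_for("Googlebot/2.1"))
```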

Other Records

Field Value
crawl-delay 2
sitemap https://www.nordsee-zeitung.de/Sitemap_Index.xml.gz

Comments

  • Robots.txt for crawler
  • Disallow Crawler
  • Crawler often creates invalid script/webresource resource request
  • Uber Metrics
  • Max crawler Time per page in sec
  • Sitemap
  • Legal notice: nordsee-zeitung.de expressly reserves the right to use its content for commercial text and data mining (§ 44 b UrhG).
  • The use of robots or other automated means to access nordsee-zeitung.de or collect or mine data without the express permission of nordsee-zeitung.de is strictly prohibited.
  • nordsee-zeitung.de may, in its discretion, permit certain automated access to certain nordsee-zeitung.de pages.
  • If you would like to apply for permission to crawl nordsee-zeitung.de, collect or use data, please email syndication@nordmediagruppe.de