kreiszeitung-wesermarsch.de
robots.txt

Robots Exclusion Standard data for kreiszeitung-wesermarsch.de

Resource Scan

Scan Details

Site Domain kreiszeitung-wesermarsch.de
Base Domain kreiszeitung-wesermarsch.de
Scan Status Ok
Last Scan 2024-09-28T03:11:08+00:00
Next Scan 2024-10-05T03:11:08+00:00

Last Scan

Scanned 2024-09-28T03:11:08+00:00
URL https://www.kreiszeitung-wesermarsch.de/robots.txt
Domain IPs 217.182.184.202
Response IP 217.182.184.202
Found Yes
Hash dd5b4f9f0a22c0af334736aaa3ea42ab6276b4033392758e39fbd99f31123cd6
SimHash 3360551049af

Groups

*

Rule Path
Disallow /User
Disallow /Dateien
Disallow /Nachrichten/Suche
Disallow /*?bPrint=true*
Disallow /Nachrichten/Archiv-Liste_*.html
Disallow /*/Zusammengefasst-*.html
Disallow /Zusammengefasst-*.html
Disallow /ScriptResource
Disallow /WebResource
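
Several of the paths above use the `*` wildcard. As a minimal sketch of how such patterns could be evaluated, the Python below translates each pattern into a regular expression, assuming the wildcard convention of RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end, everything else is a prefix match). The helper names `pattern_to_regex` and `is_disallowed` are hypothetical and not part of the scan report; how any particular crawler matches these rules is not stated here.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW_PATTERNS = [
    "/User",
    "/Dateien",
    "/Nachrichten/Suche",
    "/*?bPrint=true*",
    "/Nachrichten/Archiv-Liste_*.html",
    "/*/Zusammengefasst-*.html",
    "/Zusammengefasst-*.html",
    "/ScriptResource",
    "/WebResource",
]


def pattern_to_regex(pattern):
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: RFC 9309 semantics. "*" matches any sequence of
    characters and a trailing "$" anchors the end of the path;
    without "$" the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path):
    """Return True if any pattern of the "*" group matches the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW_PATTERNS)
```

For example, `is_disallowed("/Nachrichten/Artikel.html?bPrint=true")` is true because of the `/*?bPrint=true*` rule, while `is_disallowed("/Nachrichten/Artikel.html")` is false. The example paths are made up for illustration.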

backlink-check.de

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

extractorpro

Rule Path
Disallow /

fasterfox

Rule Path
Disallow /

linkextractorpro

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

openbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

searchpreview

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

seodat

Rule Path
Disallow /

seoengbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

true_robot

Rule Path
Disallow /

url control

Rule Path
Disallow /

url_spider_pro

Rule Path
Disallow /

xovi

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

um-ic

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
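
The groups above block the named crawlers from the whole site, while every other crawler falls back to the `*` group. A minimal sketch of that group selection, using Python's standard `urllib.robotparser`: the `ROBOTS_FRAGMENT` text is a hand-written reconstruction of three of the groups, not the original file, and `ExampleBot` and the sample paths are hypothetical. Note that `urllib.robotparser` does plain prefix matching, so the wildcard rules of the `*` group are left out here.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reconstructed fragment with three of the groups
# listed above; the real file's exact layout is not shown in the scan.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /User
Disallow: /Dateien

User-agent: gptbot
Disallow: /

User-agent: semrushbot
Disallow: /
"""

BASE = "https://www.kreiszeitung-wesermarsch.de"

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def allowed(user_agent, path):
    """Return True if user_agent may fetch path under the fragment."""
    return parser.can_fetch(user_agent, BASE + path)
```

With this fragment, `allowed("GPTBot", "/")` is false because the `gptbot` group disallows everything, whereas `allowed("ExampleBot", "/Nachrichten/")` is true because only the `*` group applies to an unnamed crawler.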

Other Records

Field Value
crawl-delay 2
sitemap https://www.kreiszeitung-wesermarsch.de/Sitemap_Index.xml.gz
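
The `crawl-delay` and `sitemap` records can also be read programmatically. The sketch below again uses `urllib.robotparser` on a hand-written fragment. It assumes that the crawl delay belongs to the `*` group, which the scan output does not confirm, and `ExampleBot` is a hypothetical user agent.

```python
from urllib.robotparser import RobotFileParser

# Assumption: crawl-delay is attached to the "*" group; the scan
# lists it only under "Other Records".
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /User
Crawl-delay: 2

Sitemap: https://www.kreiszeitung-wesermarsch.de/Sitemap_Index.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# Seconds a polite crawler should wait between requests.
delay = parser.crawl_delay("ExampleBot")

# Sitemap URLs declared in the file (site_maps() needs Python 3.8+).
sitemaps = parser.site_maps()
```

A crawler honoring these records would sleep for `delay` seconds between requests and could start URL discovery from the entries in `sitemaps`.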

Comments

  • Robots.txt for crawler
  • Disallow Crawler
  • Crawler often creates invalid script/webresource resource request
  • Uber Metrics
  • Max crawler Time per page in sec
  • Sitemap
  • Legal notice: kreiszeitung-wesermarsch.de expressly reserves the right to use its content for commercial text and data mining (§ 44b UrhG).
  • The use of robots or other automated means to access kreiszeitung-wesermarsch.de or collect or mine data without the express permission of kreiszeitung-wesermarsch.de is strictly prohibited.
  • kreiszeitung-wesermarsch.de may, in its discretion, permit certain automated access to certain kreiszeitung-wesermarsch.de pages.
  • If you would like to apply for permission to crawl kreiszeitung-wesermarsch.de, collect or use data, please email syndication@nordmediagruppe.de