aerzteblatt.de
robots.txt

Robots Exclusion Standard data for aerzteblatt.de

Resource Scan

Scan Details

Site Domain aerzteblatt.de
Base Domain aerzteblatt.de
Scan Status Ok
Last Scan 2024-04-30T23:43:13+00:00
Next Scan 2024-05-07T23:43:13+00:00

Last Scan

Scanned 2024-04-30T23:43:13+00:00
URL https://aerzteblatt.de/robots.txt
Redirect https://www.aerzteblatt.de/robots.txt
Redirect Domain www.aerzteblatt.de
Redirect Base aerzteblatt.de
Domain IPs 35.157.125.147
Redirect IPs 35.157.125.147
Response IP 35.157.125.147
Found Yes
Hash 132964a4788b6de122cd830e2ad89aef9322c24e8f1c2df0fb8e3f58305d5cfe
SimHash 623014324d7f

Groups

*

Rule Path
Disallow /nachrichten/*?*page=*
Disallow /archiv/*?*page=*
Disallow /werbung/click.asp*
Disallow /*.pdf
Disallow /old/*
Disallow /cms/*
Disallow /intern/*
Disallow /callback/*
Disallow /suche*
Disallow /treffer*
Disallow /archiv/suche*
Disallow /archiv/treffer*

Other Records

Field Value
crawl-delay 1
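
The wildcard rules in the "*" group can be checked against a URL path with a small matcher. The following is a minimal sketch, not the scanner's own logic: it assumes the common convention (as described in RFC 9309) that "*" matches any run of characters and that rules are matched from the start of the path plus query string. The helper names pattern_to_regex and is_disallowed are hypothetical. Because this group contains only Disallow rules, no Allow precedence handling is included.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/nachrichten/*?*page=*",
    "/archiv/*?*page=*",
    "/werbung/click.asp*",
    "/*.pdf",
    "/old/*",
    "/cms/*",
    "/intern/*",
    "/callback/*",
    "/suche*",
    "/treffer*",
    "/archiv/suche*",
    "/archiv/treffer*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Each '*' becomes '.*'; every other character is matched literally.
    """
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body)


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path_and_query: str) -> bool:
    """Return True if any Disallow pattern matches the given path (plus query)."""
    return any(rule.match(path_and_query) for rule in RULES)


if __name__ == "__main__":
    # Example paths are illustrative only, not taken from the site.
    for path in ["/nachrichten/123?page=2", "/nachrichten/123", "/archiv/doc.pdf", "/"]:
        print(path, "->", "disallowed" if is_disallowed(path) else "allowed")
```

A polite crawler would additionally wait at least one second between requests to honor the crawl-delay value listed for this group.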

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

ccbot/1.0

Rule Path
Disallow /

ccbot/2.0

Rule Path
Disallow /

ccbot/3.0

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.aerzteblatt.de/sitemaps/dae.xml
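
The named bot groups above can be checked with Python's standard urllib.robotparser. This is a sketch under stated assumptions: the robots.txt text below is reconstructed from the groups in this report rather than copied from the live file, the ccbot/1.0, ccbot/2.0 and ccbot/3.0 groups are folded into a single CCBot group (the standard-library parser compares only the token before the "/"), and the "*" group is omitted because that parser does plain prefix matching rather than wildcard matching. The agent name ExampleBot is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report (an assumption,
# not a verbatim copy of the live robots.txt).
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

Sitemap: https://www.aerzteblatt.de/sitemaps/dae.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent: str, url: str = "https://www.aerzteblatt.de/") -> bool:
    """Return True if the parsed rules allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ["GPTBot", "ChatGPT-User", "Google-Extended", "CCBot/2.0", "ExampleBot"]:
        print(agent, "->", "allowed" if may_fetch(agent) else "blocked")
    # site_maps() requires Python 3.8 or later.
    print("sitemaps:", parser.site_maps())
```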

Comments

  • OpenAI ChatGPT
  • Google Bard
  • Common Crawl
  • Legal notice: aerzteblatt.de expressly reserves the right to use its content for commercial text and data mining (§ 44b UrhG).
  • The use of robots or other automated means to access aerzteblatt.de or collect or mine data without the express permission of aerzteblatt.de is strictly prohibited.
  • aerzteblatt.de may, in its discretion, permit certain automated access to certain aerzteblatt.de pages.
  • If you would like to apply for permission to crawl aerzteblatt.de, collect or use data, please email aerzteblatt@aerzteblatt.de