g-ha-web.de
robots.txt

Robots Exclusion Standard data for g-ha-web.de

Resource Scan

Scan Details

Site Domain g-ha-web.de
Base Domain g-ha-web.de
Scan Status Ok
Last Scan 2024-04-30T07:47:00+00:00
Next Scan 2024-05-07T07:47:00+00:00

Last Scan

Scanned 2024-04-30T07:47:00+00:00
URL http://www.g-ha-web.de/robots.txt
Redirect https://web.de/robots.txt
Redirect Domain web.de
Redirect Base web.de
Domain IPs 82.165.229.83
Redirect IPs 82.165.229.138, 82.165.229.83
Response IP 82.165.229.83
Found Yes
Hash 11c690999df8cd0118479fd5d248789198c328f956365d9e17cec0d1b3dbe9b3
SimHash e9508b206533

Groups

*

Rule Path
Disallow /test/

googlebot-news

Rule Path
Disallow /
Disallow /magazine/*/thema/
Allow /magazine/
Allow /amp/
Allow /$
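
The googlebot-news rules above combine a blanket Disallow with a `*` wildcard and a `$` end anchor, so the order in which they are listed does not decide the outcome on its own. The following is a minimal sketch of how such rules could be evaluated, assuming the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative and do not come from the scan; the sample paths are hypothetical.

```python
import re

# Rules listed for the googlebot-news group, in the order the scan shows them.
RULES = [
    ("disallow", "/"),
    ("disallow", "/magazine/*/thema/"),
    ("allow", "/magazine/"),
    ("allow", "/amp/"),
    ("allow", "/$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    ends_with_anchor = pattern.endswith("$")
    body = pattern[:-1] if ends_with_anchor else pattern
    # Escape each literal piece, then rejoin with '.*' where a '*' stood.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if ends_with_anchor else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be fetched (assumed longest-match rule)."""
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # A longer pattern is more specific; on a tie, prefer Allow.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise each rule.
    for sample in ["/", "/news/", "/magazine/story", "/magazine/x/thema/y"]:
        print(sample, is_allowed(sample))
```

Under these assumptions, the home page and most of /magazine/ and /amp/ stay crawlable for googlebot-news, while topic pages under /magazine/*/thema/ and everything else are blocked. A real crawler may differ in details such as percent-encoding normalization, which this sketch leaves out.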

chatgpt-user

Rule Path
Disallow /magazine/
Allow /magazine/in-eigener-sache/
Allow /magazine/unicef/
Allow /magazine/so-arbeitet-die-redaktion/

gptbot

Rule Path
Disallow /magazine/
Allow /magazine/in-eigener-sache/
Allow /magazine/unicef/
Allow /magazine/so-arbeitet-die-redaktion/

google-extended

Rule Path
Disallow /magazine/
Allow /magazine/in-eigener-sache/
Allow /magazine/unicef/
Allow /magazine/so-arbeitet-die-redaktion/
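
The groups above can be read as a lookup table keyed by crawler product token, with `*` as the fallback for any crawler not named explicitly. The sketch below illustrates that selection step only, assuming the caller already has the bare product token (not the full User-Agent header) and that tokens are compared case-insensitively. `GROUPS`, `AI_RULES`, and `select_group` are hypothetical names, and "bingbot" is used purely as an example of a crawler with no group of its own in this scan.

```python
# Rules shared verbatim by the chatgpt-user, gptbot, and google-extended groups.
AI_RULES = [
    ("disallow", "/magazine/"),
    ("allow", "/magazine/in-eigener-sache/"),
    ("allow", "/magazine/unicef/"),
    ("allow", "/magazine/so-arbeitet-die-redaktion/"),
]

# Groups as listed in the scan, keyed by lowercase product token.
GROUPS = {
    "*": [("disallow", "/test/")],
    "googlebot-news": [
        ("disallow", "/"),
        ("disallow", "/magazine/*/thema/"),
        ("allow", "/magazine/"),
        ("allow", "/amp/"),
        ("allow", "/$"),
    ],
    "chatgpt-user": AI_RULES,
    "gptbot": AI_RULES,
    "google-extended": AI_RULES,
}


def select_group(product_token):
    """Return the rule list for a crawler, falling back to the '*' group."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])


if __name__ == "__main__":
    print(select_group("GPTBot"))   # the shared AI-crawler rules
    print(select_group("bingbot"))  # no dedicated group, so the '*' rules
```

Note that a crawler with its own group follows only that group: under this reading, gptbot is not bound by the `/test/` rule in the `*` group.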

Comments

  • https://web.de/robots.txt