web.de
robots.txt

Robots Exclusion Standard data for web.de

Resource Scan

Scan Details

Site Domain web.de
Base Domain web.de
Scan Status Ok
Last Scan 2024-04-25T19:21:50+00:00
Next Scan 2024-05-02T19:21:50+00:00

Last Scan

Scanned 2024-04-25T19:21:50+00:00
URL https://web.de/robots.txt
Domain IPs 82.165.229.138, 82.165.229.83
Response IP 82.165.229.83
Found Yes
Hash 11c690999df8cd0118479fd5d248789198c328f956365d9e17cec0d1b3dbe9b3
SimHash e9508b206533

Groups

*

Rule Path
Disallow /test/

googlebot-news

Rule Path
Disallow /
Disallow /magazine/*/thema/
Allow /magazine/
Allow /amp/
Allow /$
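
The googlebot-news group mixes a blanket `Disallow /` with narrower `Allow` rules, a `*` wildcard and a `$` end anchor, so the order of the rules alone does not tell you the outcome. The following is a minimal sketch of how such a group could be evaluated, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and `Allow` wins a tie). The function names and the sample paths are hypothetical and are not taken from web.de; real crawlers may differ in details such as percent-encoding.

```python
import re

# Rules copied from the googlebot-news group listed above.
GOOGLEBOT_NEWS_RULES = [
    ("disallow", "/"),
    ("disallow", "/magazine/*/thema/"),
    ("allow", "/magazine/"),
    ("allow", "/amp/"),
    ("allow", "/$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the
    pattern to the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    Assumption: longest matching pattern wins, Allow wins ties,
    and a path that matches no rule is allowed.
    """
    best_len, best_allow = -1, True
    for verdict, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = verdict == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, best_allow = len(pattern), is_allow
    return best_allow


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise each rule.
    for path in ["/", "/amp/x", "/magazine/politik/",
                 "/magazine/politik/thema/wahl", "/email/"]:
        print(path, is_allowed(GOOGLEBOT_NEWS_RULES, path))
```

Under these assumptions the home page, `/amp/` and `/magazine/` are reachable for this crawler, while topic pages under `/magazine/*/thema/` and everything else on the site are not.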

chatgpt-user

Rule Path
Disallow /magazine/
Allow /magazine/in-eigener-sache/
Allow /magazine/unicef/
Allow /magazine/so-arbeitet-die-redaktion/

gptbot

Rule Path
Disallow /magazine/
Allow /magazine/in-eigener-sache/
Allow /magazine/unicef/
Allow /magazine/so-arbeitet-die-redaktion/

google-extended

Rule Path
Disallow /magazine/
Allow /magazine/in-eigener-sache/
Allow /magazine/unicef/
Allow /magazine/so-arbeitet-die-redaktion/
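
The groups above are alternatives, not layers: a crawler that finds a group for its own name follows that group and does not also apply the `*` group. A minimal sketch of that selection step follows, assuming case-insensitive exact matching on the product token with a fallback to `*`, as outlined in RFC 9309. The `select_group` helper and the `bingbot` sample token are hypothetical, and individual crawlers may use more elaborate matching.

```python
# Groups as listed in this scan; the three AI-related crawlers share one rule set.
AI_CRAWLER_RULES = [
    ("disallow", "/magazine/"),
    ("allow", "/magazine/in-eigener-sache/"),
    ("allow", "/magazine/unicef/"),
    ("allow", "/magazine/so-arbeitet-die-redaktion/"),
]

GROUPS = {
    "*": [("disallow", "/test/")],
    "googlebot-news": [
        ("disallow", "/"),
        ("disallow", "/magazine/*/thema/"),
        ("allow", "/magazine/"),
        ("allow", "/amp/"),
        ("allow", "/$"),
    ],
    "chatgpt-user": AI_CRAWLER_RULES,
    "gptbot": AI_CRAWLER_RULES,
    "google-extended": AI_CRAWLER_RULES,
}


def select_group(groups, product_token):
    """Return the rule list that applies to a crawler's product token.

    Assumption: exact, case-insensitive match on the token; if no group
    is named for the crawler, the '*' group applies instead.
    """
    return groups.get(product_token.lower(), groups.get("*", []))


if __name__ == "__main__":
    # GPTBot has its own group, so the '*' rule for /test/ is not part of it.
    print(select_group(GROUPS, "GPTBot"))
    # A crawler without a dedicated group falls back to '*'.
    print(select_group(GROUPS, "bingbot"))
```

One consequence of this reading is that `/test/` is only disallowed for crawlers without a dedicated group; the named groups do not repeat that rule.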

Comments

  • https://web.de/robots.txt