swr.de
robots.txt

Robots Exclusion Standard data for swr.de

Resource Scan

Scan Details

Site Domain swr.de
Base Domain swr.de
Scan Status Ok
Last Scan 2025-07-27T01:39:50+00:00
Next Scan 2025-08-10T01:39:50+00:00

Last Scan

Scanned 2025-07-27T01:39:50+00:00
URL https://swr.de/robots.txt
Redirect https://www.swr.de:443/robots.txt
Redirect Domain www.swr.de
Redirect Base swr.de
Domain IPs 34.120.237.106
Redirect IPs 104.83.87.163, 2a02:26f0:9c00:1aa::3121, 2a02:26f0:9c00:1b9::3121
Response IP 104.83.87.163
Found Yes
Hash 016333a87807b87e657ff7071144fb054d531bd366017ca2e191d9906d74f0a7
SimHash 7b054200a6d4
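
The Hash value above is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The report does not say which algorithm or which input bytes the scanner uses, so the following is only a sketch under the assumption that it is a SHA-256 over the raw response body. The function name `fingerprint` and the sample bodies are hypothetical. It illustrates how such a digest can be used to detect that a robots.txt file changed between two scans.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Two hypothetical versions of a robots.txt body, differing by one rule.
previous = fingerprint(b"User-agent: *\nDisallow: /api/\n")
current = fingerprint(b"User-agent: *\nDisallow: /api/\nDisallow: /cms/\n")

# A differing digest signals that the file content changed between scans.
print(len(previous), previous != current)
```

An exact hash only answers "identical or not". A similarity hash such as the SimHash value is the kind of fingerprint used to judge how close two versions are, and it is not reproduced here.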

Groups

*

Rule Path
Disallow /api/
Disallow /reactions/
Disallow /search/suggest/
Disallow /cms/
Disallow /*~_currentSlide-*
Disallow /*?_pjax=%23content
Disallow /*?_pjax=%23fragment
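
Three of the rules above use the `*` wildcard. Under RFC 9309, `*` stands for any run of characters and a trailing `$` anchors the end of the path, while every rule is otherwise a prefix match. The sketch below shows one way to evaluate the rules of the `*` group. The helper names and the test paths are hypothetical, and the comparison is literal (no percent-encoding normalisation), so this is an illustration and not a complete matcher.

```python
import re

# Disallow paths of the "*" group, copied from the report above.
RULES = [
    "/api/",
    "/reactions/",
    "/search/suggest/",
    "/cms/",
    "/*~_currentSlide-*",
    "/*?_pjax=%23content",
    "/*?_pjax=%23fragment",
]


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts, then join them with ".*" where "*" stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, patterns=RULES):
    """True if any Disallow pattern matches from the start of the path."""
    return any(rule_to_regex(p).match(path) for p in patterns)


# Hypothetical paths (including query strings) for illustration.
for candidate in [
    "/api/v1/items",
    "/sport/gallery~_currentSlide-3.html",
    "/swraktuell/page?_pjax=%23content",
    "/swraktuell/index.html",
]:
    print(candidate, is_disallowed(candidate))
```

The group contains no Allow rules, so the longest-match tie-breaking between Allow and Disallow that RFC 9309 describes does not come into play here.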

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-user

Rule Path
Disallow /

claude-searchbot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

tiktokspider

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

cohere-training-data-crawler

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

duckassistbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

pangubot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

mistralai-user

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

chatgpt-user/2.0

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

perplexity-user

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /

youbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /
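
A crawler that finds a group for its own product token follows that group and ignores the `*` group. That is why the listed AI crawlers are blocked from the entire site, while other crawlers are only kept out of a few paths. The sketch below checks this with Python's standard `urllib.robotparser` on a shortened excerpt rebuilt from the groups above. The helper `allowed`, the agent name `ExampleBot` and the tested paths are hypothetical. The standard-library parser does not implement `*` wildcards in paths, so the excerpt includes only plain prefix rules.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt rebuilt from the groups in this report.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /api/
Disallow: /cms/

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def allowed(agent, path):
    """Ask the parser whether `agent` may fetch `path` on www.swr.de."""
    return parser.can_fetch(agent, "https://www.swr.de" + path)


print(allowed("GPTBot", "/sport/"))         # named group applies: blocked
print(allowed("ExampleBot", "/sport/"))     # falls back to the "*" group
print(allowed("ExampleBot", "/api/items"))  # blocked by the "*" group
```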

Other Records

Field Value
sitemap https://www.swr.de/~sitemap/index.xml
sitemap https://www.swr.de/~sitemap/swraktuell/index.xml
sitemap https://www.swr.de/~sitemap/sport/index.xml
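
Sitemap records are independent of the user-agent groups and apply to every crawler. As a small sketch, `RobotFileParser.site_maps()` (available in Python 3.8 and later) collects them. The text parsed here is a hypothetical excerpt built from the records above, not the full file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: one group plus the three sitemap records listed above.
SITEMAP_EXCERPT = """\
User-agent: *
Disallow: /api/

Sitemap: https://www.swr.de/~sitemap/index.xml
Sitemap: https://www.swr.de/~sitemap/swraktuell/index.xml
Sitemap: https://www.swr.de/~sitemap/sport/index.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_EXCERPT.splitlines())

# site_maps() returns a list of URLs, or None if no Sitemap record exists.
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```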

Comments

  • robots.txt for SWR.de
  • Last updated: 2024-02-08 10:15 CET
  • Disallow
  • Reservation of use for AI (see https://gitlab.ard.de/modul-12-seo/nutzungsvorbehalt/-/raw/master/robots.txt)
  • Amazon
  • Anthropic
  • Apple
  • ByteDance
  • Cohere
  • Common Crawl
  • Diffbot
  • DuckDuckGo
  • Google
  • Huawei
  • Meta
  • Mistral
  • OpenAI
  • Perplexity
  • Webz.io
  • You.com
  • Zyte
  • Sitemaps
  • Generic Google sitemap
  • News sitemap for SWR Aktuell
  • News sitemap for SWR Sport

Warnings

  • `host` is not a known field.
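
The warning means the file contains a `host` line. RFC 9309 defines only `user-agent`, `allow` and `disallow`, and `sitemap` is a widely supported extra record. `Host` is a non-standard directive historically associated with Yandex. The sketch below shows how a checker might flag such lines. The set of known fields is an assumption about the scanner, and the `Host` value in the sample is hypothetical, since the report does not show it.

```python
# Assumed set of recognised fields; the scanner's real list is not documented here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def unknown_fields(robots_txt):
    """Return field names not in KNOWN_FIELDS, in order of first appearance."""
    found = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical sample; the Host value is invented for illustration.
SAMPLE = """\
# robots.txt for SWR.de
User-agent: *
Disallow: /api/
Host: www.swr.de
Sitemap: https://www.swr.de/~sitemap/index.xml
"""

print(unknown_fields(SAMPLE))
```

Parsers that follow RFC 9309 are expected to skip lines they do not recognise, so a line like this normally has no effect on how the rest of the file is read.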