derstandard.de
robots.txt

Robots Exclusion Standard data for derstandard.de

Resource Scan

Scan Details

Site Domain derstandard.de
Base Domain derstandard.de
Scan Status Ok
Last Scan 2024-11-09T16:25:29+00:00
Next Scan 2024-11-16T16:25:29+00:00

Last Scan

Scanned 2024-11-09T16:25:29+00:00
URL https://derstandard.de/robots.txt
Redirect https://www.derstandard.de/robots.txt
Redirect Domain www.derstandard.de
Redirect Base derstandard.de
Domain IPs 194.116.243.40
Redirect IPs 23.50.89.39, 2600:1413:b000:887::32ac, 2600:1413:b000:88b::32ac
Response IP 104.69.47.49
Found Yes
Hash c9fff50916f43a9b8829a6f8147b2d46fe19b17f6f355a96135a83e8a2baf5d5
SimHash 6396d852a786

Groups

*

Rule Path
Disallow /profil/

anthropic-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

Other Records

Field Value
crawl-delay 1
sitemap https://www.derstandard.at/sitemaps/news.xml
sitemap https://www.derstandard.at/sitemaps/sitemap.xml
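
The groups and records above can be checked against a crawler's behaviour with a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is rebuilt from the scan data and is only an approximation of the real file: the ordering is guessed, and attaching crawl-delay to the * group is an assumption, since the scan lists it outside the groups. The agent name ExampleBot and the path /profil/settings are hypothetical, chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Agents the scan lists with a blanket "Disallow: /".
BLOCKED_AGENTS = [
    "anthropic-ai", "ccbot", "claudebot", "claude-web", "google-extended",
    "gptbot", "ia_archiver", "omgili", "omgilibot",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the scan's groups and records."""
    lines = [
        "User-agent: *",
        "Crawl-delay: 1",  # assumption: the scan does not tie this to a group
        "Disallow: /profil/",
        "",
    ]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += [
        "Sitemap: https://www.derstandard.at/sitemaps/news.xml",
        "Sitemap: https://www.derstandard.at/sitemaps/sitemap.xml",
    ]
    return "\n".join(lines)


def build_parser() -> RobotFileParser:
    """Parse the rebuilt text locally, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    rp = build_parser()
    base = "https://www.derstandard.de"
    # A listed AI crawler is blocked from the whole site.
    print("GPTBot /:", rp.can_fetch("GPTBot", base + "/"))
    # An unlisted (hypothetical) agent falls back to the * group.
    print("ExampleBot /:", rp.can_fetch("ExampleBot", base + "/"))
    print("ExampleBot /profil/settings:",
          rp.can_fetch("ExampleBot", base + "/profil/settings"))
    print("crawl delay for *:", rp.crawl_delay("*"))
    print("sitemaps:", rp.site_maps())  # site_maps() needs Python 3.8+
```

The parser matches agent names case-insensitively, so "GPTBot" resolves to the gptbot group. Any agent not named in a group is governed only by the * rules.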