harzkurier.de
robots.txt

Robots Exclusion Standard data for harzkurier.de

Resource Scan

Scan Details

Site Domain harzkurier.de
Base Domain harzkurier.de
Scan Status Ok
Last Scan 2024-11-11T11:16:58+00:00
Next Scan 2024-11-18T11:16:58+00:00

Last Scan

Scanned 2024-11-11T11:16:58+00:00
URL https://harzkurier.de/robots.txt
Redirect https://www.harzkurier.de:443/robots.txt
Redirect Domain www.harzkurier.de
Redirect Base harzkurier.de
Domain IPs 18.185.81.127, 18.196.221.37, 3.72.121.83
Redirect IPs 2600:9000:2721:1200:5:51df:84c0:93a1, 2600:9000:2721:2a00:5:51df:84c0:93a1, 2600:9000:2721:4800:5:51df:84c0:93a1, 2600:9000:2721:6c00:5:51df:84c0:93a1, 2600:9000:2721:c00:5:51df:84c0:93a1, 2600:9000:2721:d000:5:51df:84c0:93a1, 2600:9000:2721:e200:5:51df:84c0:93a1, 2600:9000:2721:e800:5:51df:84c0:93a1, 3.165.102.107, 3.165.102.37, 3.165.102.5, 3.165.102.96
Response IP 3.165.102.107
Found Yes
Hash 0f42d2bb101e659607eaad0d4389fe898f248a4f5dc1e4dbaaf184ab4bc13600
SimHash 5c1b9052c621

Groups

*

Rule Path
Allow /static/*/client.js
Allow /static/*/main.css
Allow /static/*/favicon.png
Disallow /stats/*
Disallow /*?config*
Disallow /*.xmli*
Disallow /*?service=Ajax
Disallow /*?service=ajax
Disallow /config/*
Disallow /test/*
Disallow /Test/*
Disallow /template/*
Disallow /*?*token=*
Disallow /*?*eventId=*
Disallow /static/*
Disallow /migration_import_no_section/*
Disallow /secure/
Disallow /socialmedia/*
Disallow *reader_id%3DREADER_ID*
Disallow /suche/*
Disallow /*?widgetid=
Disallow /newsletter-result/
Disallow *tpcc%3D*
Disallow /resources/
Disallow /bin/
Disallow /downloads/
Disallow /service/newsletter-adconsent
Disallow /pagespeed_static/
Disallow /resources/img/*icon*pagespeed
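
The Allow and Disallow rules above overlap (for example, `/static/*/client.js` is allowed while `/static/*` is disallowed). The following is a minimal sketch of how a crawler might resolve such overlaps, assuming the longest-match precedence described in RFC 9309, with Allow winning ties. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and only a subset of the rules listed above is included.

```python
import re

# Subset of the "*" group rules from the scan above (illustrative, not complete).
RULES = [
    ("allow", "/static/*/client.js"),
    ("allow", "/static/*/main.css"),
    ("disallow", "/static/*"),
    ("disallow", "/suche/*"),
    ("disallow", "/secure/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the path start.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if the path may be fetched under the given rules.

    Assumption: the longest matching pattern wins, and Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]
```

Under these assumptions, `is_allowed("/static/abc/client.js")` is true because the Allow pattern is longer than `/static/*`, while any other path under `/static/` is rejected.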

cliqzbot
baiduspider
sogou spider
baiduspider
flamingo_searchengine
seznambot
yandex

Rule Path
Disallow /

semrushbot-sa
ahrefsbot
backlinkcrawler
linkchecker
dataforseobot
deepcrawl
majestic
majestic12
mj12bot
onpagebot
optimizer
rytebot
semrushbot
semrushbot-si
seobility
seodiver
seokicks
seokicks-robot
sistrix
openindexspider
openindexspider
sistrix optimizer
sistrix
sistrix crawler
siteauditbot

Rule Path
Disallow /

amazonbot
anthropic-ai
applebot-extended
archive.org_bot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
diffbot
facebookbot
friendlycrawler
google-extended
googleother
gptbot
ia_archiver
img2dataset
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
youbot
meta-externalagent
imagesiftbot

Rule Path
Disallow /

arquivo-web-crawler
arquivo.pt
barkrowler
blexbot
browsertrix
brozzler
builtwith
cincraw
coccocbot
contao/crawler
dmbot
domainstatsbot
dotbot
dotbot
fluid
haosouspider
happywing
harsilbot
hatena antenna
heritrix
imagesiftbot
kazbtbot
kraken
linkdebot
linkfluence yak bot
mail.ru_bot
metajobbot
monsidobot
netestate
ogdwctcxcrawler
petalbot
researchbot
riddler
sentibot
rogerbot
semanticbot
semanticscholarbot
sirdatabot
spbot
special_archiver
splitsignalbot
tag-crawler
testcrawler
thinkers-bot
toplistbot
uipbot/1.0
urlsuma
user-agent
vsusearchspider
weborama-fetcher
wiseguys robot
wpbot
yeti

Rule Path
Disallow /
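
Each group above pairs a list of user-agent tokens with its rules, and the `*` group applies to any crawler not named elsewhere. A minimal sketch of that selection step follows, assuming case-insensitive substring matching of the token against the crawler's user-agent string (real crawlers may match more strictly). `BLOCKED_TOKENS` and `group_for` are hypothetical names, and the set holds only a few of the tokens listed above.

```python
# A few tokens taken from the "Disallow /" groups above (illustrative subset).
BLOCKED_TOKENS = {"yandex", "ahrefsbot", "gptbot", "claudebot", "petalbot"}


def group_for(user_agent: str) -> str:
    """Return "blocked" if a listed token occurs in the user-agent string,
    otherwise "*" to indicate that the default group applies."""
    ua = user_agent.lower()
    for token in BLOCKED_TOKENS:
        if token in ua:
            return "blocked"
    return "*"
```

A crawler that resolves to "blocked" is subject to `Disallow /`; one that resolves to `*` falls back to the path rules of the first group.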

Other Records

Field Value
sitemap https://www.harzkurier.de/sitemaps/news.xml

Comments

  • search engines
  • seo tools
  • ai tools
  • other