anzeiger.net
robots.txt

Robots Exclusion Standard data for anzeiger.net

Resource Scan

Scan Details

Site Domain anzeiger.net
Base Domain anzeiger.net
Scan Status Ok
Last Scan 2025-12-16T12:39:29+00:00
Next Scan 2025-12-23T12:39:29+00:00

Last Scan

Scanned 2025-12-16T12:39:29+00:00
URL http://anzeiger.net/robots.txt
Redirect https://www.giessener-anzeiger.de/robots.txt
Redirect Domain www.giessener-anzeiger.de
Redirect Base giessener-anzeiger.de
Domain IPs 77.235.162.48
Redirect IPs 91.234.30.204
Response IP 91.234.30.204
Found Yes
Hash c3f0fd980271e10320fea55fd23207610d8515ed96840cbcb84640117e7c7b65
SimHash bb1a0f52acad
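
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. As a minimal sketch under that assumption, a content fingerprint for change detection between scans could be computed as follows. The function name `robots_fingerprint` and the sample body are hypothetical, and the sample is not the real file, so its digest will not match the reported value.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the file that was actually scanned.
sample = b"User-agent: *\nDisallow: /suche/\n"
digest = robots_fingerprint(sample)

# A SHA-256 hex digest always has 64 characters, like the reported Hash.
print(len(digest))
```

Comparing the digest from one scan with the digest from the next is a cheap way to tell whether the file changed at all. A similarity hash such as the SimHash listed above serves a different purpose, since it is meant to stay close when the content changes only slightly.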

Groups

*

Rule Path
Disallow /sub/paywall/js/
Disallow /lightweight-ajax
Disallow /suche/
Disallow /test/
Disallow /fdn/bootstrap/
Disallow /bi/bootstrap/
Disallow /bi/doop/
Disallow /bi/dev/
Disallow /sso/
Disallow /west/assets/common/js/fallback.js

amazonbot
alibababot
ai2bot-dolma
applebot-extended
bytespider
ccbot
chatglm-spider
claudebot
claude-web
cloudvertexbot
cohere-training-data-crawler
cotoyogi
datenbank crawler
diffbot
facebookbot
google-extended
googleother
gptbot
icc-crawler
imagespider
kangaroo bot
laion-huggingface-processor
lcc
meta-externalagent
netestate imprint crawler
omgili
pangubot
perplexitybot
sbintuitionsbot
spider
timpibot
velenpublicwebcrawler
webzio-extended
youbot

Rule Path
Allow /ueber-uns/
Disallow /
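
The group above pairs `Allow /ueber-uns/` with `Disallow /`, so the listed agents may fetch only the /ueber-uns/ section, while the `*` group blocks just a handful of paths for everyone else. The following sketch illustrates that behaviour with Python's standard `urllib.robotparser`. The robots.txt text is an abbreviated reconstruction from this report, not the original file. The path `/lokales/`, the agent name `ExampleBrowser`, and the helper `allowed` are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of two groups from the report; the real file
# lists many more user agents and paths.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: claudebot
Allow: /ueber-uns/
Disallow: /

User-agent: *
Disallow: /suche/
Disallow: /sso/
"""

BASE = "https://www.giessener-anzeiger.de"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


print(allowed("GPTBot", "/ueber-uns/"))       # listed agent, allowed section
print(allowed("GPTBot", "/lokales/"))         # listed agent, everything else
print(allowed("ExampleBrowser", "/suche/"))   # falls back to the * group
print(allowed("ExampleBrowser", "/lokales/")) # no rule matches, so allowed
```

`urllib.robotparser` applies the first matching rule within a group, whereas RFC 9309 specifies that the longest matching path wins. Because `Allow /ueber-uns/` is listed before `Disallow /` here, both approaches give the same answer for this group.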

dataprovider.com
dcrawl
helloworkjobpostingbot
httrack
httrack 3.0
krawlerbot
metainspector
metajobbot
newspaper
nutch
offline explorer
openindexspider
potions
scrapy
serverhunterspider
statsdronebot

Rule Path
Disallow /

aa
ahrefsbot
attracta
barkrowler
brightedge crawler
blexbot
caliberbot
caliperbot
claritybot
clearscopebot
cloudtrellis
cludo.com
cocolyzebot
cognitiveseo crawler
contentking
convermax
cxense
dataforseobot
dataforseo bot
domainstatsbot
dotbot
dragonbot
huckabot
huckabuy bot
hypestat
iaskspider
img2dataset
james bot
linkcheck by siteimprove
linkchecker bot
linkchecker.pro
linkdexbot
linksindexerbot
marketgoo
mbcrawler
megaindex.ru
mj12bot
moz dotbot
moz rogerbot
nitro-
nitrobot
online-webceo-bot
prerender
pro sitemaps
readable
revvimgort
rogerbot-crawler
rsiteauditor
searchatlas bot
sebot-wa
searchmetricsbot
semrushbot-ba
semrushbot-ct
semrushbot-ocob
semrushbot-sa
semrushbot-si
semrushbotbacklinks
senutobot
sidetrade indexer bot
seo-audit-check-bot
seo4ajax
seo4ajax.com
seobilitybot
seokicks
seolizer
serankingbacklinksbot
serpstatbot
siteauditbot
sitebulb
sitecheck-sitecrawl
sitecheckerbotcrawler
siteimprove crawl
statusnestbacklinkspider
seekport
seekportbot
sistrix
woorankreview
xovi
xovionpagecrawler
zoombot

Rule Path
Disallow /

a360-search
activecomply
aihitbot
anderspinkbot
archivebot
automattic analytics crawler
awario
awariobot
awariosmartbot
bigupdatabot
bitsightbot
blackboard
bomborabot
brightedge bot
buck
channable
checkmarknetwork
cincraw
clickagy intelligence bot v2
contextualbot
contxbot
cxensebot
ds9
ecovadissustainabilitybot
epivozcrawler
ev-crawler
ezoicbot-nicheiq
factset_spyderbot
fdl stats bot
hubspot
innguma
lightspeedsystemscrawler
linkfluence
linkwalker
macrobondbot
magpie-crawler
medialogiabot
mediamonitoringbot
mediatoolkitbot
mediavine medatada parser
missinglettr bot
mixrankbot
muckrack
netcraft
netestate ne crawler
netseer crawler
netvibes
owler
page-preview-tool
pandalytics
panscient.com
parse.ly scraper
sentibot
slickbot
slickstream
smtbot
trendictionbot
trendsmapresolver
ttd-content
turnitinbot
tweetmemebot
twingly
twingly recon-sjostrom
um-fc
um-ic
um-ln
webspidermount
webzio
yadirectfetcher
yak
yext inc
yextbot
zoominfobot
imagesiftbot

Rule Path
Allow /ueber-uns/
Disallow /

archive.org_bot
arquivo-web-crawler
authory
bl.uk_lddc_bot
bne.es_bot
bnf.fr_bot
heritrix
ia_archiver
ia_archiver-web.archive.org
iabot
internet archive
mirrorweb
netarkivindsamling
nicecrawler
special_archiver
turnitin
xy-archive-compliance

Rule Path
Allow /ueber-uns/
Disallow /

anthropic-ai
awariorssbot
awariosmartbot
baidu-yunguance
claude-web
cohere-ai
etaospider
omgilibot
opebot-v
peer39_crawler
peer39_crawler/1.0
keydrop.io
censysinspect
xai
grok
grokbot
grokai
websauger
webwhacker
webzip
webcapture
webcapture 2.0
teleport
teleportpro
sitesnagger
vorebot
winhttrack

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.giessener-anzeiger.de/news.xml

Comments

  • robots.txt www.giessener-anzeiger.de
  • Legal notice: www.giessener-anzeiger.de expressly reserves the right to use its content for commercial text and data mining (§ 44b UrhG).
  • The use of robots or other automated means to access www.giessener-anzeiger.de or collect or mine data without the express permission of www.giessener-anzeiger.de is strictly prohibited.
  • AI
  • scraper
  • seo tools
  • Intelligence Gatherer
  • Archiver
  • undocumented / uncategorized

Warnings

  • 1 invalid line.