/.well-known/

Robots Exclusion Standard data for crn.de

Resource Scan

Scan Details

Site Domain	crn.de
Base Domain	crn.de
Scan Status	Ok
Last Scan	2024-09-27T19:52:00+00:00
Next Scan	2024-10-04T19:52:00+00:00
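
The two timestamps above are exactly seven days apart, which suggests a weekly rescan. The page does not state the interval, so the seven-day value and the `next_scan` helper in this sketch are assumptions used only to show the arithmetic:

```python
from datetime import datetime, timedelta

# Assumption: the scanner re-checks weekly. The page only lists the two
# timestamps; the interval is inferred from their difference.
SCAN_INTERVAL = timedelta(days=7)

def next_scan(last_scan_iso: str, interval: timedelta = SCAN_INTERVAL) -> str:
    """Add the interval to an ISO 8601 timestamp and return ISO 8601."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + interval).isoformat()

print(next_scan("2024-09-27T19:52:00+00:00"))
```

For the Last Scan value shown, this yields the listed Next Scan value.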

Last Scan

Scanned	2024-09-27T19:52:00+00:00
URL	https://crn.de/robots.txt
Redirect	https://www.crn.de/robots.txt
Redirect Domain	www.crn.de
Redirect Base	crn.de
Domain IPs	104.18.2.46, 104.18.3.46, 2606:4700::6812:22e, 2606:4700::6812:32e
Redirect IPs	104.18.2.46, 104.18.3.46, 2606:4700::6812:22e, 2606:4700::6812:32e
Response IP	104.18.3.46
Found	Yes
Hash	34bd9552f5f7f9ea95ec7f69dfaadedff1b9844dc26ece9a92b3c98449a7e202
SimHash	2156f252c831
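
The Hash value is 64 hexadecimal characters, which is the length of a SHA-256 digest. The page does not name the algorithm or say which bytes are hashed, so the following is only a sketch, assuming SHA-256 over the raw response body. The helper names are hypothetical:

```python
import hashlib

def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()

def matches_recorded_hash(body: bytes, recorded: str) -> bool:
    """Compare a freshly fetched body against a recorded hex digest.

    Assumption: the recorded value is SHA-256 of the unmodified body.
    """
    return sha256_hex(body) == recorded.strip().lower()
```

A mismatch would mean either that the file changed since the scan or that the scanner hashes something other than the raw body.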

Groups

*
Rule	Path
Allow	/

msnbot
Rule	Path
Allow	/

slurp
Rule	Path
Allow	/

teoma
Rule	Path
Allow	/

gigabot
Rule	Path
Allow	/

robozilla
Rule	Path
Allow	/

nutch
Rule	Path
Allow	/

ia_archiver
Rule	Path
Allow	/

baiduspider
Rule	Path
Allow	/

naverbot
Rule	Path
Allow	/

yeti
Rule	Path
Allow	/

yahoo-mmcrawler
Rule	Path
Allow	/

psbot
Rule	Path
Allow	/

yahoo-blogs/v3.9
Rule	Path
Allow	/
Allow	/cgi-bin/

ahrefsbot
compspybot
crystalsemanticsbot
curious george
cybeye.com
daumoa
docomo
exb language crawler
ezooms
flamingo_searchengine
genieo
genio
gsa-crawler
lexxebot
libcrawl
linkdex
lwnutch
magpie-crawler
meltwater
mnogosearch
omgilibot/0.3
openwebindex
psbot
rediffnewsbot
repparser
scanmine
seoengworldbot
shopwiki
showyoubot
sindice-site-manager
sogou
sogou spider
sosospider
webvac
wocbot
woriobot
yacybot
yeti
yolinkbot_text
youdaobot

Rule	Path
Disallow	/
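
The groups listed above can be checked with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text here is a hypothetical reconstruction of a few of the groups, not the actual file served at https://www.crn.de/robots.txt, and `build_parser` is an illustrative helper:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of a few groups from the scan; the real
# file may differ in order, spelling, and detail.
ROBOTS_TXT = """\
User-agent: *
Allow: /

User-agent: msnbot
Allow: /

User-agent: ahrefsbot
User-agent: sogou
User-agent: yacybot
Disallow: /
"""

def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)
for agent in ("ahrefsbot", "msnbot", "SomeOtherBot"):
    # can_fetch returns True when the agent may crawl the given URL.
    print(agent, parser.can_fetch(agent, "https://www.crn.de/"))
```

With this reconstruction, `ahrefsbot` should be refused while `msnbot` and any unlisted agent fall under an Allow rule. Note that `psbot` and `yeti` appear both in an Allow group and in the Disallow list; `urllib.robotparser` applies the first group whose user-agent matches, and other parsers may resolve such duplicates differently.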

Comments

Sitemap declarations
Fully exclude these robots from crawling anything
