blogtruyen.net
robots.txt

Robots Exclusion Standard data for blogtruyen.net

Resource Scan

Scan Details

Site Domain	blogtruyen.net
Base Domain	blogtruyen.net
Scan Status	Ok
Last Scan	2024-11-12T12:37:34+00:00
Next Scan	2024-11-19T12:37:34+00:00
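
The two timestamps above imply a weekly rescan. A minimal sketch that checks the interval, assuming only the two ISO 8601 values shown in the scan details (the variable names are illustrative):

```python
from datetime import datetime, timedelta

# Timestamps copied from the "Last Scan" and "Next Scan" fields.
last_scan = datetime.fromisoformat("2024-11-12T12:37:34+00:00")
next_scan = datetime.fromisoformat("2024-11-19T12:37:34+00:00")

# Difference between the two timezone-aware datetimes.
interval = next_scan - last_scan
print(interval)  # 7 days, 0:00:00
```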

Last Scan

Scanned	2024-11-12T12:37:34+00:00
URL	https://blogtruyen.net/robots.txt
Domain IPs	104.21.93.176, 172.67.213.73, 2606:4700:3031::6815:5db0, 2606:4700:3033::ac43:d549
Response IP	172.67.213.73
Found	Yes
Hash	fe9c13cb78e77d3cb989cf1f27c998cd8af752475dc9ea3e34ea06fd12cb26fe
SimHash	320f64437062

Groups

*

Rule	Path
Allow	/

sentibot

Rule	Path
Disallow	/

megaindex.ru/2.0

Rule	Path
Disallow	/

megaindex.ru

Rule	Path
Disallow	/

advbot

Rule	Path
Disallow	/

xovibot

Rule	Path
Disallow	/

publiclibraryarchive.org

Rule	Path
Disallow	/

memorybot

Rule	Path
Disallow	/

smtbot

Rule	Path
Disallow	/

xovibot

Rule	Path
Disallow	/

abonti

Rule	Path
Disallow	/

meanpathbot

Rule	Path
Disallow	/

searchmetricsbot

Rule	Path
Disallow	/

panscient.com

Rule	Path
Disallow	/

istellabot

Rule	Path
Disallow	/

easouspider

Rule	Path
Disallow	/

aboundexbot

Rule	Path
Disallow	/

mixbot

Rule	Path
Disallow	/

easouspider

Rule	Path
Disallow	/

sogou spider

Rule	Path
Disallow	/

bubing

Rule	Path
Disallow	/

linkpadbot

Rule	Path
Disallow	/

aboundexbot

Rule	Path
Disallow	/

heritrix

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

wbsearchbot

Rule	Path
Disallow	/

screenerbot

Rule	Path
Disallow	/

unisterbot

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

bpimagewalker/2.0

Rule	Path
Disallow	/

lipperhey

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

wotbox

Rule	Path
Disallow	/

siteexplorer

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

netestate ne crawler

Rule	Path
Disallow	/

feedbooster

Rule	Path
Disallow	/

nutch

Rule	Path
Disallow	/

mail.ru

Rule	Path
Disallow	/

ezooms

Rule	Path
Disallow	/

spbot

Rule	Path
Disallow	/

sistrix

Rule	Path
Disallow	/

exb language crawler

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	

ahrefsbot

Rule	Path
Disallow	/doubleclick/
Disallow	/eyeblaster/
Disallow	/tim-kiem/
Disallow	/404/
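
To see how groups like these are evaluated, the sketch below rebuilds a small excerpt of the file from the tables above (the `*`, `sentibot`, and `ahrefsbot` groups as the scan lists them) and queries it with Python's standard `urllib.robotparser`. The excerpt is a reconstruction from this report, not the verbatim file, and `ExampleBot` and the helper `allowed` are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Excerpt reconstructed from the scan's group tables (not the verbatim file).
ROBOTS_EXCERPT = """\
User-agent: *
Allow: /

User-agent: sentibot
Disallow: /

User-agent: ahrefsbot
Disallow: /doubleclick/
Disallow: /eyeblaster/
Disallow: /tim-kiem/
Disallow: /404/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the excerpt above."""
    return parser.can_fetch(agent, "https://blogtruyen.net" + path)


print(allowed("sentibot", "/"))                # False: whole site disallowed
print(allowed("ahrefsbot", "/tim-kiem/abc"))   # False: matches /tim-kiem/
print(allowed("ahrefsbot", "/"))               # True: no rule in its group matches
print(allowed("ExampleBot", "/tim-kiem/abc"))  # True: falls back to the * group
```

Note that `urllib.robotparser` applies the first matching rule within a group, so other crawlers that use longest-match precedence may resolve more complex files differently.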

Other Records

Field	Value
sitemap	https://blogtruyen.net/sitemap.xml

Comments

2015.06.27 crawler for SentiOne
2015.04.06 SEO indexer
2015.02.10 AdvBot "classify web content"
2015.01.30 XoviBot SEO bot
2015.02.19 ??? parked domain
2014.12.26. Internet Memory Research
2014.09.26. SimilarTech, Lead Generation, Competitive Intelligence based on Web Tech Analysis
2014.09.26. XOVI Suite, SEO & Online Marketing Tool
2014.09.18. WebSearch
2014.09.11. The web search API
entries without date
SEO services
panscient.com
tiscali.it search bot
search engine
search engine
Mixdata : data for big business
chinese search engine
chinese search engine
scalable, fully distributed crawler
??? search engine
search engine
the Internet Archive's open-source, extensible, scalable, archival-quality Web crawler
free backlink checker by Torsten Rückert Internetdienstleistungen
part of Ware Bay Best Buys Search engine
Web crawler
analyses the structure of the WWW
search engine
seo
brand protection
seo
seo
search engine
seo
plagiarism check
search engine www.sengine.info
news
Apache Nutch based
news portal
seo moz
seo
seo
language
