lj.rossia.org
robots.txt

Robots Exclusion Standard data for lj.rossia.org

Resource Scan

Scan Details

Site Domain	lj.rossia.org
Base Domain	rossia.org
Scan Status	Ok
Last Scan	2025-12-02T01:22:23+00:00
Next Scan	2026-01-01T01:22:23+00:00

Last Scan

Scanned	2025-12-02T01:22:23+00:00
URL	https://lj.rossia.org/robots.txt
Redirect	http://lj.rossia.org/robots.txt
Domain IPs	163.172.215.104
Response IP	163.172.215.104
Found	Yes
Hash	f7e971cef21ac00d52f1a3cb47303e4725d9746237820b0269d5e77b9522f4e2
SimHash	e57db8508fb1

Groups

*

Rule	Path
Disallow	/directory
Disallow	/interests
Disallow	/tools/tell
Disallow	/tools/memadd
Disallow	/tools/search.bml
Disallow	/friends/
Disallow	/interface/
Disallow	/translate/
Disallow	/comments/
Disallow	/numreplies/
Disallow	/users/imp_
Disallow	/userinfo.bml?user=imp_
Disallow	/talk
Disallow	/stats/stats.txt
Disallow	/create
Disallow	/update

Other Records

Field	Value
crawl-delay	1
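
To illustrate how a crawler might apply the records above, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the scan tables rather than copied from the live file, and placing `Crawl-delay` inside the `*` group is an assumption (the scan lists it separately under Other Records). The `allowed` helper and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Paths taken from the Groups table of the scan.
DISALLOWED = [
    "/directory", "/interests", "/tools/tell", "/tools/memadd",
    "/tools/search.bml", "/friends/", "/interface/", "/translate/",
    "/comments/", "/numreplies/", "/users/imp_",
    "/userinfo.bml?user=imp_", "/talk", "/stats/stats.txt",
    "/create", "/update",
]

# Reconstruction, not the verbatim file. Assumption: the crawl-delay
# record applies to the "*" group.
ROBOTS_TXT = "\n".join(
    ["User-agent: *"]
    + [f"Disallow: {path}" for path in DISALLOWED]
    + ["Crawl-delay: 1"]
)

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="*"):
    """Return True if `agent` may fetch `path` on lj.rossia.org."""
    return parser.can_fetch(agent, "http://lj.rossia.org" + path)


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise the rules.
    for path in ("/users/example/", "/friends/", "/users/imp_feed/"):
        print(path, allowed(path))
    print("crawl delay (seconds):", parser.crawl_delay("*"))
```

Disallow rules match by path prefix, so `/talk` also covers any path that merely begins with `/talk`. A polite crawler would additionally wait the reported crawl delay between requests.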

Comments

Blocked journals aren't listed here because robots.txt files
can't be above 50k or so, depending on the spider.
Instead, blocked journals have HTML inserted in them which
should prevent behaved spiders from indexing it.
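
The comment does not say which HTML is inserted into blocked journals. A common mechanism is a `<meta name="robots">` tag carrying `noindex`; assuming that is what is meant, a well-behaved spider could check a fetched page roughly as follows. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def is_indexable(html):
    """Return False if the page asks robots not to index it."""
    meta = RobotsMetaParser()
    meta.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return not ({"noindex", "none"} & meta.directives)
```

This check runs per page after fetching, so it complements robots.txt rather than replacing it: robots.txt controls what may be fetched, while the page-level tag controls what may be indexed.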

Warnings

`host` is not a known field.
