trscans.site
robots.txt

Robots Exclusion Standard data for trscans.site

Resource Scan

Scan Details

Site Domain	trscans.site
Base Domain	trscans.site
Scan Status	Ok
Last Scan	2025-10-21T12:55:21+00:00
Next Scan	2025-10-28T12:55:21+00:00

Last Scan

Scanned	2025-10-21T12:55:21+00:00
URL	https://trscans.site/robots.txt
Domain IPs	104.21.34.44, 172.67.197.233, 2606:4700:3030::6815:222c, 2606:4700:3030::ac43:c5e9
Response IP	104.21.34.44
Found	Yes
Hash	5335c10e02a5fd01bee3439197c4ff0c8d78ff727097abdd387db4a5e544f4da
SimHash	44354a41e4b0
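
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The following is a minimal sketch for checking a fetched copy of the file against it, assuming (this is not stated in the report) that the scanner hashes the raw response body with SHA-256. The function name `body_matches` is hypothetical.

```python
import hashlib

# Hash value copied from the "Last Scan" table in this report.
REPORTED_HASH = "5335c10e02a5fd01bee3439197c4ff0c8d78ff727097abdd387db4a5e544f4da"

def body_matches(body: bytes, expected_hex: str = REPORTED_HASH) -> bool:
    """Return True if the SHA-256 hex digest of `body` equals `expected_hex`.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    """
    return hashlib.sha256(body).hexdigest() == expected_hex.lower()
```

A mismatch would only indicate that the file changed since the scan or that the scanner uses a different hashing scheme.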

Groups

*

Rule	Path
Allow	/

amazonbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

meta-externalagent

Rule	Path
Disallow	/

*

Rule	Path
Allow	/
Allow	/novel/
Allow	/novel/*/chapter/
Allow	/search
Allow	/genre/
Allow	/tag/
Allow	/assets/
Allow	/images/
Allow	/css/
Allow	/js/
Allow	/fonts/
Allow	/favicon.ico
Allow	/favicon/
Disallow	/admin/
Disallow	/api/admin/
Disallow	/private/
Disallow	/_internal/
Disallow	/*?utm_*
Disallow	/*?ref=*
Disallow	/*?source=*
Disallow	/*?from=*
Disallow	/ads/
Disallow	/popups/
Disallow	/banners/
Disallow	/sponsored/
Disallow	/novel.html
Disallow	/chapter.html
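
As an illustration of how a crawler might evaluate the rules in this group, here is a minimal sketch of longest-match evaluation in the style of RFC 9309: the matching rule with the longest pattern wins, and Allow wins a tie. Only a subset of the rules is included, the wildcard form `/*?utm_*` is assumed for the tracking-parameter rule, and the helper names `pattern_to_regex` and `is_allowed` are hypothetical.

```python
import re

# A subset of the rules listed for the `*` group in this report.
RULES = [
    ("allow", "/"),
    ("allow", "/novel/*/chapter/"),
    ("disallow", "/admin/"),
    ("disallow", "/*?utm_*"),
    ("disallow", "/novel.html"),
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    `*` matches any run of characters; a trailing `$` anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(path: str, rules=RULES) -> bool:
    """Apply the longest matching rule; Allow wins when lengths tie."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # With no matching rule at all, the path is allowed by default.
    return True if best is None else best[1]
```

For example, `is_allowed("/novel/abc/chapter/1")` returns `True`, while `is_allowed("/admin/users")` and `is_allowed("/?utm_source=x")` return `False`. Python's standard `urllib.robotparser` does not interpret `*` wildcards in paths, which is why the matching is written out by hand here.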

googlebot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	1

bingbot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	2

yandex

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	2

duckduckbot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	1

baiduspider

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	3

facebookexternalhit

Rule	Path
Allow	/

twitterbot

Rule	Path
Allow	/

linkedinbot

Rule	Path
Allow	/

whatsapp

Rule	Path
Allow	/

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	30

semrushbot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	30
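
The crawl-delay records in the groups above can be read programmatically. The sketch below uses Python's standard `urllib.robotparser` on a small hand-written excerpt (two of the groups from this report, reconstructed as robots.txt lines, which is an assumption about the original file's layout) and spaces out requests by the declared delay. The `schedule` helper is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Excerpt reconstructed from the bingbot and ahrefsbot groups in this report.
ROBOTS_LINES = """\
User-agent: bingbot
Allow: /
Crawl-delay: 2

User-agent: ahrefsbot
Crawl-delay: 30
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

def schedule(user_agent, paths, start=0.0):
    """Pair each path with the earliest time offset (seconds) it may be fetched.

    Falls back to no delay when the group declares no crawl-delay.
    """
    delay = parser.crawl_delay(user_agent) or 0
    return [(start + i * delay, path) for i, path in enumerate(paths)]
```

With this excerpt, `parser.crawl_delay("bingbot")` returns `2` and `parser.crawl_delay("ahrefsbot")` returns `30`. Note that `crawl-delay` is not part of RFC 9309 and some crawlers ignore it.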

mj12bot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://trscans.site/sitemap.xml
sitemap	https://trscans.site/bot/sitemap.xml
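
The two sitemap records can be extracted with the standard library as well. This is a minimal sketch assuming Python 3.8 or later (where `RobotFileParser.site_maps()` is available) and a hand-written excerpt of the file rather than a live fetch.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
# Excerpt containing only the sitemap records shown in this report.
parser.parse([
    "User-agent: *",
    "Allow: /",
    "Sitemap: https://trscans.site/sitemap.xml",
    "Sitemap: https://trscans.site/bot/sitemap.xml",
])

# site_maps() returns the listed URLs in file order, or None if there are none.
sitemaps = parser.site_maps()
```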

Comments

As a condition of accessing this website, you agree to abide by the following
content signals:
(a) If a content-signal = yes, you may collect content for the corresponding
use.
(b) If a content-signal = no, you may not collect content for the
corresponding use.
(c) If the website operator does not include a content signal for a
corresponding use, the website operator neither grants nor restricts
permission via content signal with respect to the corresponding use.
The content signals and their meanings are:
search: building a search index and providing search results (e.g., returning
hyperlinks and short excerpts from your website's contents). Search does not
include providing AI-generated search summaries.
ai-input: inputting content into one or more AI models (e.g., retrieval
augmented generation, grounding, or other real-time taking of content for
generative AI search answers).
ai-train: training or fine-tuning AI models.
ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
BEGIN Cloudflare Managed content
END Cloudflare Managed Content
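
The three cases (a) to (c) described in the comments above can be sketched as a small parser. The report does not show the actual `content-signal` value used by this site, so the example value `search=yes, ai-train=no` and the comma-separated `name=yes|no` syntax are assumptions, and both function names are hypothetical.

```python
def parse_content_signal(value: str) -> dict:
    """Parse a value such as 'search=yes, ai-train=no' into {name: bool}.

    Items that are not of the form name=yes or name=no are skipped.
    """
    signals = {}
    for item in value.split(","):
        name, sep, setting = item.strip().partition("=")
        setting = setting.strip().lower()
        if sep and setting in ("yes", "no"):
            signals[name.strip().lower()] = setting == "yes"
    return signals

def may_collect(signals: dict, use: str):
    """True for case (a), False for case (b), None for case (c)."""
    return signals.get(use)

# Hypothetical value; not taken from the scanned file.
example = parse_content_signal("search=yes, ai-train=no")
```

Here `may_collect(example, "search")` is `True`, `may_collect(example, "ai-train")` is `False`, and `may_collect(example, "ai-input")` is `None`, meaning the operator neither grants nor restricts that use via content signal.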
Optimized robots.txt for TRSCANS - Light Novel Website
Allow all crawlers to access main content
Explicitly allow important pages for SEO
Allow static assets
Block admin and sensitive areas
Block duplicate content patterns
Block ad-related paths
Block old legacy URLs to prevent duplicate content
Special rules for specific bots
Allow social media bots for better sharing
Block aggressive crawlers
Sitemaps
Host declaration for clarity

Warnings

`content-signal` is not a known field.
`host` is not a known field.
