neglectedbooks.com
robots.txt

Robots Exclusion Standard data for neglectedbooks.com

Resource Scan

Scan Details

Site Domain	neglectedbooks.com
Base Domain	neglectedbooks.com
Scan Status	Ok
Last Scan	2025-09-05T18:55:22+00:00
Next Scan	2025-10-05T18:55:22+00:00

Last Scan

Scanned	2025-09-05T18:55:22+00:00
URL	https://neglectedbooks.com/robots.txt
Domain IPs	104.21.40.51, 172.67.176.116
Response IP	172.67.176.116
Found	Yes
Hash	2c69f9b6ba7c21f7a356c9919018f099c6132810135a6714e329ad539bbbd848
SimHash	29e0def2fbc1
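The Hash value above is 64 hexadecimal characters, which matches the length of a SHA-256 digest. Assuming it is a SHA-256 of the raw robots.txt response body (the scan does not say this), a minimal sketch of detecting a change between scans could look like the following. The names `body_digest` and `has_changed` are hypothetical helpers, not part of any scanner.

```python
import hashlib

# Hash recorded in the scan above. Treating it as SHA-256 of the raw
# response body is an assumption; the scan output does not name the algorithm.
RECORDED_HASH = "2c69f9b6ba7c21f7a356c9919018f099c6132810135a6714e329ad539bbbd848"


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Report whether a freshly fetched body differs from the recorded hash."""
    return body_digest(body) != recorded.lower()
```

An exact hash only shows whether the file changed at all. The shorter SimHash field is presumably a similarity fingerprint for near-duplicate detection, which this sketch does not attempt to reproduce.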

Groups

becomebot

Rule	Path
Disallow	/cgi-bin/

sbider

Rule	Path
Disallow	/

nutch

Rule	Path
Disallow	/

wisenutbot

Rule	Path
Disallow	/

jambot

Rule	Path
Disallow	/

nutchcvs

Rule	Path
Disallow	/

psbot

Rule	Path
Disallow	/

htdig

Rule	Path
Disallow	/

example

Rule	Path
Disallow	/

findlinks

Rule	Path
Disallow	/

larbin_2.6.3

Rule	Path
Disallow	/

yahoo-mmcrawler

Rule	Path
Disallow	/

syntryx ant scout chassis pheromone

Rule	Path
Disallow	/

isc systems irc search

Rule	Path
Disallow	/

yahoo-mmcrawler

Rule	Path
Disallow	/

sphere scout

Rule	Path
Disallow	/

irlbot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	(empty)

Other Records

Field	Value
crawl-delay	1

Comments

Wildcard robots.txt agent (ask all bots to crawl slowly but still index everything).
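To illustrate how a crawler might interpret these groups, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is a reconstruction from the scan tables above, not the original file: it includes only three of the groups, and placing `Crawl-delay: 1` inside the `*` group is an assumption based on the comment. `SomeOtherBot` is a hypothetical user agent used only to exercise the wildcard group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (assumption): three of the groups listed above.
# The real file may order or format its records differently.
ROBOTS_TXT = """\
User-agent: becomebot
Disallow: /cgi-bin/

User-agent: psbot
Disallow: /

User-agent: *
Crawl-delay: 1
Disallow:
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
site = "https://neglectedbooks.com"

# psbot is blocked from the whole site.
print(parser.can_fetch("psbot", site + "/"))
# becomebot is blocked only under /cgi-bin/.
print(parser.can_fetch("becomebot", site + "/cgi-bin/test"))
print(parser.can_fetch("becomebot", site + "/"))
# Any other agent falls through to the wildcard group: an empty
# Disallow allows everything, with a requested delay between requests.
print(parser.can_fetch("SomeOtherBot", site + "/any/page"))
print(parser.crawl_delay("SomeOtherBot"))
```

An empty `Disallow` value is how the wildcard group expresses "nothing is off limits", which agrees with the comment's intent to slow bots down while still allowing full indexing.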
