databases.lovd.nl
robots.txt

Robots Exclusion Standard data for databases.lovd.nl

Resource Scan

Scan Details

Site Domain databases.lovd.nl
Base Domain lovd.nl
Scan Status Ok
Last Scan 2025-10-15T11:28:49+00:00
Next Scan 2025-11-14T11:28:49+00:00

Last Scan

Scanned 2025-10-15T11:28:49+00:00
URL https://databases.lovd.nl/robots.txt
Domain IPs 145.88.210.19
Response IP 145.88.210.19
Found Yes
Hash c728ab4a01121dcf98bd431e255edf8451a95c5e86ce99da44d30e2a9fe7a854
SimHash aa1e51510272
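
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched robots.txt body (an assumption; the scanner's exact hashing scheme is not stated here). A minimal Python sketch, under that assumption, to reproduce such a digest:

    import hashlib
    import urllib.request

    # Fetch the same URL recorded in the scan and hash the raw response body.
    with urllib.request.urlopen("https://databases.lovd.nl/robots.txt") as resp:
        body = resp.read()

    # If the file is unchanged since the last scan, and SHA-256 over the raw
    # body is indeed what the scanner records, this should print
    # c728ab4a01121dcf98bd431e255edf8451a95c5e86ce99da44d30e2a9fe7a854.
    print(hashlib.sha256(body).hexdigest())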

Groups

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

owler

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

awariorssbot
awariosmartbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

terracotta

Rule Path
Disallow /
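
Taken together, the groups above block each named crawler from every path, leave all other agents unrestricted with a 5-second crawl delay, and give the two Awario bots a slower 10-second delay. Below is a minimal sketch of how these rules evaluate, using Python's urllib.robotparser against a reconstruction of the directives (in-file ordering and comments may differ from the live file; the example URL path and the unlisted agent name are arbitrary illustrations):

    from urllib.robotparser import RobotFileParser

    # The eight SEO/archival crawlers plus terracotta are blocked outright,
    # reconstructed here from the groups listed above.
    blocked = ["mj12bot", "semrushbot", "petalbot", "turnitinbot",
               "ccbot", "owler", "amazonbot", "blexbot", "terracotta"]
    lines = []
    for bot in blocked:
        lines += [f"User-agent: {bot}", "Disallow: /", ""]
    # Everyone else: allowed, but asked to wait 5 seconds between requests.
    lines += ["User-agent: *", "Crawl-delay: 5", ""]
    # The two Awario bots share a 10-second delay of their own.
    lines += ["User-agent: awariorssbot", "User-agent: awariosmartbot",
              "Crawl-delay: 10", ""]

    rp = RobotFileParser()
    rp.parse(lines)

    print(rp.can_fetch("mj12bot", "https://databases.lovd.nl/"))     # False
    print(rp.can_fetch("examplebot", "https://databases.lovd.nl/"))  # True
    print(rp.crawl_delay("examplebot"))      # 5 (falls through to '*')
    print(rp.crawl_delay("awariosmartbot"))  # 10

Agents not listed by name fall through to the '*' group, which here imposes only the crawl delay.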

Comments

  • Because it causes HTTP 406 errors everywhere.
  • Because it causes HTTP 406 errors everywhere.
  • This bot reads, but ignores the robots.txt file.
  • Because it's being an idiot and it ignores the BASE HREF tag.
  • Has no use crawling our site but causes screen scraping warnings.
  • Nope. Just nope. Downloads everything and then lets others use it without restriction.
  • Buggy. Doesn't understand what a BASE HREF is.
  • Doesn't support crawl-delay. OK, leave us alone, then.
  • Repeats requests to the same pages and downloads lots of variant data that isn't useful for the purpose of the bot.
  • Slow down, boys.
  • Slow these down even more, since they don't follow the 'User-agent: *' rule.
  • Annoying crawler; sends immense numbers of HEAD requests using a UA other than its own, which annoys my scraping detection scripts.