robhandgraaf.nl
robots.txt

Robots Exclusion Standard data for robhandgraaf.nl

Resource Scan

Scan Details

Site Domain	robhandgraaf.nl
Base Domain	robhandgraaf.nl
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't connect to server.
Last Scan	2024-03-01T08:47:29+00:00
Next Scan	2024-05-30T08:47:29+00:00

Last Successful Scan

Scanned	2021-07-20T03:19:50+00:00
URL	https://robhandgraaf.nl/robots.txt
Redirect	https://www.robhandgraaf.nl/robots.txt
Redirect Domain	www.robhandgraaf.nl
Redirect Base	robhandgraaf.nl
Found	Yes
Hash	ba87b3d96fca524572f2a2b6624ca39d81d272617163002b46e89ee833c036a1
SimHash	4147214b65b2

Groups

semrushbot

Rule	Path
Disallow	/

semrushbot-sa

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

velenpublicwebcrawler

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

sogou spider

Rule	Path
Disallow	/

youdaobot

Rule	Path
Disallow	/

yandex

Rule	Path
Disallow	/

adsbot-google

Rule	Path
Disallow	/js/

alphaseobot

Rule	Path
Disallow	/

siteexplorer

Rule	Path
Disallow	/

sitesucker

Rule	Path
Disallow	/

openindexspider

Rule	Path
Disallow	/

booglebot

Rule	Path
Disallow	/

backlinkcrawler

Rule	Path
Disallow	/

zoominfobot

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

netestate ne crawler (+http://www.website-datenbank.de/)

Rule	Path
Disallow	/

zoominfobot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

hubspot crawler

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

mail.ru_bot

Rule	Path
Disallow	/

mail.ru

Rule	Path
Disallow	/

serpstatbot

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

megaindex.ru

Rule	Path
Disallow	/

megaindex.com

Rule	Path
Disallow	/

bingbot
*

Rule	Path
Allow	/
Disallow	/about/
Disallow	/products/
Disallow	/projects/
Disallow	/contact/
Disallow	/about/
Disallow	/contact.html
Disallow	/about.html
Disallow	/contact-us.html
Disallow	/service.html
Disallow	/service/
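
The bingbot / * group above mixes a blanket Allow with more specific Disallow paths, so the outcome depends on how a crawler resolves overlapping rules. The sketch below assumes the longest-match convention described in RFC 9309 (the longest matching path wins, and Allow wins a tie) with plain prefix matching only. The names RULES and is_allowed are hypothetical and not part of the scan. A parser that instead applies the first matching rule in file order would let the leading Allow / match every path.

```python
# Rules copied from the "bingbot / *" group above, in file order.
RULES = [
    ("allow", "/"),
    ("disallow", "/about/"),
    ("disallow", "/products/"),
    ("disallow", "/projects/"),
    ("disallow", "/contact/"),
    ("disallow", "/about/"),
    ("disallow", "/contact.html"),
    ("disallow", "/about.html"),
    ("disallow", "/contact-us.html"),
    ("disallow", "/service.html"),
    ("disallow", "/service/"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match evaluation.

    Assumption: simple prefix matching; the `*` and `$` pattern characters
    are not handled because the group above does not use them.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        length = len(prefix)
        # A longer match replaces a shorter one; on equal length, allow wins.
        if length > best_len or (length == best_len and kind == "allow"):
            best_len = length
            allowed = kind == "allow"
    return allowed
```

Under these assumptions, is_allowed("/about/team") is False because /about/ is a longer match than /, while is_allowed("/about") is True because only the Allow / rule matches a path without the trailing slash.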

Comments

-----------------------------------------------------------
robots.txt for http(s)://tinycms.xyz, last refresh 2019/09/30
-----------------------------------------------------------
not all bots below may obey robots.txt in general
or specific rules, respectively
cat /home/wwwlogs/access.log | awk -F\" '{print $6}' | sort | uniq -c | sort -nr | head -20
-----------------------------------------------------------
semrush bot
ahrefs bot
moz bot
Wayback Machine
https://velen.io/
Baiduspider
Block SoGou
Block Youdao
Yandex
AdsBot
http://alphaseobot.com/bot.html
http://siteexplorer.info/about.html
http://www.sitesucker.us/mac/limitations.html
https://www.openindex.io/saas/about-our-spider/
http://www.backlinktest.com/crawler.html
http://napoveda.seznam.cz/
http://www.website-datenbank.de
Block netEstate NE Crawler (+http://www.website-datenbank.de/)
Block BlexBot
https://megaindex.com/crawler
------------
not exclude
------------
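
The shell pipeline quoted in the comments above ranks the 20 most frequent user agents in an access log by splitting each line on double quotes and taking the sixth field. A rough Python equivalent is sketched here, assuming the log uses a combined-style format in which the user agent is the third quoted string on each line. The function name top_user_agents is hypothetical, and the ordering of agents with equal counts may differ from the shell version.

```python
from collections import Counter


def top_user_agents(lines, n=20):
    """Count the user-agent field across log lines and return the top n.

    Mirrors awk -F'"' '{print $6}': the sixth quote-delimited field is
    index 5 here; lines with fewer fields are counted under "".
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split('"')
        agent = fields[5] if len(fields) > 5 else ""
        counts[agent] += 1
    return counts.most_common(n)
```

Passing an open file object, for example the log at the path named in the comment, as the lines argument yields (agent, count) pairs in descending order of count.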

Warnings

`crawl-delay` is not a known field.
