thinkdata.be
robots.txt

Robots Exclusion Standard data for thinkdata.be

Resource Scan

Scan Details

Site Domain	thinkdata.be
Base Domain	thinkdata.be
Scan Status	Ok
Last Scan	2025-10-17T11:21:00+00:00
Next Scan	2025-10-31T11:21:00+00:00
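
The two timestamps are ISO 8601 values in UTC, so the gap between them can be checked directly. A small sketch, assuming nothing about the scanner beyond the two values listed above (the interval it computes is only what these dates imply, not a documented rescan policy):

```python
from datetime import datetime

# Values copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2025-10-17T11:21:00+00:00")
next_scan = datetime.fromisoformat("2025-10-31T11:21:00+00:00")

# Both datetimes are timezone-aware, so subtraction yields a timedelta.
interval = next_scan - last_scan
print(interval.days)
```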

Last Scan

Scanned	2025-10-17T11:21:00+00:00
URL	https://thinkdata.be/robots.txt
Domain IPs	104.21.14.253, 172.67.160.216, 2606:4700:3034::6815:efd, 2606:4700:3034::ac43:a0d8
Response IP	104.21.14.253
Found	Yes
Hash	be20caa265bb2cd462a0f8e734137bf3a72bc23dd28ce842cedbfd3f4be677d1
SimHash	2c4a9e5106e1
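
The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the listing does not name the algorithm or say exactly which bytes were hashed. Under that assumption, a fetched copy of the file could be compared with the recorded value roughly as follows (`sha256_hex` and `matches_recorded_hash` are hypothetical helpers, not part of any scanner API):

```python
import hashlib

# Hash value copied from the Last Scan table above.
RECORDED_HASH = "be20caa265bb2cd462a0f8e734137bf3a72bc23dd28ce842cedbfd3f4be677d1"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Assumption: the scanner hashed the unmodified response body with SHA-256."""
    return sha256_hex(body) == recorded.lower()
```

A mismatch would not necessarily mean the file changed; it could also mean the scanner normalised the content or used a different algorithm.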

Groups

*

Rule	Path
Allow	/
Disallow	/admin/
Disallow	/wp-admin/
Disallow	/wp-includes/
Disallow	/wp-content/plugins/
Disallow	/wp-content/themes/
Disallow	/cgi-bin/
Disallow	/private/
Disallow	/temp/
Disallow	/tmp/
Disallow	/*.log$
Disallow	/*.sql$
Disallow	/*.gz$
Disallow	/*.tar$
Disallow	/*.zip$
Allow	/*.css$
Allow	/*.js$
Disallow	/*?*
Disallow	/search/
Disallow	/*search*
Disallow	/login/
Disallow	/register/
Disallow	/account/
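
How these rules combine depends on the crawler. Python's `urllib.robotparser` applies rules in file order using plain prefix matching, and does not interpret `*` or `$`, so it is not a good fit for this group. The sketch below instead hand-rolls the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), using a subset of the rules from the table. `rule_matches` and `is_allowed` are illustrative names, and real crawlers may differ in details such as percent-encoding.

```python
import re

# A subset of the rules from the "*" group above, as (kind, pattern) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/*.zip$"),
    ("allow", "/*.css$"),
    ("disallow", "/search/"),
    ("disallow", "/login/"),
]


def rule_matches(pattern: str, path: str) -> bool:
    """Match a robots.txt pattern: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = re.compile(".*".join(re.escape(part) for part in body.split("*")))
    if anchored:
        return regex.fullmatch(path) is not None
    return regex.match(path) is not None  # prefix match


def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if rule_matches(pattern, path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed
```

One consequence under this reading: `/assets/site.css` is allowed, but `/wp-content/plugins/x/style.css` is still blocked, because `/wp-content/plugins/` is a longer pattern than `/*.css$`.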

Other Records

Field	Value
crawl-delay	1

Other Records

Field	Value
sitemap	https://thinkdata.be/sitemap.xml
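
Both of these records can be read with Python's standard-library `urllib.robotparser`. The sketch below parses a short reconstruction of the file rather than the real thing: the original robots.txt is not reproduced in this listing, and placing `Crawl-delay` inside the `*` group is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction; only the crawl-delay and sitemap values
# are taken from the records listed above.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Disallow: /admin/
Crawl-delay: 1

Sitemap: https://thinkdata.be/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("*")   # seconds to wait between requests, or None
sitemaps = parser.site_maps()     # list of sitemap URLs, or None (Python 3.8+)

print(delay, sitemaps)
```

A crawler honouring the delay would typically sleep for `delay` seconds between successive requests to the host.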

Comments

Robots.txt for thinkdata.be
Personal website and portfolio of ThinkData, a company by Kenny Helsens, delivering services in AI, Software Engineering and Biotechnology. Based in Belgium, open to working remotely.
Allow all web crawlers to access the site
Specify sitemap location
Block access to common administrative and sensitive directories
Block access to common file types that shouldn't be indexed
Allow CSS and JS files for proper rendering
Block search and filter pages to avoid duplicate content
Block login and registration pages
Crawl delay to be respectful to server resources
