thinkdata.be
robots.txt

Robots Exclusion Standard data for thinkdata.be

Resource Scan

Scan Details

Site Domain thinkdata.be
Base Domain thinkdata.be
Scan Status Ok
Last Scan 2025-10-17T11:21:00+00:00
Next Scan 2025-10-31T11:21:00+00:00

Last Scan

Scanned 2025-10-17T11:21:00+00:00
URL https://thinkdata.be/robots.txt
Domain IPs 104.21.14.253, 172.67.160.216, 2606:4700:3034::6815:efd, 2606:4700:3034::ac43:a0d8
Response IP 104.21.14.253
Found Yes
Hash be20caa265bb2cd462a0f8e734137bf3a72bc23dd28ce842cedbfd3f4be677d1
SimHash 2c4a9e5106e1
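A content hash such as the one above is typically used to tell whether robots.txt changed between scans. The report does not name the algorithm; a 64-character hex digest is consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response body. The function names `fingerprint` and `has_changed` are hypothetical and not part of the scanner.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256 over the unmodified body. The scan report
    does not state which algorithm or normalization it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return fingerprint(body) != previous_hash


# Example with placeholder content (not the real file body).
old = fingerprint(b"User-agent: *\nAllow: /\n")
print(has_changed(old, b"User-agent: *\nAllow: /\n"))      # same bytes
print(has_changed(old, b"User-agent: *\nDisallow: /\n"))   # different bytes
```

An exact hash only signals that some byte changed. A similarity hash such as the SimHash value above is the usual complement when the goal is to judge how much the content changed.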

Groups

*

Rule Path
Allow /
Disallow /admin/
Disallow /wp-admin/
Disallow /wp-includes/
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /cgi-bin/
Disallow /private/
Disallow /temp/
Disallow /tmp/
Disallow /*.log$
Disallow /*.sql$
Disallow /*.gz$
Disallow /*.tar$
Disallow /*.zip$
Allow /*.css$
Allow /*.js$
Disallow /*?*
Disallow /search/
Disallow /*search*
Disallow /login/
Disallow /register/
Disallow /account/
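To see how these rules interact, the following is a minimal sketch of the matching behaviour described in RFC 9309: `*` matches any sequence of characters, a trailing `$` anchors the end of the path, the longest matching rule wins, and Allow wins a tie. The rule list is copied from the table above. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and individual crawlers may deviate from this behaviour.

```python
import re

# Rules for the "*" group, copied from the table above.
RULES = [
    ("allow", "/"),
    ("disallow", "/admin/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-includes/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-content/themes/"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/private/"),
    ("disallow", "/temp/"),
    ("disallow", "/tmp/"),
    ("disallow", "/*.log$"),
    ("disallow", "/*.sql$"),
    ("disallow", "/*.gz$"),
    ("disallow", "/*.tar$"),
    ("disallow", "/*.zip$"),
    ("allow", "/*.css$"),
    ("allow", "/*.js$"),
    ("disallow", "/*?*"),
    ("disallow", "/search/"),
    ("disallow", "/*search*"),
    ("disallow", "/login/"),
    ("disallow", "/register/"),
    ("disallow", "/account/"),
]


def pattern_to_regex(path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape literal text and turn each "*" into ".*".
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(url_path: str, rules=RULES) -> bool:
    """Longest matching rule wins; Allow wins a tie; no match means allowed."""
    best_len, best_allow = -1, True
    for kind, path in rules:
        if pattern_to_regex(path).match(url_path):
            allow = kind == "allow"
            if len(path) > best_len or (len(path) == best_len and allow):
                best_len, best_allow = len(path), allow
    return best_allow


for p in ["/", "/admin/settings", "/static/app.js",
          "/wp-content/plugins/x/app.js", "/blog?page=2", "/research/"]:
    print(p, is_allowed(p))
```

Under these assumptions two side effects stand out. `Disallow /*search*` also matches any path that merely contains "search", such as `/research/`. And `Allow /*.js$` does not rescue scripts under `/wp-content/plugins/`, because the longer Disallow pattern takes precedence.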

Other Records

Field Value
crawl-delay 1
sitemap https://thinkdata.be/sitemap.xml

Comments

  • Robots.txt for thinkdata.be
  • Personal website and portfolio of ThinkData, a company by Kenny Helsens, delivering services in AI, Software Engineering and Biotechnology. Based in Belgium, open to working remotely.
  • Allow all web crawlers to access the site
  • Specify sitemap location
  • Block access to common administrative and sensitive directories
  • Block access to common file types that shouldn't be indexed
  • Allow CSS and JS files for proper rendering
  • Block search and filter pages to avoid duplicate content
  • Block login and registration pages
  • Crawl delay to be respectful to server resources