
Robots Exclusion Standard data for vanheusden.com

Resource Scan

Scan Details

Site Domain vanheusden.com
Base Domain vanheusden.com
Scan Status Ok
Last Scan 2026-01-10T07:03:11+00:00
Next Scan 2026-01-17T07:03:11+00:00

Last Scan

Scanned 2026-01-10T07:03:11+00:00
URL https://vanheusden.com/robots.txt
Domain IPs 2a02:898:62:f6::9f, 94.142.246.159
Response IP 94.142.246.159
Found Yes
Hash 6e66adcce27ca88a54d6b5b9a464da094874f43dcf21b124a198bb52502d5ad1
SimHash 3454d940e192

Groups

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

*

Rule Path
Disallow /bad_crawlers.php
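
To see how the groups above apply to a given crawler, here is a minimal sketch using Python's standard urllib.robotparser. It rebuilds a robots.txt text from the groups listed in this report; that text is a reconstruction, not the original file, so it will not reproduce the Hash shown above. The names GROUPS, build_robots_txt and make_parser, and the crawler name "ExampleBot", are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Groups as listed in the scan report: user-agent token -> disallowed paths.
GROUPS = {
    "anthropic-ai": ["/"],
    "claude-web": ["/"],
    "ccbot": ["/"],
    "facebookbot": ["/"],
    "google-extended": ["/"],
    "gptbot": ["/"],
    "piplbot": ["/"],
    "*": ["/bad_crawlers.php"],
}


def build_robots_txt(groups):
    """Rebuild a robots.txt text from the report's groups (a reconstruction)."""
    lines = []
    for agent, paths in groups.items():
        lines.append(f"User-agent: {agent}")
        lines.extend(f"Disallow: {path}" for path in paths)
        lines.append("")  # blank line ends the group
    return "\n".join(lines)


def make_parser(groups):
    """Return a RobotFileParser loaded with the reconstructed rules."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt(groups).splitlines())
    return parser


parser = make_parser(GROUPS)

# "GPTBot" matches a named group; "ExampleBot" (hypothetical) falls back to "*".
for agent in ("GPTBot", "ExampleBot"):
    for path in ("/", "/bad_crawlers.php"):
        url = "https://vanheusden.com" + path
        print(agent, path, parser.can_fetch(agent, url))
```

Under these rules a crawler named in its own group is denied every path, while any other crawler is denied only /bad_crawlers.php. Python's parser matches user-agent names case-insensitively, which is why "GPTBot" matches the gptbot group.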