davidduggleby.com
robots.txt

Robots Exclusion Standard data for davidduggleby.com

Resource Scan

Scan Details

Site Domain davidduggleby.com
Base Domain davidduggleby.com
Scan Status Ok
Last Scan 2025-08-31T07:57:04+00:00
Next Scan 2025-09-30T07:57:04+00:00

Last Scan

Scanned 2025-08-31T07:57:04+00:00
URL https://davidduggleby.com/robots.txt
Domain IPs 78.129.243.79
Response IP 78.129.243.79
Found Yes
Hash 8718f2cdbaf96932c57f022bb0ae5dd18ca8e070f45060531ddd3904309c78e3
SimHash 53944105c7d3

Groups

*

Rule Path
Disallow *?url=*
Disallow *?URL=*
Disallow /?*%2F
Disallow *?search=
Disallow *utm_source%3D*
Disallow *utm_campaign%3D*
Disallow *?d=*
Disallow */live/*
Allow *page%3D*
Allow /Discover/*
Allow /members/$
Disallow /*/forgot.aspx
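
As a rough illustration of how the wildcard rules in the `*` group could be evaluated, the sketch below follows the matching conventions described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, and the sketch assumes patterns beginning with `*` are honoured and skips percent-encoding normalisation, so real crawlers may behave differently.

```python
import re
from typing import List, Optional, Tuple

# Rules copied from the "*" group above, in file order.
RULES: List[Tuple[str, str]] = [
    ("disallow", "*?url=*"),
    ("disallow", "*?URL=*"),
    ("disallow", "/?*%2F"),
    ("disallow", "*?search="),
    ("disallow", "*utm_source%3D*"),
    ("disallow", "*utm_campaign%3D*"),
    ("disallow", "*?d=*"),
    ("disallow", "*/live/*"),
    ("allow", "*page%3D*"),
    ("allow", "/Discover/*"),
    ("allow", "/members/$"),
    ("disallow", "/*/forgot.aspx"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex (hypothetical helper)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where "*" appeared.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.compile(regex)


def is_allowed(path: str, rules: List[Tuple[str, str]] = RULES) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Assumption: longest pattern wins, Allow beats Disallow on equal length,
    and a path matching no rule is allowed.
    """
    best: Optional[Tuple[int, bool]] = None
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt matching does.
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for sample in ["/about", "/members/", "/a/forgot.aspx", "/Discover/live/x"]:
        print(sample, is_allowed(sample))
```

Under these assumptions, a path such as `/Discover/live/x` matches both `*/live/*` and `/Discover/*`; the longer Allow pattern takes precedence.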

claudebot
claude-user

Rule Path
Allow /

anthropic-ai

Rule Path
Allow /

oai-searchbot
chatgpt-user
chatgpt-user/2.0

Rule Path
Allow /

google-extended

Rule Path
Allow /

bingbot

Rule Path
Allow /

perplexity-user

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

mistralai-user

Rule Path
Allow /

youbot

Rule Path
Allow /

timpibot

Rule Path
Allow /

duckassistbot

Rule Path
Allow /

ccbot

Rule Path
Allow /

amazonbot

Rule Path
Allow /

applebot
applebot-extended

Rule Path
Allow /
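
The groups above only take effect once a crawler has picked the group that applies to it. A minimal sketch of that selection step, assuming RFC 9309-style behaviour (case-insensitive match on the product token, falling back to `*` only when no named group matches), is shown below. `GROUPS` and `rules_for` are hypothetical names, and the table is abbreviated to a few of the groups listed in this report.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]

# Abbreviated copy of the groups in this report: user-agent tokens -> rules.
GROUPS: Dict[Tuple[str, ...], List[Rule]] = {
    ("*",): [("disallow", "*/live/*"), ("allow", "/members/$")],
    ("claudebot", "claude-user"): [("allow", "/")],
    ("oai-searchbot", "chatgpt-user", "chatgpt-user/2.0"): [("allow", "/")],
    ("bingbot",): [("allow", "/")],
}


def rules_for(product_token: str,
              groups: Dict[Tuple[str, ...], List[Rule]] = GROUPS) -> List[Rule]:
    """Return the rules that apply to a crawler's product token.

    Assumption: tokens are compared case-insensitively, rules from every
    matching named group are combined, and the "*" group is used only
    when no named group matches.
    """
    token = product_token.lower()
    matched: List[Rule] = []
    for agents, rules in groups.items():
        if token in agents:
            matched.extend(rules)
    if matched:
        return matched
    return list(groups.get(("*",), []))


if __name__ == "__main__":
    print(rules_for("ClaudeBot"))
    print(rules_for("SomeOtherBot"))
```

One consequence of this selection model is that a crawler with its own group, such as `claudebot`, would not inherit the Disallow rules from the `*` group.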

Comments

  • Allow agentic-AI users
  • User-driven browsing from ChatGPT