sandyjoe.com
robots.txt

Robots Exclusion Standard data for sandyjoe.com

Resource Scan

Scan Details

Site Domain sandyjoe.com
Base Domain sandyjoe.com
Scan Status Ok
Last Scan 2025-10-28T00:04:30+00:00
Next Scan 2025-11-04T00:04:30+00:00

Last Scan

Scanned 2025-10-28T00:04:30+00:00
URL https://sandyjoe.com/robots.txt
Domain IPs 104.21.42.116, 172.67.161.169, 2606:4700:3030::ac43:a1a9, 2606:4700:3034::6815:2a74
Response IP 104.21.42.116
Found Yes
Hash 1b03a530c44833e9ce4ecacadd11ebdbd81d844f2af536f7843526150572cc6a
SimHash 551cc961a414

Groups

mediapartners-google
google-inspectiontool
googlebot
bingbot
duckduckbot
duckduckgo
pinterestbot

Rule Path
Allow /
Allow /ads.txt
Disallow /?s=
Disallow /page/*/?s=
Disallow /search/
Disallow /page/
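
The rule table above mixes a catch-all Allow with narrower Disallow paths, so the outcome for a given URL depends on how a crawler resolves overlapping rules. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the most specific matching path wins, and Allow wins a tie). The names RULES, pattern_to_regex, and is_allowed are hypothetical and introduced only for illustration; individual crawlers may resolve conflicts differently.

```python
import re

# Rules copied from the first group in the scan above.
RULES = [
    ("allow", "/"),
    ("allow", "/ads.txt"),
    ("disallow", "/?s="),
    ("disallow", "/page/*/?s="),
    ("disallow", "/search/"),
    ("disallow", "/page/"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path ('*' wildcard, optional '$' end anchor) to a regex."""
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape literal parts and join them with '.*' where the pattern had '*'.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url_path, rules=RULES):
    """Return True if url_path may be fetched under longest-match precedence."""
    best_len = -1
    best_allow = True  # assumption: no matching rule means the path is allowed
    for kind, path in rules:
        # re.match anchors at the start, so patterns behave as path prefixes.
        if pattern_to_regex(path).match(url_path):
            allow = kind == "allow"
            longer = len(path) > best_len
            tie_prefers_allow = len(path) == best_len and allow and not best_allow
            if longer or tie_prefers_allow:
                best_len, best_allow = len(path), allow
    return best_allow


for candidate in ["/", "/ads.txt", "/about/", "/?s=shoes", "/search/x", "/page/2/"]:
    print(candidate, is_allowed(candidate))
```

Under this reading, the site root and ordinary pages stay crawlable for the listed agents, while internal search results and paginated archives are excluded.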

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

the knowledge ai

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

openai

Rule Path
Disallow /
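
Each of the single-agent groups above carries only "Disallow /", which blocks the whole site for that agent. As a rough illustration, the sketch below feeds a reconstructed fragment for two of those agents to Python's standard urllib.robotparser. The fragment text is an assumption written in the usual User-agent/Disallow syntax, not the site's actual file, and the blocked helper and the "examplebot" agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of two groups from the scan; not the site's real file.
ROBOTS_FRAGMENT = """\
User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def blocked(agent, url="https://sandyjoe.com/"):
    """Return True if the parsed rules forbid this agent from fetching url."""
    return not parser.can_fetch(agent, url)


print(blocked("GPTBot"))      # agent names are matched case-insensitively
print(blocked("CCBot"))
print(blocked("examplebot"))  # hypothetical agent with no group in the fragment
```

The scan lists no "*" group, so in this sketch an agent that matches none of the named groups is not restricted by the file.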

Other Records

Field Value
sitemap https://sandyjoe.com/sitemap_index.xml
sitemap https://sandyjoe.com/post-sitemap.xml
sitemap https://sandyjoe.com/page-sitemap.xml
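
The sitemap records above apply to the file as a whole rather than to any one group. A small sketch of reading them back with the standard library follows, assuming Python 3.8 or later (where RobotFileParser.site_maps() is available) and assuming the records appear in the file as ordinary "Sitemap:" lines; SITEMAP_LINES is a reconstruction for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the sitemap records listed in the scan.
SITEMAP_LINES = """\
Sitemap: https://sandyjoe.com/sitemap_index.xml
Sitemap: https://sandyjoe.com/post-sitemap.xml
Sitemap: https://sandyjoe.com/page-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SITEMAP_LINES.splitlines())

# site_maps() returns None when no sitemap lines were found, so fall back to [].
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```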