samuiforsale.com
robots.txt

Robots Exclusion Standard data for samuiforsale.com

Resource Scan

Scan Details

Site Domain samuiforsale.com
Base Domain samuiforsale.com
Scan Status Ok
Last Scan 2026-01-15T11:07:54+00:00
Next Scan 2026-02-14T11:07:54+00:00

Last Scan

Scanned 2026-01-15T11:07:54+00:00
URL https://samuiforsale.com/robots.txt
Redirect https://www.samuiforsale.com/robots.txt
Redirect Domain www.samuiforsale.com
Redirect Base samuiforsale.com
Domain IPs 2001:7b8:618:b:250:56ff:fe8c:1efe, 213.154.231.152
Redirect IPs 2001:7b8:618:b:250:56ff:fe8c:1efe, 213.154.231.152
Response IP 213.154.231.152
Found Yes
Hash fa48a3f54d915c2e5d589b9424c943ad265722d0b25d43da67330762c451b5b6
SimHash 201008514cf0
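
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. Under that assumption, a minimal sketch of how a later scan could detect a changed robots.txt follows. The helper names `body_digest` and `has_changed` are hypothetical and not part of any scanner API.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_digest: str) -> bool:
    """Compare a freshly fetched body against the digest stored by an earlier scan."""
    return body_digest(body) != previous_digest.lower()


if __name__ == "__main__":
    # Placeholder body for illustration only; this is not the real file content.
    sample = b"User-agent: *\nDisallow: /tmp/\n"
    stored = body_digest(sample)
    print(has_changed(sample, stored))
    print(has_changed(sample + b"Disallow: /logs/\n", stored))
```

An exact digest only signals that some byte changed. The separate SimHash field presumably serves to judge how similar two versions are, which this sketch does not attempt.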

Groups

*

Rule Path
Disallow /administrator/
Disallow /cli/
Disallow /installation/
Disallow /logs/
Disallow /tmp/
Disallow /component/
Allow /modules/*.css
Allow /modules/*.js
Allow /plugins/*.css
Allow /plugins/*.js
Allow /libraries/*.css
Allow /libraries/*.js
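
The Allow rules in this group use the `*` wildcard. Below is a minimal sketch of how a crawler might evaluate them, assuming the matching described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The function names and the `RULES` list are illustrative; the rule values are copied from the table above.

```python
import re

# Rules of the "*" group as listed in this report.
RULES = [
    ("Disallow", "/administrator/"),
    ("Disallow", "/cli/"),
    ("Disallow", "/installation/"),
    ("Disallow", "/logs/"),
    ("Disallow", "/tmp/"),
    ("Disallow", "/component/"),
    ("Allow", "/modules/*.css"),
    ("Allow", "/modules/*.js"),
    ("Allow", "/plugins/*.css"),
    ("Allow", "/plugins/*.js"),
    ("Allow", "/libraries/*.css"),
    ("Allow", "/libraries/*.js"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Longest matching pattern decides; Allow wins ties; no match means allowed."""
    best_len = -1
    best_allow = True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for path in ("/administrator/index.php", "/modules/mod_menu/style.css", "/"):
        print(path, is_allowed(RULES, path))
```

None of the listed Disallow paths cover /modules/, /plugins/ or /libraries/, so under this matching model the Allow lines do not change the outcome for the group as shown. They would only matter if a broader Disallow were added later.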

gptbot

Product Comment
gptbot OpenAI
Rule Path
Disallow /

claudebot

Product Comment
claudebot Anthropic
Rule Path
Disallow /

ccbot

Product Comment
ccbot Common Crawl (used for datasets)
Rule Path
Disallow /

perplexitybot

Product Comment
perplexitybot Perplexity AI
Rule Path
Allow /

google-extended

Product Comment
google-extended Google’s AI snippets & SGE
Rule Path
Allow /

Other Records

Field Value
sitemap https://www.samuiforsale.com/sitemap-4seo.xml
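
To show how the per-agent groups and the sitemap record combine, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstruction from the groups listed in this report, not the file's verbatim text: group order and spacing are assumptions, and the wildcard Allow lines are left out because the standard-library parser does plain prefix matching. `ExampleBot` is a made-up agent name used to exercise the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's groups (assumed layout, wildcard rules omitted).
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /cli/
Disallow: /installation/
Disallow: /logs/
Disallow: /tmp/
Disallow: /component/

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://www.samuiforsale.com/sitemap-4seo.xml
"""

BASE = "https://www.samuiforsale.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    agents = ("GPTBot", "ClaudeBot", "CCBot", "PerplexityBot", "Google-Extended", "ExampleBot")
    for agent in agents:
        print(agent, parser.can_fetch(agent, BASE + "/"))
    print(parser.site_maps())
```

Agents with their own group ignore the `*` group entirely, which is why the three blocked crawlers are refused everywhere while PerplexityBot and Google-Extended are not subject to the back-end Disallow rules in this model.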

Comments

  • --------------------------------------------------------------------
  • SamuiForSale robots.txt (Joomla 4/5 + JA Justitia + 4SEO)
  • Last updated: 2025‑07‑22
  • --------------------------------------------------------------------
  • --- Block back‑end and technical folders
  • --- Avoid duplicate component URLs
  • --- Allow critical assets for full rendering
  • --- (Optional) block query‑string variants if they appear
  • Disallow: /*?search=
  • Disallow: /*?filter=
  • --------------------------------------------------------------------
  • XML sitemap
  • --------------------------------------------------------------------
  • =============================
  • AI‑specific directives
  • =============================
  • --- Block model‑training crawlers ---
  • --- Allow AI search‑assistant bots ---
  • --------------------------------------------------------------------
  • End of file
  • --------------------------------------------------------------------