clustag.com
robots.txt

Robots Exclusion Standard data for clustag.com

Resource Scan

Scan Details

Site Domain clustag.com
Base Domain clustag.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2026-02-12T18:22:02+00:00
Next Scan 2026-04-13T18:22:02+00:00

Last Successful Scan

Scanned 2025-11-22T18:19:18+00:00
URL https://clustag.com/robots.txt
Domain IPs 149.62.172.228
Response IP 149.62.172.228
Found Yes
Hash 8d6f26ef8b24ffb636891591daf8e75e50103109f404811facb27717194fd21c
SimHash 60945a5a0521

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /wp-includes/
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /wp-content/cache/
Disallow /xmlrpc.php
Disallow /readme.html
Disallow /license.txt
Disallow /wp-config.php
Disallow /price/
Disallow /trackback/
Disallow /feed/
Disallow /rdf/
Disallow /rss/
Disallow /comments/feed/
Disallow /search/
Disallow /?s=
Disallow /price/?default_id=SV388SV
Allow /wp-content/uploads/
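
The "*" group mixes Disallow and Allow rules that overlap (for example /wp-admin/ and /wp-admin/admin-ajax.php). The sketch below shows one way a crawler might resolve them, assuming the longest-match rule described in RFC 9309: the most specific matching path wins, and Allow wins a tie. The names RULES and is_allowed are hypothetical and chosen for illustration only. Wildcards (* and $) are not handled, since none of the rules listed above use them.

```python
# Rules from the "*" group, as (directive, path) pairs in the order listed.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-includes/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-content/themes/"),
    ("disallow", "/wp-content/cache/"),
    ("disallow", "/xmlrpc.php"),
    ("disallow", "/readme.html"),
    ("disallow", "/license.txt"),
    ("disallow", "/wp-config.php"),
    ("disallow", "/price/"),
    ("disallow", "/trackback/"),
    ("disallow", "/feed/"),
    ("disallow", "/rdf/"),
    ("disallow", "/rss/"),
    ("disallow", "/comments/feed/"),
    ("disallow", "/search/"),
    ("disallow", "/?s="),
    ("disallow", "/price/?default_id=SV388SV"),
    ("allow", "/wp-content/uploads/"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` (URL path plus query string) may be crawled.

    Assumption: plain prefix matching, longest matching rule wins,
    Allow wins a tie, and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True  # default when no rule matches
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    for p in ["/wp-admin/options.php", "/wp-admin/admin-ajax.php", "/?s=test"]:
        print(p, is_allowed(p))
```

Under these assumptions, /wp-admin/admin-ajax.php is crawlable even though /wp-admin/ is disallowed, because the Allow path is longer. The rule for /price/?default_id=SV388SV has no additional effect, since /price/ already covers it.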

gptbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

google-extended

Rule Path
Allow /

ccbot

Rule Path
Allow /

anthropic-ai

Rule Path
Allow /

claudebot

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

cohere-ai

Rule Path
Allow /

bytespider

Rule Path
Allow /
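
Each of the named groups above contains only "Allow /". Under the group-selection behavior described in RFC 9309, a crawler that finds a group for its own product token uses that group and ignores "*", so these crawlers would not be bound by the "*" restrictions. The following is a minimal sketch of that selection, assuming case-insensitive exact matching on the product token. GROUPS, select_rules and is_allowed are hypothetical names, the "*" rules are abbreviated to three entries, and real crawlers may behave differently.

```python
# Named groups from the scan: every one of them holds the single rule "Allow /".
AI_AGENTS = [
    "gptbot", "chatgpt-user", "google-extended", "ccbot", "anthropic-ai",
    "claudebot", "perplexitybot", "cohere-ai", "bytespider",
]
GROUPS = {agent: [("allow", "/")] for agent in AI_AGENTS}

# Abbreviated copy of the "*" group (assumption: three rules suffice here).
GROUPS["*"] = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/feed/"),
]


def select_rules(product_token):
    """Use the crawler's own group if one exists, otherwise fall back to "*"."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])


def is_allowed(product_token, path):
    """Longest matching rule in the selected group wins; Allow wins a tie."""
    best_directive, best_prefix = "allow", ""  # default: allowed
    for directive, prefix in select_rules(product_token):
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > len(best_prefix)
        tie_allow = len(prefix) == len(best_prefix) and directive == "allow"
        if longer or tie_allow:
            best_directive, best_prefix = directive, prefix
    return best_directive == "allow"


if __name__ == "__main__":
    print(is_allowed("GPTBot", "/feed/"))        # uses the gptbot group
    print(is_allowed("SomeOtherBot", "/feed/"))  # falls back to "*"
```

In this sketch a crawler identifying as GPTBot is permitted to fetch /feed/ and /wp-admin/, while a crawler with no dedicated group falls back to "*" and is refused for both.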

Other Records

Field Value
sitemap https://clustag.com/sitemap_index.xml

Comments

  • Disallow access to WordPress admin area
  • Allow access to admin-ajax.php as it's used by some themes/plugins on the frontend
  • Disallow core WordPress files and directories
  • Disallow trackbacks, pingbacks, and feeds if not needed or to prevent duplicate content
  • Disallow WordPress search results pages
  • Allow crawlers to access uploaded files (images, PDFs, etc.)
  • WordPress sitemap
  • OpenAI
  • Google AI
  • Common Crawl
  • Anthropic
  • Perplexity AI
  • Cohere AI
  • ByteDance