suma.tw
robots.txt

Robots Exclusion Standard data for suma.tw

Resource Scan

Scan Details

Site Domain suma.tw
Base Domain suma.tw
Scan Status Ok
Last Scan 2025-11-05T14:40:51+00:00
Next Scan 2025-11-12T14:40:51+00:00

Last Scan

Scanned 2025-11-05T14:40:51+00:00
URL https://www.suma.tw/robots.txt
Domain IPs 104.21.54.125, 172.67.138.137, 2606:4700:3034::6815:367d, 2606:4700:3034::ac43:8a89
Response IP 104.21.54.125
Found Yes
Hash 87800552244342ccd518aa1b955c6e2e1736b84a86daf9633f7049069dee3a90
SimHash 66354b50c584

Groups

*

Rule Path
Allow /

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

*

Rule Path
Allow /data/attachment/
Allow /data/avatar/
Allow /data/cache/
Allow /uc_server/data/avatar/
Disallow /api/
Disallow /data/
Disallow /source/
Disallow /install/
Disallow /template/default/
Disallow /config/
Disallow /uc_client/
Disallow /uc_server/
Disallow /admin.php
Disallow /search.php
Disallow /member.php
Disallow /api.php
Disallow /misc.php
Disallow /connect.php
Disallow /forum.php?mod=redirect*
Disallow /forum.php?mod=post*
Disallow /home.php?mod=spacecp*
Disallow /*?mod=misc*
Disallow /*?mod=attachment*
Disallow /*mobile%3Dyes*
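
The Allow and Disallow rules listed above overlap (for example, `Allow /data/attachment/` sits under `Disallow /data/`). A minimal sketch of how a crawler might resolve such overlaps, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, only a subset of the rules is included, and percent-encoding normalization is left out:

```python
import re

# A subset of the "*" group rules from the scan, as (rule type, path pattern).
RULES = [
    ("allow", "/data/attachment/"),
    ("allow", "/data/avatar/"),
    ("disallow", "/data/"),
    ("disallow", "/admin.php"),
    ("disallow", "/forum.php?mod=post*"),
    ("disallow", "/*?mod=attachment*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Everything else is matched literally as a prefix of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under `rules`."""
    best_len, best_allow = -1, True  # no matching rule means allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under these assumptions, `is_allowed("/data/attachment/forum/a.png")` is true because the Allow pattern is longer than `/data/`, while `is_allowed("/data/log/x")` is false.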

amazonbot
anthropic-ai
applebot-extended
awariorssbot
awariosmartbot
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
facebookbot
google-extended
gptbot
imagesiftbot
magpie-crawler
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
youbot

Rule Path
Disallow /

Comments

  • As a condition of accessing this website, you agree to abide by the following content signals:
  • (a) If a content-signal = yes, you may collect content for the corresponding use.
  • (b) If a content-signal = no, you may not collect content for the corresponding use.
  • (c) If the website operator does not include a content signal for a corresponding use, the website operator neither grants nor restricts permission via content signal with respect to the corresponding use.
  • The content signals and their meanings are:
  • search: building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website's contents). Search does not include providing AI-generated search summaries.
  • ai-input: inputting content into one or more AI models (e.g., retrieval augmented generation, grounding, or other real-time taking of content for generative AI search answers).
  • ai-train: training or fine-tuning AI models.
  • ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
  • BEGIN Cloudflare Managed content
  • END Cloudflare Managed Content
  • robots.txt for Discuz! X3.5
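
The comments above define three states for each content signal: yes, no, and not stated. A minimal sketch of reading such a line, assuming the comma-separated `name=value` syntax (the scan does not show the site's actual `content-signal` value, so the example line in the usage note is hypothetical, as are the names `parse_content_signal` and `permission`):

```python
# Signal names taken from the comments in the scanned file.
KNOWN_SIGNALS = ("search", "ai-input", "ai-train")


def parse_content_signal(line):
    """Parse one content-signal line into a {signal: bool} mapping.

    Assumed syntax: "Content-Signal: name=yes, name=no". Unknown names
    and values other than yes/no are ignored.
    """
    name, _, value = line.partition(":")
    if name.strip().lower() != "content-signal":
        raise ValueError("not a content-signal line")
    signals = {}
    for item in value.split(","):
        key, sep, setting = item.partition("=")
        key, setting = key.strip().lower(), setting.strip().lower()
        if sep and key in KNOWN_SIGNALS and setting in ("yes", "no"):
            signals[key] = setting == "yes"
    return signals


def permission(signals, use):
    """Return True (may collect), False (may not), or None (not stated)."""
    return signals.get(use)
```

For a hypothetical line `Content-Signal: search=yes, ai-train=no`, `permission` gives `True` for `search`, `False` for `ai-train`, and `None` for `ai-input`, which corresponds to cases (a), (b), and (c) in the comments.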

Warnings

  • `content-signal` is not a known field.