systemcycle.com
robots.txt

Robots Exclusion Standard data for systemcycle.com

Resource Scan

Scan Details

Site Domain systemcycle.com
Base Domain systemcycle.com
Scan Status Ok
Last Scan 2025-11-15T17:01:51+00:00
Next Scan 2025-11-29T17:01:51+00:00

Last Scan

Scanned 2025-11-15T17:01:51+00:00
URL https://systemcycle.com/robots.txt
Redirect https://www.systemcycle.com/robots.txt
Redirect Domain www.systemcycle.com
Redirect Base systemcycle.com
Domain IPs 23.227.38.32
Redirect IPs 23.227.38.74, 2620:127:f00f:e::
Response IP 23.227.38.74
Found Yes
Hash e3a222bfaf62dd1130091cce323422247d0b886a69fd49a11867bc14a6d41804
SimHash 791cd3104652
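
The Hash value above is 64 hexadecimal digits, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say whether the body is normalized first. As a minimal sketch under that assumption, a fetched robots.txt body could be compared against a recorded digest as follows. The helper names `body_digest` and `unchanged` are hypothetical and not part of the scanner.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a raw response body.

    Assumption: the report's Hash field is SHA-256 over the unmodified bytes.
    """
    return hashlib.sha256(body).hexdigest()


def unchanged(body: bytes, recorded_hash: str) -> bool:
    """Compare a freshly fetched body against a previously recorded digest."""
    return body_digest(body) == recorded_hash.strip().lower()
```

If the digests differ between scans, the file content changed. An exact hash cannot say how large the change was.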

Groups

*

Rule Path
Disallow /collections/
Disallow /products/
Disallow /cart
Disallow /checkout
Disallow /orders
Disallow /account
Disallow /search
Disallow /pages/wholesale
Disallow /pages/distributor
Allow /
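
As an illustration of how a crawler with no named group of its own might evaluate the `*` group above, the following sketch feeds those rules to Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the rule table rather than copied from the live file, and the agent name `ExampleBot` and the tested paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in this report (not the live file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /collections/
Disallow: /products/
Disallow: /cart
Disallow: /checkout
Disallow: /orders
Disallow: /account
Disallow: /search
Disallow: /pages/wholesale
Disallow: /pages/distributor
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the hypothetical agent may fetch the given path."""
    return parser.can_fetch(agent, "https://www.systemcycle.com" + path)
```

Under these rules, `allowed("/pages/about")` is true and `allowed("/collections/shoes")` is false. Because rules are prefix matches, `Disallow /cart` also covers any path that merely begins with `/cart`.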

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

amazongpt

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

duckassistbot

Rule Path
Disallow /

kagibot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

bingpreview

Rule Path
Disallow /

microsoft-extended

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

crawler

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

wget

Rule Path
Disallow /

curl

Rule Path
Disallow /

applebot-image

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

pinterestbot

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /
Disallow /*.jpg$
Disallow /*.jpeg$
Disallow /*.png$
Disallow /*.gif$
Disallow /*.webp$
Disallow /*.svg$
Disallow /*.avif$
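
The image rules above use the `*` wildcard and the `$` end anchor. Parsers that only do prefix matching may not interpret them, so the sketch below translates each pattern into a regular expression by hand, assuming the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path. The function names and tested paths are hypothetical. In this group `Disallow /` already blocks every path, so the image patterns add nothing for a compliant crawler.

```python
import re


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any characters; a trailing '$' anchors
    the match at the end of the path; otherwise the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


# Patterns copied from the baiduspider-image group in this report.
IMAGE_PATTERNS = [
    "/*.jpg$", "/*.jpeg$", "/*.png$", "/*.gif$",
    "/*.webp$", "/*.svg$", "/*.avif$",
]
IMAGE_REGEXES = [pattern_to_regex(p) for p in IMAGE_PATTERNS]


def blocked_by_image_rules(path: str) -> bool:
    """Return True if any image pattern matches the path."""
    return any(rx.match(path) for rx in IMAGE_REGEXES)
```

With this translation, a path such as `/files/photo.jpg` matches, while `/files/photo.jpg?v=1` does not, because the `$` anchor requires the path to end with the extension.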

Other Records

Field Value
sitemap https:///sitemap.xml

Comments

  • -------------------------------
  • General search engine rules
  • -------------------------------
  • -------------------------------
  • Block AI crawlers and data scrapers
  • -------------------------------
  • OpenAI / ChatGPT
  • Anthropic / Claude
  • Google Gemini
  • Perplexity
  • CommonCrawler (used for many AI datasets)
  • Amazon / Alexa / AmazonGPT
  • DataForSEO
  • You.com
  • Apple
  • Meta / Facebook
  • DuckDuckGo AI
  • Kagi / Other AI-assisted crawlers
  • Internet Archive (optional)
  • Microsoft / Bing AI data extension
  • Generic scraping tools
  • -------------------------------
  • Image protection
  • -------------------------------
  • Block AI image crawlers and visual dataset collectors
  • Disallow direct crawling of common image file types