simpliciter.ai
robots.txt

Robots Exclusion Standard data for simpliciter.ai

Resource Scan

Scan Details

Site Domain simpliciter.ai
Base Domain simpliciter.ai
Scan Status Ok
Last Scan 2026-01-27T02:20:43+00:00
Next Scan 2026-02-26T02:20:43+00:00

Last Scan

Scanned 2026-01-27T02:20:43+00:00
URL https://simpliciter.ai/robots.txt
Domain IPs 104.26.0.111, 104.26.1.111, 172.67.68.216, 2606:4700:20::681a:16f, 2606:4700:20::681a:6f, 2606:4700:20::ac43:44d8
Response IP 104.26.0.111
Found Yes
Hash 2375da6f4f92c3e439244d8856023a005f6b16d6589cb0c0e865129e9fc62805
SimHash c735a353ce55

Groups

*

Rule Path
Allow /

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

*

Rule Path
Disallow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 15

rogerbot
exabot
mj12bot
dotbot
gigabot
ahrefsbot
blackwidow
chinaclaw
custo
disco
download\ demon
ecatch
eirgrabber
emailsiphon
emailwolf
express\ webpictures
extractorpro
eyenetie
flashget
getright
getweb!
go!zilla
go-ahead-got-it
grabnet
grafula
hmview
httrack
image\ stripper
image\ sucker
indy\ library
interget
internet\ ninja
jetcar
joc\ web\ spider
larbin
leechftp
mass\ downloader
midown\ tool
mister\ pix
navroad
nearsite
netants
netspider
net\ vampire
netzip
octopus
offline\ explorer
offline\ navigator
pagegrabber
papa\ foto
pavuk
pcbrowser
realdownload
reget
sitesnagger
semrushbot
smartdownload
superbot
superhttp
surfbot
takeout
teleport\ pro
voideye
web\ image\ collector
web\ sucker
webauto
webcopier
webfetch
webgo\ is
webleacher
webreaper
websauger
website\ extractor
website\ quester
webstripper
webwhacker
webzip
wget
widow
wwwoffle
xaldon\ webspider
zeus
bytespider
bytedance

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

gptbot

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

chatgpt-user

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

perplexitybot

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

perplexity-user

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

claudebot

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

Other Records

Field Value
crawl-delay 5

claude-user

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

Other Records

Field Value
crawl-delay 5

claude-searchbot

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

Other Records

Field Value
crawl-delay 5

gemini-deep-research

Rule Path
Disallow /ricerca/
Disallow /app/search/
Allow /

Other Records

Field Value
crawl-delay 5
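
Several user agents above (`*`, `gptbot`, `claudebot`) appear in more than one group, and many groups mix `Disallow` and `Allow` rules. The sketch below shows one way a crawler might resolve this, assuming RFC 9309 behavior: groups with the same user-agent are combined, the longest matching path wins, and a tie goes to `Allow`. The `GROUPS` subset and the function names `merge_groups` and `is_allowed` are illustrative, not part of the scan; path wildcards and product-token matching are left out, and real crawlers may resolve these groups differently.

```python
from collections import defaultdict

# A subset of the groups listed in the scan, in file order (illustrative).
GROUPS = [
    ("*", [("allow", "/")]),
    ("gptbot", [("disallow", "/")]),
    ("claudebot", [("disallow", "/")]),
    ("*", [("disallow", "/")]),
    ("googlebot", [("allow", "/")]),
    ("gptbot", [("disallow", "/ricerca/"),
                ("disallow", "/app/search/"),
                ("allow", "/")]),
]


def merge_groups(groups):
    """Combine rules from all groups that name the same user agent."""
    merged = defaultdict(list)
    for agent, rules in groups:
        merged[agent.lower()].extend(rules)
    return dict(merged)


def is_allowed(merged, agent, path):
    """Longest-match evaluation; ties between allow and disallow go to allow."""
    # Fall back to the "*" group when the agent has no group of its own.
    rules = merged.get(agent.lower(), merged.get("*", []))
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        n = len(prefix)
        if n > best_len or (n == best_len and kind == "allow"):
            best_len, allowed = n, (kind == "allow")
    return allowed


merged = merge_groups(GROUPS)
print(is_allowed(merged, "gptbot", "/ricerca/query"))  # False
print(is_allowed(merged, "gptbot", "/about"))          # True
print(is_allowed(merged, "claudebot", "/about"))       # False
```

Under these assumptions, the combined `gptbot` group blocks `/ricerca/` but permits other paths, because `Allow /` ties with `Disallow /` and the tie favors `Allow`. In this subset `claudebot` has only `Disallow /`, so it stays blocked; adding its second group from the scan would change that result in the same way.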

Comments

  • As a condition of accessing this website, you agree to abide by the following content signals:
  • (a) If a Content-Signal = yes, you may collect content for the corresponding use.
  • (b) If a Content-Signal = no, you may not collect content for the corresponding use.
  • (c) If the website operator does not include a Content-Signal for a corresponding use, the website operator neither grants nor restricts permission via Content-Signal with respect to the corresponding use.
  • The content signals and their meanings are:
  • search: building a search index and providing search results (e.g., returning hyperlinks and short excerpts from your website's contents). Search does not include providing AI-generated search summaries.
  • ai-input: inputting content into one or more AI models (e.g., retrieval augmented generation, grounding, or other real-time taking of content for generative AI search answers).
  • ai-train: training or fine-tuning AI models.
  • ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
  • BEGIN Cloudflare Managed content
  • END Cloudflare Managed Content

Warnings

  • `content-signal` is not a known field.
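
The scanner flags `content-signal` as an unknown field, so a consumer that wants to honor it would need its own parsing. The sketch below reads such a line into a dictionary and applies rules (a) to (c) from the Comments section. The scan does not show the site's actual signal values, so `EXAMPLE_LINE` is a hypothetical input, and the `key=value, key=value` syntax and the function names are assumptions.

```python
def parse_content_signal(line):
    """Parse a 'Content-Signal: k=v, k=v' line into {use: bool}."""
    field, _, value = line.partition(":")
    if field.strip().lower() != "content-signal":
        return {}
    signals = {}
    for item in value.split(","):
        key, sep, val = item.partition("=")
        key, val = key.strip().lower(), val.strip().lower()
        # Keep only well-formed yes/no entries; ignore anything else.
        if sep and val in ("yes", "no"):
            signals[key] = (val == "yes")
    return signals


def may_collect(signals, use):
    """True for yes (a), False for no (b), None when no signal is given (c)."""
    return signals.get(use)


# Hypothetical line; the real values are not present in the scan output.
EXAMPLE_LINE = "Content-Signal: search=yes, ai-train=no"

signals = parse_content_signal(EXAMPLE_LINE)
print(may_collect(signals, "search"))    # True
print(may_collect(signals, "ai-train"))  # False
print(may_collect(signals, "ai-input"))  # None
```

Returning `None` for an absent signal keeps rule (c) distinct from an explicit `no`: the operator has neither granted nor restricted that use through this field.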