simone.org
robots.txt

Robots Exclusion Standard data for simone.org

Resource Scan

Scan Details

Site Domain simone.org
Base Domain simone.org
Scan Status Ok
Last Scan 2025-10-11T17:55:54+00:00
Next Scan 2025-10-12T17:55:54+00:00

Last Scan

Scanned 2025-10-11T17:55:54+00:00
URL https://simone.org/robots.txt
Domain IPs 151.101.131.7, 151.101.195.7, 151.101.3.7, 151.101.67.7, 2a04:4e42:200::775, 2a04:4e42:400::775, 2a04:4e42:600::775, 2a04:4e42::775
Response IP 151.101.131.7
Found Yes
Hash 51a5849c02156041278647a7bb4731690b98cb19253412a48f9319a5669421a8
SimHash 401dd941e6b1

Groups

adsbot-google
amazonbot
anthropic-ai
applebot-extended
awariorssbot
awariosmartbot
bytespider
bytedancespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
facebookbot
friendlycrawler
google-extended
googleother
gptbot
img2dataset
imagesiftbot
magpie-crawler
meltwater
omgili
omgilibot
peer39_crawler
peer39_crawler/1.0
perplexitybot
piplbot
scoop.it
seekr
youbot

Rule Path
Disallow /

ahrefsbot
semrushbot
mj12bot
dotbot

Rule Path
Disallow /

googlebot
googlebot-news
googlebot-video
googlebot-image
bingbot
bingbot-image
yandexbot
yandeximages
baiduspider
baiduspider-image
duckduckbot
pinterestbot

Rule Path
Disallow (empty)

*

Rule Path
Disallow (empty)

Other Records

Field Value
sitemap https://simone.org/sitemap.xml
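The groups and the sitemap record above can be checked programmatically. The following is a minimal sketch, assuming Python's standard urllib.robotparser module. The ROBOTS_TXT string is a hand-reconstructed fragment with a few representative agents per group, not the verbatim file, and build_parser and SomeOtherBot are illustrative names that do not appear in the scan.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment (an assumption, NOT the verbatim file): a few
# representative agents per group, with the rules the scan reports.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: ccbot
Disallow: /

User-agent: ahrefsbot
Disallow: /

User-agent: googlebot
Disallow:

User-agent: *
Disallow:

Sitemap: https://simone.org/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
page = "https://simone.org/"

# "SomeOtherBot" is a hypothetical agent that falls through to the "*" group.
for agent in ("GPTBot", "AhrefsBot", "Googlebot", "SomeOtherBot"):
    verdict = "allowed" if parser.can_fetch(agent, page) else "blocked"
    print(f"{agent}: {verdict}")

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

An empty Disallow value places no restriction on the group, so the search engine group and the catch-all group may fetch any path, while "Disallow: /" blocks the whole site for the agents listed above it. Python's parser matches agent names case-insensitively by substring, which may differ from how an individual crawler interprets the file.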

Comments

  • Block AI and data collection bots
  • Block aggressive SEO tools
  • Main search engines and their bots
  • All other crawlers
  • Sitemap location