Robots Exclusion Standard data for reginachain.net

Resource Scan

Scan Details

Site Domain reginachain.net
Base Domain reginachain.net
Scan Status Ok
Last Scan 2026-01-08T23:55:12+00:00
Next Scan 2026-02-07T23:55:12+00:00

Last Scan

Scanned 2026-01-08T23:55:12+00:00
URL https://reginachain.net/robots.txt
Domain IPs 185.2.4.44
Response IP 185.2.4.44
Found Yes
Hash 37b040d8186f26b35f45cb02750b293c9308ff476d27269db75ac21dfc485065
SimHash 502cf3700732

Groups

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
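
The groups above can be checked programmatically. The following is a minimal sketch, assuming the file consists of one `User-agent` group per listed crawler in the order shown; the reconstructed text is an approximation of the scanned file, not its exact bytes, and the helper names `build_robots_txt` and `allowed` are hypothetical. It uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens taken from the groups listed in this report.
AI_AGENTS = [
    "ccbot", "chatgpt-user", "gptbot", "google-extended", "anthropic-ai",
    "claude-web", "cohere-ai", "omgilibot", "omgili", "perplexitybot",
    "youbot", "diffbot", "bytespider", "imagesiftbot", "amazonbot",
    "applebot", "facebookbot",
]


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the reported groups (assumed layout)."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in AI_AGENTS]
    groups.append(
        "User-agent: *\n"
        "Disallow: /wp-admin/\n"
        "Allow: /wp-admin/admin-ajax.php"
    )
    return "\n\n".join(groups) + "\n"


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return parser.can_fetch(agent, "https://reginachain.net" + path)


if __name__ == "__main__":
    for agent, path in [("GPTBot", "/"), ("Googlebot", "/"), ("Googlebot", "/wp-admin/")]:
        print(agent, path, allowed(agent, path))
```

Under these assumptions, a listed crawler such as GPTBot is refused everywhere, while an unlisted crawler such as Googlebot falls through to the `*` group and is refused only under `/wp-admin/`. The `admin-ajax.php` exception is left out of the checks on purpose: `urllib.robotparser` applies rules in file order, which may differ from the longest-match behavior described in RFC 9309.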

Comments

  • BLOCK AI CRAWLERS / DATA SCRAPERS FOR LLMs
  • Common Crawl
  • ChatGPT user scraping
  • OpenAI GPTBot
  • Google Bard / VertexAI (does NOT block Google Search)
  • Anthropic (Claude)
  • Claude-Web
  • Cohere AI
  • Omgili scrapers
  • Perplexity AI
  • KUKA's youBot
  • Diffbot
  • ByteDance (TikTok)
  • ImageSift / TheHive
  • Amazon Alexa
  • Apple Siri / Spotlight
  • Meta / Facebook AI