Robots Exclusion Standard data for creyalearning.com

Resource Scan

Scan Details

Site Domain creyalearning.com
Base Domain creyalearning.com
Scan Status Ok
Last Scan 2025-08-31T16:17:39+00:00
Next Scan 2025-09-30T16:17:39+00:00

Last Scan

Scanned 2025-08-31T16:17:39+00:00
URL https://creyalearning.com/robots.txt
Domain IPs 2a02:4780:84:8147:950d:9f12:81ee:522e, 2a02:4780:84:bd30:76be:8ffc:1f36:2f28, 84.32.84.125, 84.32.84.237
Response IP 93.127.201.12
Found Yes
Hash 77cbfbbc083157184382c98b864c5ba422ab4893246f0bf72093cf000473538a
SimHash 11349d430532
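
The report does not say which algorithm produced the Hash value. Its length (64 hexadecimal digits) is consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response body; that is a guess, not something the scan states. The helper names `sha256_hex` and `matches_report` are hypothetical, and fetching the file is left to the caller.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "77cbfbbc083157184382c98b864c5ba422ab4893246f0bf72093cf000473538a"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a robots.txt body against the reported hash.

    Assumption: the scanner hashed the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == reported.lower()
```

If the assumption holds, passing the bytes served at the scanned URL to `matches_report` would return True only while the file is byte-for-byte unchanged since the scan; a mismatch could equally mean the scanner used a different algorithm or normalized the body first.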

Groups

User-agent Rule Path
* Disallow (empty)
gptbot Disallow (empty)
claudebot Disallow (empty)
perplexitybot Disallow (empty)
google-extended Disallow (empty)
amazonbot Disallow (empty)
ccbot Disallow (empty)
facebookbot Disallow (empty)
applebot Disallow (empty)
ia_archiver Disallow (empty)

Other Records

Field Value
sitemap https://creyalearning.com/sitemap.xml

Comments

  • robots.txt for allowing all crawlers including AI bots
  • Allow OpenAI's GPTBot
  • Allow Anthropic's ClaudeBot
  • Allow Perplexity.ai's crawler
  • Allow Google's AI bots
  • Allow Amazon's AI bot
  • Allow Common Crawl (used by many LLMs)
  • Allow Facebook's AI crawler
  • Allow Applebot (used for Siri and Spotlight)
  • Optional: Allow Archive.org (Wayback Machine)
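
Every group in this report carries a Disallow rule with an empty path, while the comments describe each group as an "Allow". Under the Robots Exclusion Protocol an empty Disallow value blocks nothing, so the two agree. A minimal sketch of that behavior, using Python's standard `urllib.robotparser`, follows. The robots.txt text is reconstructed from the groups and sitemap listed in this report, not copied from the live file, and the helper names `render_robots` and `build_parser` as well as the test URL path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agents and sitemap as listed in the scan report.
AGENTS = [
    "*", "gptbot", "claudebot", "perplexitybot", "google-extended",
    "amazonbot", "ccbot", "facebookbot", "applebot", "ia_archiver",
]
SITEMAP = "https://creyalearning.com/sitemap.xml"


def render_robots(agents, sitemap):
    """Rebuild a robots.txt body: one group per agent, each with an empty Disallow.

    Assumption: this mirrors the structure reported by the scan, not the exact bytes.
    """
    groups = [f"User-agent: {agent}\nDisallow:" for agent in agents]
    return "\n\n".join(groups) + f"\n\nSitemap: {sitemap}\n"


def build_parser(text):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(render_robots(AGENTS, SITEMAP))

# An empty Disallow path permits every URL, for named and unnamed agents alike.
for agent in ("GPTBot", "ClaudeBot", "SomeOtherBot"):
    allowed = parser.can_fetch(agent, "https://creyalearning.com/any/page")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Because the `*` group already permits everything, the nine named groups do not change the outcome for any crawler; they only make the site owner's intent explicit. `RobotFileParser.site_maps()` (Python 3.8 and later) would return the sitemap URL from the same parsed text.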