beta.mixcloud.com
robots.txt

Robots Exclusion Standard data for beta.mixcloud.com

Resource Scan

Scan Details

Site Domain beta.mixcloud.com
Base Domain mixcloud.com
Scan Status Ok
Last Scan 2025-09-28T09:07:41+00:00
Next Scan 2025-10-05T09:07:41+00:00

Last Scan

Scanned 2025-09-28T09:07:41+00:00
URL https://beta.mixcloud.com/robots.txt
Redirect https://www.mixcloud.com/robots.txt
Redirect Domain www.mixcloud.com
Redirect Base mixcloud.com
Domain IPs 104.20.4.36, 104.20.5.36, 2606:4700:10::6814:424, 2606:4700:10::6814:524
Redirect IPs 104.20.4.36, 104.20.5.36, 2606:4700:10::6814:424, 2606:4700:10::6814:524
Response IP 104.20.5.36
Found Yes
Hash 82efaac329225812603fcce58169d995a67a1b91c897d65689b85e96d6817f79
SimHash 519889408432
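
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest. The report does not say which algorithm the scanner uses or whether it hashes the raw response bytes or a normalized form, so the following is only a sketch under the assumption that it is SHA-256 over the raw body. The function names are hypothetical.

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "82efaac329225812603fcce58169d995a67a1b91c897d65689b85e96d6817f79"


def robots_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def matches_recorded(body: bytes) -> bool:
    """Check whether a fetched body matches the recorded scan hash."""
    return robots_digest(body) == RECORDED_HASH
```

A mismatch after fetching https://www.mixcloud.com/robots.txt could mean the file changed since the scan, or simply that the assumption about the algorithm or input bytes is wrong.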

Groups

*

Rule Path
Allow /
Disallow /oauth/
Disallow /short/
Disallow /pigeon/
Disallow /blog/wp-json/
Disallow /blog/?rest_route=

gptbot
chatgpt-user
ccbot
perplexitybot
omgili
omgilibot
googleextended
anthropic-ai
claude-web
claudebot
cohere-ai
google-extended
google-cloudvertexbot
facebookbot
meta-externalagent
amazonbot
bytespider
applebot-extended
imagesiftbot
diffbot
timpibot

Rule Path
Allow /blog/
Disallow /
Disallow /blog/wp-json/
Disallow /blog/?rest_route=
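
How the two groups above apply to a given crawler and path can be sketched as follows. This is a minimal illustration, not the scanner's or any crawler's actual implementation. It assumes plain prefix matching with the longest matching rule winning and Allow winning ties, as described in RFC 9309, and it ignores wildcards (none appear in these rules). The names `select_group` and `is_allowed` are hypothetical, and the caller is assumed to pass the crawler's product token (for example `GPTBot`) rather than a full User-Agent header.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, path prefix)

# Rules for the "*" group, copied from the scan above.
DEFAULT_GROUP: List[Rule] = [
    ("allow", "/"),
    ("disallow", "/oauth/"),
    ("disallow", "/short/"),
    ("disallow", "/pigeon/"),
    ("disallow", "/blog/wp-json/"),
    ("disallow", "/blog/?rest_route="),
]

# User agents that share the second group, as listed in the scan.
AI_AGENTS = {
    "gptbot", "chatgpt-user", "ccbot", "perplexitybot", "omgili",
    "omgilibot", "googleextended", "anthropic-ai", "claude-web",
    "claudebot", "cohere-ai", "google-extended", "google-cloudvertexbot",
    "facebookbot", "meta-externalagent", "amazonbot", "bytespider",
    "applebot-extended", "imagesiftbot", "diffbot", "timpibot",
}

# Rules for the second group, copied from the scan above.
AI_GROUP: List[Rule] = [
    ("allow", "/blog/"),
    ("disallow", "/"),
    ("disallow", "/blog/wp-json/"),
    ("disallow", "/blog/?rest_route="),
]


def select_group(product_token: str) -> List[Rule]:
    """Pick the rule group for a crawler, matching case-insensitively."""
    if product_token.strip().lower() in AI_AGENTS:
        return AI_GROUP
    return DEFAULT_GROUP


def is_allowed(rules: List[Rule], path: str) -> bool:
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_allow = len(prefix) == best_len and directive == "allow"
        if longer or tie_allow:
            best_len = len(prefix)
            allowed = directive == "allow"
    return allowed


if __name__ == "__main__":
    for agent, path in [("Googlebot", "/oauth/token"),
                        ("GPTBot", "/blog/some-post/"),
                        ("GPTBot", "/discover/")]:
        print(agent, path, is_allowed(select_group(agent), path))
```

Under these assumptions, the listed AI crawlers may fetch only paths under /blog/ (excluding the two WordPress REST endpoints), while every other crawler may fetch everything except the five disallowed prefixes. Note that parsers using first-match ordering instead of longest-match could evaluate the `*` group differently, since `Allow /` is listed first.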

Other Records

Field Value
sitemap https://sitemaps.mixcloud.com/sitemap_userprofile_index.xml
sitemap https://sitemaps.mixcloud.com/sitemap_cloudcast_index.xml
sitemap https://sitemaps.mixcloud.com/sitemap_tag_index.xml
sitemap https://sitemaps.mixcloud.com/sitemap_playlist_index.xml
sitemap https://www.mixcloud.com/blog/sitemap_index.xml

Comments

  • Block OpenAI
  • Block Common Crawl
  • Block Perplexity
  • Block Webz.io
  • Block Google Extended
  • Block Anthropic
  • Block Cohere AI
  • Block Google-AI see https://blog.google/technology/ai/an-update-on-web-publisher-controls/
  • Facebook LLM see https://developers.facebook.com/docs/sharing/bot
  • Amazon Alexa see https://developer.amazon.com/amazonbot
  • Tiktok / ByteDance
  • Apple
  • imagesift.com
  • diffbot.com
  • https://timpi.io/