thespoken.cc
robots.txt

Robots Exclusion Standard data for thespoken.cc

Resource Scan

Scan Details

Site Domain thespoken.cc
Base Domain thespoken.cc
Scan Status Ok
Last Scan 2025-10-05T18:16:09+00:00
Next Scan 2025-10-12T18:16:09+00:00

Last Scan

Scanned 2025-10-05T18:16:09+00:00
URL https://thespoken.cc/robots.txt
Domain IPs 162.159.135.42
Response IP 162.159.135.42
Found Yes
Hash c4b487af015ad9f3f88bfa013d82086ddb0cc0f023b72d593cfbff75f355a344
SimHash 562a5c406439
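
The Hash value above is 64 hexadecimal digits, which matches the length of a SHA-256 digest, although the report does not name the algorithm. Assuming it is a SHA-256 digest of the raw robots.txt body, a minimal sketch of detecting a change between scans could look like the following. The function names `body_fingerprint` and `has_changed` are hypothetical and only illustrate the idea.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return body_fingerprint(body) != previous_hash.lower()
```

An exact digest changes on any byte difference, which is presumably why the report also lists a SimHash for approximate comparison. The SimHash variant used here is not documented, so it is not sketched.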

Groups

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

curl

Rule Path
Disallow /

wget

Rule Path
Disallow /

yandex

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /wp-includes/
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /wp-json/
Disallow /refer/
Disallow /cgi-bin/
Disallow /trackback/
Disallow /*/trackback/
Disallow /xmlrpc.php
Disallow /wp-config.php
Disallow /.htaccess
Disallow /author/
Disallow /?author=*
Disallow /feed/
Disallow /*/feed/
Disallow /?s=
Disallow /*?*s=
Disallow /*?*p=
Disallow /*?attachment_id=*
Disallow /*?utm_source=*
Disallow /*?utm_medium=*
Disallow /*?utm_campaign=*
Disallow /*?utm_term=*
Disallow /*?utm_content=*
Disallow /*?replytocom
Disallow /*.php$
Disallow /*.cgi$
Allow /wp-content/uploads/
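
To see how the Allow and Disallow rules in the `*` group interact, the sketch below evaluates a few paths against a subset of the rules listed above. It assumes longest-match precedence with `*` and `$` wildcards, as described in RFC 9309 and used by the major search engines; individual crawlers may behave differently. The helper names `pattern_to_regex` and `is_allowed` are hypothetical.

```python
import re

# Subset of the "*" group rules from the listing above, as (rule, path) pairs.
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/*/feed/"),
    ("disallow", "/*?utm_source=*"),
    ("disallow", "/*.php$"),
    ("allow", "/wp-content/uploads/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex matched from the path start."""
    anchored = pattern.endswith("$")  # a trailing "$" pins the match to the end of the path
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed


for path in ("/wp-admin/admin-ajax.php", "/wp-admin/options.php",
             "/wp-content/uploads/photo.jpg", "/blog/post/feed/", "/about/"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions, `/wp-admin/admin-ajax.php` stays reachable because the Allow pattern is longer than both `/wp-admin/` and `/*.php$`, while other files under `/wp-admin/` remain blocked. A parser that applies the first matching rule instead of the longest one would block that path, since the Disallow line comes first.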

Other Records

Field Value
crawl-delay 5
sitemap https://www.thespoken.cc/sitemap_index.xml

Comments

  • WordPress robots.txt - Enhanced Version
  • Last updated: August 12, 2025
  • ALLOW MAJOR SEARCH ENGINES
  • BLOCK SEO TOOLS & SCRAPERS
  • BLOCK AI CRAWLERS
  • User-Agent: ClaudeBot
  • Disallow: /
  • User-Agent: Anthropic-AI
  • Disallow: /
  • User-Agent: Claude-Web
  • Disallow: /
  • BLOCK COMMON SCRAPING TOOLS
  • BLOCK AGGRESSIVE SEARCH BOTS
  • GENERAL RULES FOR ALL OTHER BOTS
  • SITEMAP