Robots Exclusion Standard data for dr-weinberg.com

Resource Scan

Scan Details

Site Domain dr-weinberg.com
Base Domain dr-weinberg.com
Scan Status Ok
Last Scan 2025-11-07T11:56:30+00:00
Next Scan 2025-12-07T11:56:30+00:00

Last Scan

Scanned 2025-11-07T11:56:30+00:00
URL https://dr-weinberg.com/robots.txt
Domain IPs 104.26.2.182, 104.26.3.182, 172.67.74.174, 2606:4700:20::681a:2b6, 2606:4700:20::681a:3b6, 2606:4700:20::ac43:4aae
Response IP 172.67.74.174
Found Yes
Hash c4af0955e7bdd798d46c5ff81b9c88d79fa04565da34b325e0fbb99a0a62d809
SimHash 7077d14166f8

Groups

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
Disallow /wp-login.php
Disallow /wp-signup.php
Disallow /wp-register.php
Disallow /xmlrpc.php
Disallow /trackback/
Disallow /wp-config.php
Disallow /wp-settings.php
Disallow /search/
Disallow /*?s=
Disallow /*%26s%3D
Disallow /*?p=*
Disallow /*?attachment_id=*
Disallow /*?replytocom=*
Disallow /*?preview=*
Disallow /feed/
Disallow */feed/
Disallow */comments/feed/
Disallow /*/*/feed/
Disallow /author/
Disallow /tag/
Disallow /page/*/
Disallow /*?year=*
Disallow /*?monthnum=*
Disallow /*?day=*
Disallow /*?hour=*
Disallow /*?minute=*
Disallow /*?second=*
Disallow /*?w=*
Disallow /*?m=*
Allow /wp-content/uploads/
Allow /wp-content/themes/
Allow /wp-content/plugins/
Allow /wp-includes/
Allow *.css$
Allow *.js$
Allow *.jpg$
Allow *.jpeg$
Allow *.png$
Allow *.gif$
Allow *.webp$
Allow *.svg$
Allow *.woff$
Allow *.woff2$
Allow *.ttf$
Allow *.eot$
Allow /articles/
Allow /surgeries/
Allow /surgeries_cat/
Allow *.pdf$

Other Records

Field Value
crawl-delay 5
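
The `*` group above mixes Allow and Disallow rules with `*` wildcards and `$` end anchors, so the outcome for a given URL depends on which rule takes precedence. The following is a minimal sketch of one common interpretation (longest matching pattern wins, Allow wins a tie, as described in RFC 9309), not the logic of any particular crawler. The function names `pattern_to_regex` and `is_allowed` are hypothetical, only a subset of the rules is included, and percent-encoding normalization is left out.

```python
import re


def pattern_to_regex(path_pattern):
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of ("allow" | "disallow", pattern) pairs.
    The longest matching pattern wins; on a tie, allow wins.
    A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if not pattern:  # an empty Disallow matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


# A small subset of the '*' group (assumption: the remaining rules behave the same way).
RULES = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
    ("disallow", "/*?s="),
    ("disallow", "/feed/"),
    ("allow", "*.pdf$"),
]

print(is_allowed(RULES, "/wp-admin/options.php"))     # False
print(is_allowed(RULES, "/wp-admin/admin-ajax.php"))  # True
print(is_allowed(RULES, "/?s=rhinoplasty"))           # False
print(is_allowed(RULES, "/articles/"))                # True
```

Under this interpretation, `/wp-admin/admin-ajax.php` stays reachable because its Allow pattern is longer than the `/wp-admin/` Disallow. Crawlers that do not implement wildcards or longest-match precedence may evaluate the same file differently.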

googlebot

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

googlebot-image

Rule Path
Allow /

googlebot-mobile

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

adsbot-google

Rule Path
Disallow (empty path: nothing is blocked)

adsbot-google-mobile

Rule Path
Disallow (empty path: nothing is blocked)

bingbot

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

yandex

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

Other Records

Field Value
crawl-delay 5

baiduspider

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

Other Records

Field Value
crawl-delay 10

oai-searchbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

chatgpt-user/2.0

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

youbot

Rule Path
Allow /

neevabot

Rule Path
Allow /

phind

Rule Path
Allow /

komobot

Rule Path
Allow /

quora-result-bot

Rule Path
Allow /

gptbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

facebookexternalhit

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

bytespider

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

getintentcrawler

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

moz

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

zoombot

Rule Path
Disallow /

seolyticscrawler

Rule Path
Disallow /
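
The groups listed above apply per crawler: a bot normally uses the group that names it and falls back to `*` only when no group matches. The sketch below illustrates that selection step under a simplifying assumption (exact, case-insensitive match on the product token). `select_group` and the `GROUPS` dictionary are hypothetical names, and only a few of the scanned groups are reproduced.

```python
def select_group(groups, user_agent_token):
    """Return the rule list for a crawler.

    Uses the group whose name equals the crawler's product token
    (case-insensitive); otherwise falls back to the '*' group.
    """
    token = user_agent_token.lower()
    if token in groups:
        return groups[token]
    return groups.get("*", [])


# Subset of the scanned groups, as ("allow" | "disallow", path) pairs.
GROUPS = {
    "*": [("disallow", "/wp-admin/"), ("allow", "/wp-admin/admin-ajax.php")],
    "googlebot": [("disallow", "/wp-admin/"), ("allow", "/wp-admin/admin-ajax.php")],
    "oai-searchbot": [("allow", "/")],
    "gptbot": [("disallow", "/")],
    "adsbot-google": [("disallow", "")],  # empty path: nothing blocked
    "facebookexternalhit": [],            # no rules: all paths allowed
}

print(select_group(GROUPS, "GPTBot"))         # [('disallow', '/')]
print(select_group(GROUPS, "OAI-SearchBot"))  # [('allow', '/')]
print(select_group(GROUPS, "SomeOtherBot") == GROUPS["*"])  # True
```

One consequence of this convention is that a crawler matching a named group, such as `googlebot`, would apply only that group's rules and not the longer `*` list, so the query-parameter and feed rules in `*` would not affect it.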

Other Records

Field Value
sitemap https://dr-weinberg.com/sitemap_index.xml

Comments

  • Robots.txt for Dr. Avi Weinberg - Plastic Surgery Website
  • Last Updated: July 2025
  • Website: https://dr-weinberg.com
  • =================================================
  • DEFAULT RULES - ALL SEARCH ENGINES
  • WordPress Core Protection
  • Search and Query Parameters
  • Feed URLs
  • Archive and Pagination
  • Allow Essential Resources (No wildcards in paths)
  • Medical Content - Allow (Important for medical SEO)
  • Hebrew content is allowed by default
  • PDF Documents - Allow (Medical information)
  • Crawl Delay (for bots that support it)
  • =================================================
  • GOOGLE SPECIFIC
  • =================================================
  • Note: Google ignores crawl-delay directive
  • Critical for before/after photos visibility
  • =================================================
  • OTHER MAJOR SEARCH ENGINES
  • =================================================
  • Note: Bing ignores crawl-delay directive
  • =================================================
  • AI SEARCH ENGINES - ALLOWED (Medical Visibility)
  • =================================================
  • OpenAI Search
  • ChatGPT Browsing (User-initiated, not training)
  • Perplexity AI Search
  • You.com Search
  • Additional AI Search Engines
  • =================================================
  • AI TRAINING BOTS - BLOCKED
  • =================================================
  • =================================================
  • SEO TOOLS & SCRAPERS - BLOCKED
  • =================================================
  • =================================================
  • SITEMAPS
  • =================================================
  • =================================================
  • IMPLEMENTATION NOTES
  • =================================================
  • 1. WordPress structure protection with rendering support
  • 2. All content is indexable (per site analysis)
  • 3. Special allowance for medical content and images
  • 4. Hebrew URLs supported (allowed by default)
  • 5. Before/after photos allowed (by default)
  • 6. AI search visibility for medical queries
  • 7. AI training data collection blocked
  • 8. Lower crawl-delay (5) for smaller site
  • 9. No wildcards in path segments
  • 10. Font files explicitly allowed
  • Based on July 2025 analysis: 306/306 pages indexable
  • No non-indexable pages identified
  • =================================================