khamaleonlab.com
robots.txt

Robots Exclusion Standard data for khamaleonlab.com

Resource Scan

Scan Details

Site Domain khamaleonlab.com
Base Domain khamaleonlab.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-09-24T10:24:27+00:00
Next Scan 2025-10-01T10:24:27+00:00
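
The latest scan failed at the fetch stage with a client error (presumably a 4xx status). A minimal sketch of reproducing the check with Python's standard library, assuming only that the scanner fetches the URL below:

    import urllib.error
    import urllib.request

    URL = "https://khamaleonlab.com/robots.txt"

    try:
        with urllib.request.urlopen(URL, timeout=10) as resp:
            print("OK:", resp.status, len(resp.read()), "bytes")
    except urllib.error.HTTPError as e:
        # A 4xx code here matches the scanner's "client error" failure reason;
        # a 5xx code would be a server error instead.
        print("HTTP error:", e.code, e.reason)
    except urllib.error.URLError as e:
        print("Network error:", e.reason)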

Last Successful Scan

Scanned 2025-09-09T06:50:09+00:00
URL https://khamaleonlab.com/robots.txt
Domain IPs 2a02:4780:38:6d36:b26:f65c:d7fc:7578, 2a02:4780:39:8300:360e:5b08:b130:1da4, 77.37.66.227, 77.37.75.189
Response IP 77.37.115.207
Found Yes
Hash b3e675aa5f317417a5f855ec8affbaa46c01c0c242a25dd3044a171fe5fc2f46
SimHash 21181617ccf1
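
The 64-hex-character Hash is consistent with SHA-256, though the scanner does not name its algorithm. A sketch of verifying a freshly fetched copy against the recorded value, under that assumption:

    import hashlib
    import urllib.request

    EXPECTED = "b3e675aa5f317417a5f855ec8affbaa46c01c0c242a25dd3044a171fe5fc2f46"

    with urllib.request.urlopen("https://khamaleonlab.com/robots.txt", timeout=10) as resp:
        body = resp.read()

    # Assumes the scanner hashes the raw response body with SHA-256.
    digest = hashlib.sha256(body).hexdigest()
    print("match" if digest == EXPECTED else "differs: " + digest)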

Groups

*

Product  Comment
*        Applies to all web crawlers

Rule   Path
Allow  /assets/css/
Allow  /assets/js/
Allow  /assets/fonts/
Allow  /assets/images/
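
Because the group contains only Allow rules and no Disallow, a standards-following crawler may fetch any path on the site; the Allow lines are effectively precautionary. A sketch checking this with Python's urllib.robotparser, fed the recorded rules directly:

    import urllib.robotparser

    rules = """\
    User-agent: *
    Allow: /assets/css/
    Allow: /assets/js/
    Allow: /assets/fonts/
    Allow: /assets/images/
    """

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(rules.splitlines())

    # With no Disallow rules in the group, every path is fetchable.
    print(rp.can_fetch("*", "/assets/css/site.css"))  # True
    print(rp.can_fetch("*", "/anything-else"))        # also True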

Other Records

Field    Value
Sitemap  https://khamaleonlab.com/sitemap.xml
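
A sketch of listing the page URLs the declared sitemap exposes, assuming it exists and follows the standard sitemaps.org schema (a sitemap index file, which uses <sitemap> entries instead of <url>, is not handled here):

    import urllib.request
    import xml.etree.ElementTree as ET

    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    with urllib.request.urlopen("https://khamaleonlab.com/sitemap.xml", timeout=10) as resp:
        root = ET.fromstring(resp.read())

    # Each <url><loc> element holds one canonical page URL.
    for loc in root.findall("sm:url/sm:loc", NS):
        print(loc.text)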

Comments

  • ==============================================================================
  • robots.txt for KhamaleonLab (https://khamaleonlab.com)
  • Generated on: 2025-06-26
  • ==============================================================================
  • --- PUBLIC ASSETS ---
  • It's generally good practice to allow crawlers access to your public CSS, JS,
  • and image files, since these files help crawlers understand and render your pages.
  • Adjust these if your public asset paths are different.
  • Allow: /uploads/ # Uncomment if you have a public 'uploads' directory that should be crawled
  • --- DISALLOW SPECIFIC PUBLIC URL PATHS ---
  • Add 'Disallow' rules for any public URLs or URL patterns within your
  • `public_html` directory that you do NOT want crawlers to access or index.
  • This is typically for pages like internal search results, filtered results
  • that might create duplicate content, or specific campaign landing pages
  • you don't want indexed directly.
  • For sensitive access points (e.g., admin login pages):
  • KS-Semilla encourages using unique, non-guessable slugs for such pages,
  • configured within `config/reg_app_pages...`. Since these slugs should be
  • unique to your site and not publicly linked from areas you want crawled,
  • explicitly listing them in robots.txt (even with Disallow) might inadvertently
  • reveal their existence if the file is probed.
  • The primary protection for such areas should be robust authentication, authorization,
  • and the non-predictability of their access slugs if obscurity is also a goal.
  • If, despite using a unique slug, you still wish to explicitly Disallow it,
  • you can add it below, ensuring the path matches your unique public slug.
  • Examples of what you MIGHT disallow:
  • Disallow: /search-results? # Often good to disallow internal search result pages
  • Disallow: /*?filter= # Example: Disallow URLs with specific filter parameters
  • Disallow: /internal-tool/ # If you have a public-facing tool for internal use only
  • Disallow: /specific-campaign-landing-page-to-hide$
  • --- SITEMAP ---
  • Provide the full, absolute URL to your sitemap.xml file.
  • Uncomment and update the URL below once your sitemap is generated and available.
  • --- CRAWL DELAY (Optional - Use with extreme caution) ---
  • It's generally recommended NOT to set a crawl-delay. Major crawlers like
  • Googlebot manage their crawl rate automatically. Setting this can unnecessarily
  • slow down the indexing of your site.
  • Crawl-delay: 1
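
If the commented-out Crawl-delay directive above were enabled, crawlers that honor the field would see it; Googlebot, as the file's comments suggest, manages its own rate and ignores Crawl-delay entirely. A sketch of how Python's urllib.robotparser (3.6+) surfaces the value:

    import urllib.robotparser

    rules = """\
    User-agent: *
    Crawl-delay: 1
    """

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(rules.splitlines())

    # Returns the delay in seconds for the matching agent, or None if unset.
    print(rp.crawl_delay("*"))  # 1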