findmine.com
robots.txt

Robots Exclusion Standard data for findmine.com

Resource Scan

Scan Details

Site Domain findmine.com
Base Domain findmine.com
Scan Status Ok
Last Scan 2025-03-20T05:04:16+00:00
Next Scan 2025-04-19T05:04:16+00:00

Last Scan

Scanned 2025-03-20T05:04:16+00:00
URL https://findmine.com/robots.txt
Domain IPs 2406:da18:b3d:e201::65, 2406:da18:b3d:e202::65, 52.74.232.59, 52.76.120.174
Response IP 52.74.232.59
Found Yes
Hash 70cbdf03c0865ef7c4116f7124a077d00abb622262d2b7c5cbfc60f5f1532f10
SimHash e4a118d2c524
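The 64-hex-character Hash field above is consistent with a SHA-256 digest of the fetched robots.txt body, though the scanner does not state its exact input (raw bytes vs. normalized text). A minimal sketch of that kind of content hashing, using a stand-in body:

```python
# Sketch: reproducing a content hash like the "Hash" field above.
# Assumption: the scanner records the SHA-256 hex digest of the raw
# robots.txt body; the 64-hex-character length matches SHA-256, but
# the exact input is not documented here.
import hashlib

body = b"User-agent: *\nAllow: /\n"  # stand-in for the fetched robots.txt bytes
digest = hashlib.sha256(body).hexdigest()
print(len(digest))  # 64 hex characters, the same width as the Hash field
```

Re-running such a digest on each scan is what lets a scanner detect whether the file changed between visits without storing the full body.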

Groups

*

Rule Path
Allow /llms.txt
Allow /
Allow /blog/
Allow /case-studies/
Allow /about/
Allow /demo/
Allow /contact-us/
Disallow /dev/
Disallow /staging/
Disallow /test/
Disallow /.env
Disallow /.git/
Disallow /dist/
Disallow /src/
Disallow /.bolt/
Disallow /admin/
Disallow /internal/
Disallow /dashboard/
Disallow /coverage/
Disallow /search
Disallow /*?query=
Disallow /*?filter=
Disallow /*?sort=
Disallow /drafts/
Disallow /tmp/
Disallow /temp/
Disallow /print/
Disallow /pdf/
Disallow /amp/

Other Records

Field Value
crawl-delay 10
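The crawl-delay record above asks well-behaved clients to wait 10 seconds between requests. A minimal sketch of honoring it (the `polite_fetch` helper and its parameters are illustrative, not part of any standard API):

```python
# Sketch: spacing requests to respect the crawl-delay declared for "*".
import time

def polite_fetch(urls, fetch, delay=10):
    """Fetch each URL in turn, sleeping `delay` seconds between requests.

    `delay` defaults to the 10-second crawl-delay recorded above;
    `fetch` is any callable that takes a URL and returns a response.
    """
    results = []
    for i, url in enumerate(urls):
        if i:  # no sleep before the first request
            time.sleep(delay)
        results.append(fetch(url))
    return results
```

Note that Crawl-delay is a de facto extension, not part of RFC 9309; some crawlers (including Googlebot) ignore it.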

gptbot

Rule Path
Allow /llms.txt
Disallow /

chatgpt-user

Rule Path
Allow /llms.txt
Disallow /

google-extended

Rule Path
Allow /llms.txt
Disallow /

ccbot

Rule Path
Allow /llms.txt
Disallow /

anthropic-ai

Rule Path
Allow /llms.txt
Disallow /

claude-web

Rule Path
Allow /llms.txt
Disallow /

omgilibot

Rule Path
Allow /llms.txt
Disallow /

omgili

Rule Path
Allow /llms.txt
Disallow /

Other Records

Field Value
sitemap https://www.findmine.com/sitemap.xml
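The per-bot groups above follow one pattern: a blanket Disallow / with a single carve-out for /llms.txt. That pattern can be sanity-checked with Python's stdlib parser. The transcript below includes only the gptbot group and a simplified "*" group, because `urllib.robotparser` uses first-match (not longest-match) semantics, so only order-independent cases are asserted:

```python
# Sketch: checking the gptbot rules above with Python's stdlib parser.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Allow: /
Crawl-delay: 10

User-agent: gptbot
Allow: /llms.txt
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("gptbot", "https://findmine.com/llms.txt"))  # True
print(rp.can_fetch("gptbot", "https://findmine.com/blog/"))     # False
print(rp.crawl_delay("*"))                                      # 10
```

Because the Allow line precedes the Disallow line in each bot group, both first-match and longest-match parsers agree on the /llms.txt carve-out.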

Comments

  • robots.txt for FindMine Marketing Website
  • Last updated: March 2025
  • Allow all well-behaved bots
  • Explicitly allow access to llms.txt
  • Allow crawling of most content
  • Prevent crawling of development/staging areas
  • Prevent crawling of admin and internal areas
  • Prevent crawling of coverage reports
  • Prevent crawling of search results and filtered pages
  • Prevent crawling of temporary or draft content
  • Prevent indexing of duplicate content
  • Crawl-delay for rate limiting
  • Sitemap location
  • Special rules for specific bots
  • Block AI training crawlers but allow access to llms.txt