Semble: Fast, Accurate Code Retrieval with 98% Fewer Tokens

Stephan and Thomas from MinishLab recently open-sourced Semble, a new approach to code search designed specifically for large language model (LLM) agents. The project addresses a common pain point when using AI coding assistants like Claude Code: the tendency to fall back on inefficient methods like grep that consume excessive tokens while often missing relevant information.

How Semble Works

Semble combines four techniques:

  • Static Model2Vec embeddings: Using a compact 16M-parameter model (potion-code-16M) to create vector representations of code
  • BM25: A lexical ranking function that favors terms appearing frequently in a document but rarely across the entire corpus
  • Reciprocal Rank Fusion (RRF): Merging the ranked lists from the different retrieval methods into a single ranking for improved accuracy
  • Code-aware reranking: Applying additional signals specific to programming language constructs

This architecture allows Semble to operate entirely on CPUs, making it accessible without requiring GPUs or specialized hardware.
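
To make the BM25 and rank-fusion steps concrete, here is a minimal pure-Python sketch. It is not Semble's implementation: the function names, the toy corpus, the hard-coded "dense" ranking standing in for the embedding retriever, and the constants (k1=1.5, b=0.75, and k=60 for RRF) are all illustrative assumptions, chosen because they are common defaults.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with BM25.

    Uses the log(1 + ...) IDF variant, which is always non-negative.
    k1 and b are common defaults, not values taken from Semble.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per item."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return [doc_id for doc_id, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


# Toy corpus of three "code chunks", already split into tokens.
docs = [
    "def parse config file path".split(),
    "def load user session token".split(),
    "class config loader yaml schema".split(),
]
query = "parse config".split()

scores = bm25_scores(query, docs)
lexical_ranking = sorted(range(len(docs)), key=lambda i: (-scores[i], i))

# Hypothetical ranking from an embedding retriever (hard-coded for illustration).
dense_ranking = [2, 1, 0]

fused_ranking = rrf([lexical_ranking, dense_ranking])
print(lexical_ranking, fused_ranking)
```

RRF works only on rank positions, so the BM25 scores and embedding similarities never need to be put on a common scale. That is one reason it is a popular way to combine lexical and vector retrieval.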

Performance Benchmarks

In the project's benchmarks across 63 repositories and 19 languages:

  • Semble uses 98% fewer tokens than the traditional approach of running grep and then reading entire files
  • It achieves 99% of the retrieval quality of much larger transformer models (up to 137M parameters)
  • Indexing a typical repository takes only 250 milliseconds
  • Queries are processed in just 1.5 milliseconds on CPU

These results suggest that Semble retains nearly all of the retrieval quality of much larger models while sharply reducing token usage and latency.

Key Features

  • Token-efficient: Dramatically reduces token usage for cost savings and faster responses
  • Fast indexing and querying: Enables real-time code search across large projects
  • High accuracy: Delivers retrieval quality close to that of much larger transformer models
  • Zero configuration: No API keys or complex setup required
  • Drop-in replacement: Compatible with existing AI coding workflows via MCP server

Getting Started

Users can install Semble directly into Claude Code with a single command:

claude mcp add semble -s user -- uvx --from "semble[mcp]" semble

The team encourages users to explore the full documentation, benchmarks, and methodology on their GitHub repository: https://github.com/MinishLab/semble

Semble represents a promising advancement in code search technology that could significantly improve the efficiency and effectiveness of AI-powered development tools.