Table of Contents

  1. Features
  2. Installation
  3. Quick Start
  4. Examples
  5. Brute-Force Index
  6. HNSW Index
  7. Thread-Safe Index
  8. Benchmark Results
  9. API Reference
  10. Development & CI
  11. GPU Acceleration
  12. Documentation
  13. Contributing
  14. License

Annie Examples

Basic Usage

import numpy as np
from rust_annie import AnnIndex, Distance

dim = 128    # vector dimension
size = 1000  # number of vectors

# Create index
index = AnnIndex(dim, Distance.EUCLIDEAN)

# Generate and add data
data = np.random.rand(size, dim).astype(np.float32)
ids = np.arange(size, dtype=np.int64)
index.add(data, ids)

# Single query
query = np.random.rand(dim).astype(np.float32)
neighbor_ids, distances = index.search(query, k=5)
print(neighbor_ids, distances)

# Batch queries
queries = np.random.rand(10, dim).astype(np.float32)
batch_ids, batch_dists = index.search_batch(queries, k=3)
print(batch_ids.shape, batch_dists.shape)

## Filtered Search

```python
import numpy as np
from rust_annie import AnnIndex, Distance

# Create index with sample data
index = AnnIndex(3, Distance.EUCLIDEAN)
data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0]
], dtype=np.float32)
ids = np.array([10, 21, 30], dtype=np.int64)
index.add(data, ids)

# Define filter function: keep only even IDs
def even_ids(id: int) -> bool:
    return id % 2 == 0

# Filtered search
query = np.array([1.0, 2.0, 3.0], dtype=np.float32)
filtered_ids, filtered_dists = index.search_filter_py(query, k=3, filter_fn=even_ids)
print(filtered_ids)
# Only IDs 10 and 30 will be returned (21 is odd)
```

## HNSW Index

```python
import numpy as np
from rust_annie import PyHnswIndex

dim = 128      # vector dimension
size = 100000  # number of vectors

# Create HNSW index
index = PyHnswIndex(dims=dim)

# Add large dataset
data = np.random.rand(size, dim).astype(np.float32)
ids = np.arange(size, dtype=np.int64)
index.add(data, ids)

# Fast approximate search
query = np.random.rand(dim).astype(np.float32)
neighbor_ids, _ = index.search(query, k=10)
print(neighbor_ids)
```

## Saving and Loading

```python
import numpy as np
from rust_annie import AnnIndex, Distance

# Create and save index
index = AnnIndex(64, Distance.COSINE)
data = np.random.rand(500, 64).astype(np.float32)
ids = np.arange(500, dtype=np.int64)
index.add(data, ids)
index.save("my_index")

# Load index
loaded_index = AnnIndex.load("my_index")
```

Thread-safe Operations

import numpy as np
from rust_annie import ThreadSafeAnnIndex, Distance
from concurrent.futures import ThreadPoolExecutor

index = ThreadSafeAnnIndex(256, Distance.MANHATTAN)

# Concurrent writes
with ThreadPoolExecutor() as executor:
    write_futures = []
    for i in range(10):
        data = np.random.rand(100, 256).astype(np.float32)
        ids = np.arange(i*100, (i+1)*100, dtype=np.int64)
        write_futures.append(executor.submit(index.add, data, ids))

# Surface any exception raised inside a worker thread
for f in write_futures:
    f.result()

# Concurrent reads
with ThreadPoolExecutor() as executor:
    futures = []
    for _ in range(100):
        query = np.random.rand(256).astype(np.float32)
        futures.append(executor.submit(index.search, query, k=3))

    results = [f.result() for f in futures]

Minkowski Distance

import numpy as np
from rust_annie import AnnIndex

# Create index with custom distance
index = AnnIndex.new_minkowski(dim=64, p=2.5)
data = np.random.rand(200, 64).astype(np.float32)
ids = np.arange(200, dtype=np.int64)
index.add(data, ids)

# Search with Minkowski distance
query = np.random.rand(64).astype(np.float32)
neighbor_ids, dists = index.search(query, k=5)
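
For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of the textbook Minkowski distance of order `p`. It only illustrates the formula; the helper name `minkowski` is hypothetical, and whether Annie normalizes or otherwise post-processes the value is not stated here.

```python
def minkowski(a, b, p):
    """Textbook Minkowski distance of order p between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1 for a valid metric")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


a = [0.0, 0.0]
b = [3.0, 4.0]

print(minkowski(a, b, 1))    # 7.0, identical to the Manhattan distance
print(minkowski(a, b, 2))    # 5.0, identical to the Euclidean distance
print(minkowski(a, b, 2.5))  # a value between 4.0 and 5.0
```

Orders `p = 1` and `p = 2` reduce to Manhattan and Euclidean distance; larger `p` moves the result toward the largest per-coordinate difference.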

README

Annie

A lightning-fast, Rust-powered Approximate Nearest Neighbor library for Python with multiple backends, thread-safety, and GPU acceleration.

Features

Installation

# Stable release from PyPI:
pip install rust-annie

# Install with GPU support (requires CUDA):
pip install "rust-annie[gpu]"

# Or install from source:
git clone https://github.com/Programmers-Paradise/Annie.git
cd Annie
pip install maturin
maturin develop --release

Quick Start

Brute-Force Index

import numpy as np
from rust_annie import AnnIndex, Distance

# Create index
index = AnnIndex(128, Distance.EUCLIDEAN)

# Add data
data = np.random.rand(1000, 128).astype(np.float32)
ids = np.arange(1000, dtype=np.int64)
index.add(data, ids)

# Search
query = np.random.rand(128).astype(np.float32)
neighbor_ids, distances = index.search(query, k=5)

HNSW Index

import numpy as np
from rust_annie import PyHnswIndex

index = PyHnswIndex(dims=128)
data = np.random.rand(10000, 128).astype(np.float32)
ids = np.arange(10000, dtype=np.int64)
index.add(data, ids)

# Search
query = np.random.rand(128).astype(np.float32)
neighbor_ids, _ = index.search(query, k=10)
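
Because HNSW search is approximate, its neighbors can differ from the exact ones. A common way to quantify this is recall@k: the fraction of the exact top-k IDs that the approximate search also returned. The sketch below assumes you already have both ID lists for the same query, for example one from `PyHnswIndex` and one from `AnnIndex` over the same data. The helper `recall_at_k` and the sample IDs are hypothetical.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbor IDs that also appear in the approximate result."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    hits = len(exact & set(approx_ids))
    return hits / len(exact)


# Made-up IDs for illustration only
exact_ids = [4, 17, 23, 42, 8]
approx_ids = [4, 23, 42, 8, 99]

print(recall_at_k(approx_ids, exact_ids))  # 0.8
```

Averaging this value over many queries gives a single quality figure to weigh against query time.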

Examples

Brute-Force Index

from rust_annie import AnnIndex, Distance
import numpy as np

# Create index
idx = AnnIndex(4, Distance.COSINE)

# Add data
data = np.random.rand(50, 4).astype(np.float32)
ids = np.arange(50, dtype=np.int64)
idx.add(data, ids)

# Search
labels, dists = idx.search(data[10], k=3)
print(labels, dists)

Batch Query

from rust_annie import AnnIndex, Distance
import numpy as np

# Create index
idx = AnnIndex(16, Distance.EUCLIDEAN)

# Add data
data = np.random.rand(1000, 16).astype(np.float32)
ids = np.arange(1000, dtype=np.int64)
idx.add(data, ids)

# Batch search
queries = data[:32]
labels_batch, dists_batch = idx.search_batch(queries, k=10)
print(labels_batch.shape)  # (32, 10)

Thread-Safe Index

from rust_annie import ThreadSafeAnnIndex, Distance
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Create thread-safe index
idx = ThreadSafeAnnIndex(32, Distance.EUCLIDEAN)

# Add data
data = np.random.rand(500, 32).astype(np.float32)
ids = np.arange(500, dtype=np.int64)
idx.add(data, ids)

# Concurrent searches
def task(q):
    return idx.search(q, k=5)

queries = np.random.rand(100, 32).astype(np.float32)
with ThreadPoolExecutor(max_workers=8) as executor:
    futures = [executor.submit(task, q) for q in queries]
    for f in futures:
        print(f.result())

Filtered Search

from rust_annie import AnnIndex, Distance
import numpy as np

# Create index
index = AnnIndex(3, Distance.EUCLIDEAN)
data = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0]
], dtype=np.float32)
ids = np.array([10, 21, 30], dtype=np.int64)
index.add(data, ids)

# Filter function: keep only even IDs
def even_ids(id: int) -> bool:
    return id % 2 == 0

# Filtered search
query = np.array([1.0, 2.0, 3.0], dtype=np.float32)
filtered_ids, filtered_dists = index.search_filter_py(
    query,
    k=3,
    filter_fn=even_ids
)
print(filtered_ids)  # IDs 10 and 30 (21 is odd and filtered out)
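
To make the idea of a filtered search concrete, here is a pure-Python sketch that applies an ID predicate before ranking by Euclidean distance. This is an assumption about the general technique, not a description of how `search_filter_py` is implemented; the function `filtered_search` and its parameters are hypothetical.

```python
import math


def filtered_search(vectors, ids, query, k, filter_fn):
    """Return the k nearest (ids, distances) among vectors whose ID passes filter_fn."""
    candidates = [
        (math.dist(vec, query), vec_id)
        for vec, vec_id in zip(vectors, ids)
        if filter_fn(vec_id)  # rejected IDs never enter the ranking
    ]
    candidates.sort()
    top = candidates[:k]
    return [vec_id for _, vec_id in top], [dist for dist, _ in top]


vectors = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
ids = [10, 21, 30]

found_ids, found_dists = filtered_search(
    vectors, ids, query=[1.0, 2.0, 3.0], k=3, filter_fn=lambda i: i % 2 == 0
)
print(found_ids)  # [10, 30]
```

Note that fewer than `k` results come back when the predicate rejects most candidates.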

Build and Query a Brute-Force AnnIndex in Python (Complete Example)

This section demonstrates a complete, beginner-friendly example of how to build and query a brute-force AnnIndex using Python.
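
As a conceptual companion, the following pure-Python sketch shows what an exact brute-force index does: it stores every vector and, at query time, scores all of them and keeps the `k` closest. It is not Annie's implementation. The class `BruteForceSketch` is hypothetical and only mirrors the `add`/`search` shape used elsewhere in this document, with Euclidean distance assumed.

```python
import heapq
import math


class BruteForceSketch:
    """Toy exact k-NN index: linear scan over all stored vectors."""

    def __init__(self, dim):
        self.dim = dim
        self._vectors = []
        self._ids = []

    def add(self, vectors, ids):
        if len(vectors) != len(ids):
            raise ValueError("vectors and ids must have the same length")
        for vec, vec_id in zip(vectors, ids):
            if len(vec) != self.dim:
                raise ValueError(f"expected dimension {self.dim}, got {len(vec)}")
            self._vectors.append(list(vec))
            self._ids.append(vec_id)

    def search(self, query, k):
        if len(query) != self.dim:
            raise ValueError(f"expected dimension {self.dim}, got {len(query)}")
        scored = (
            (math.dist(vec, query), vec_id)
            for vec, vec_id in zip(self._vectors, self._ids)
        )
        nearest = heapq.nsmallest(k, scored)  # k smallest distances, ascending
        return [vec_id for _, vec_id in nearest], [dist for dist, _ in nearest]


index = BruteForceSketch(dim=2)
index.add([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]], ids=[100, 101, 102])

neighbor_ids, distances = index.search([0.9, 0.1], k=2)
print(neighbor_ids)  # [101, 100]
```

The linear scan is why brute-force results are exact, and also why query time grows with the number of stored vectors.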

Benchmark Results

Measured on a 6-core CPU:

That’s a ~4× speedup vs. NumPy!

| Operation | Dataset Size | Time (ms) | Speedup vs Python |
|---|---|---|---|
| Single Query (Brute) | 10,000 × 64 | 0.7 | |
| Batch Query (64) | 10,000 × 64 | 0.23 | 12× |
| HNSW Query | 100,000 × 128 | 0.05 | 56× |
View Full Benchmark Dashboard →
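
If you want to take comparable measurements yourself, a small timing harness is enough. The sketch below is not the project's benchmark script; `median_ms`, `speedup`, and the two stand-in workloads are hypothetical. In practice you would pass in a callable that runs an index query and one that runs your baseline.

```python
import statistics
import time


def median_ms(fn, repeats=20):
    """Run fn() `repeats` times and return the median wall-clock time in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Stand-in workloads; replace with a baseline search and an index search.
def baseline():
    return sum(i * i for i in range(50_000))


def candidate():
    return sum(i * i for i in range(5_000))


base_ms = median_ms(baseline)
cand_ms = median_ms(candidate)
print(f"baseline {base_ms:.3f} ms, candidate {cand_ms:.3f} ms, "
      f"speedup {speedup(base_ms, cand_ms):.1f}x")
```

Using the median rather than the mean keeps a single slow run (for example, a cold cache) from skewing the figure.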

You’ll find:

API Reference

AnnIndex

Create a brute-force k-NN index.

Enum: Distance.EUCLIDEAN, Distance.COSINE, Distance.MANHATTAN
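
For reference, the textbook definitions behind these three names can be sketched in pure Python. The function names are hypothetical, and Annie's exact conventions (for instance, whether cosine is reported as `1 - similarity`, or whether any value is squared) are not specified in this document, so treat the following as an assumption.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (L1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; undefined for zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


a = [1.0, 0.0]
b = [0.0, 1.0]

print(euclidean(a, b))        # sqrt(2)
print(manhattan(a, b))        # 2.0
print(cosine_distance(a, b))  # 1.0 (orthogonal vectors)
```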

ThreadSafeAnnIndex

Same API as AnnIndex, safe for concurrent use.
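
The general pattern behind a thread-safe wrapper can be sketched with a lock that guards every call into a non-thread-safe inner object. Annie's own synchronization lives in Rust and is not described here, so the classes `ToyIndex` and `LockedIndex` below are hypothetical and only illustrate the idea.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyIndex:
    """Stand-in for a non-thread-safe index that just records IDs."""

    def __init__(self):
        self.ids = []

    def add(self, ids):
        self.ids.extend(ids)

    def size(self):
        return len(self.ids)


class LockedIndex:
    """Wrapper that serializes all access to the inner index with one lock."""

    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def add(self, ids):
        with self._lock:
            self._inner.add(ids)

    def size(self):
        with self._lock:
            return self._inner.size()


index = LockedIndex(ToyIndex())


def write_batch(batch):
    index.add(list(range(batch * 100, (batch + 1) * 100)))


with ThreadPoolExecutor(max_workers=8) as executor:
    # list() forces iteration so worker exceptions are re-raised here
    list(executor.map(write_batch, range(10)))

print(index.size())  # 1000
```

A single mutex is the simplest choice; a read-write lock would let searches run in parallel while still serializing writes.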

Core Classes

| Class | Description |
|---|---|
| AnnIndex | Brute-force exact search |
| PyHnswIndex | Approximate HNSW index |
| ThreadSafeAnnIndex | Thread-safe wrapper for AnnIndex |
| Distance | Distance metrics (Euclidean, Cosine, etc.) |

Key Methods

| Method | Description |
|---|---|
| add(data, ids) | Add vectors to index |
| search(query, k) | Single query search |
| search_batch(queries, k) | Batch query search |
| search_filter_py(query, k, filter_fn) | Filtered search |
| save(path) | Save index to disk |
| load(path) | Load index from disk |

Development & CI

CI runs on GitHub Actions, building wheels on Linux, Windows, macOS, plus:

# Run tests
cargo test
pytest tests/

# Run benchmarks
python scripts/benchmark.py
python scripts/batch_benchmark.py

# Generate documentation
mkdocs build

CI pipeline includes:

- Cross-platform builds (Linux, Windows, macOS)
- Unit tests and integration tests
- Performance benchmarking
- Documentation generation

Benchmark Automation

Benchmarks are tracked over time using:

GPU Acceleration

Enable GPU in Rust

Enable CUDA support for brute-force calculations:

# Install with GPU support
pip install "rust-annie[gpu]"

# Or build from source with GPU features
maturin develop --release --features gpu

Supported operations:

- Batch L2 distance calculations
- High-dimensional similarity search

Requirements:

- NVIDIA GPU with CUDA support
- CUDA Toolkit installed

Contributing

Contributions are welcome! Please:

See the main CONTRIBUTING guide for details.

License

This project is licensed under the MIT License. See LICENSE for details.