All benchmarks run on Apple M-series (ARM64), single-threaded unless noted. HNSW parameters: M=16, ef_construction=200. Criterion benchmarks use release mode; recall metrics use debug mode (conservative).

| Dataset | Metric | ef_search | Recall@10 | Reachability |
|---|---|---|---|---|
| 1K D=32 | L2 | 50 | 1.000 | 100% |
| 1K D=128 | L2 | 50 | 0.977 | 100% |
| 10K D=128 | L2 | 200 | 0.966 | 100% |
| 1K D=32 | Cosine | 50 | 1.000 | 100% |
| 1K D=128 | Cosine | 50 | 0.989 | 100% |

Key observations:
- 100% reachability across all configurations (heuristic neighbor selection, RFC-002-03)
- Recall degrades gracefully with dimensionality, recoverable by increasing ef_search
- Cosine and L2 show comparable recall at the same parameters
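
For readers unfamiliar with the Recall@10 column, the metric can be sketched as the overlap between the approximate result and the exact k nearest neighbors. This is a minimal illustration, not the project's test code: the function name `recall_at_k` and the use of `u32` IDs are assumptions, and it assumes each result list contains unique IDs ordered nearest first.

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k neighbors that also appear in the
/// approximate top-k result. Both slices are hypothetical ID lists,
/// ordered nearest first, with no duplicate IDs.
fn recall_at_k(approx: &[u32], exact: &[u32], k: usize) -> f64 {
    // Ground-truth set: the first k exact neighbors.
    let truth: HashSet<u32> = exact.iter().take(k).copied().collect();
    if truth.is_empty() {
        return 0.0;
    }
    // Count approximate results that are true neighbors.
    let hits = approx
        .iter()
        .take(k)
        .copied()
        .filter(|id| truth.contains(id))
        .count();
    hits as f64 / truth.len() as f64
}
```

Under this definition, a Recall@10 of 0.977 would mean that, averaged over queries, roughly 9.8 of the 10 true neighbors are returned.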

| Configuration | Recall@10 |
|---|---|
| 1K D=32, alpha=1.0 (pure semantic) | 1.000 |
| 1K D=32, Range[25K,75K], alpha=1.0 | 1.000 |
| 1K D=32, alpha=0.5 (mixed) | 0.791 |

Lower recall at alpha=0.5 is expected: the composite distance then weights temporal proximity equally with semantic similarity, so temporally closer results displace some of the semantic nearest neighbors, trading recall for temporal relevance.

| Alpha | Avg Temporal Distance |
|---|---|
| 1.00 (pure semantic) | 115,400 µs |
| 0.75 | 85,900 µs |
| 0.50 | 85,900 µs |
| 0.25 | 44,000 µs |
| 0.00 (pure temporal) | 32,100 µs |

As alpha decreases, results become temporally closer to the query timestamp, validating the composite distance formula $d_{ST} = \alpha \cdot d_{sem} + (1-\alpha) \cdot d_{time}$.
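
The composite distance formula can be written out as a short sketch. The function name `composite_distance` and the clamping of alpha are illustrative assumptions, not the crate's actual API, and the sketch assumes both input distances have already been normalized to comparable scales.

```rust
/// Composite spatio-temporal distance:
/// d_ST = alpha * d_sem + (1 - alpha) * d_time.
/// Hypothetical helper; alpha is clamped to [0, 1] as a defensive assumption.
fn composite_distance(d_sem: f64, d_time: f64, alpha: f64) -> f64 {
    let alpha = alpha.clamp(0.0, 1.0);
    alpha * d_sem + (1.0 - alpha) * d_time
}
```

With alpha=1.0 the result equals the semantic distance alone, and with alpha=0.0 it equals the temporal distance alone, matching the "pure semantic" and "pure temporal" rows in the table.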

| Dimension | Latency | Throughput |
|---|---|---|
| D=32 | 4.5 ns | 222M pairs/sec |
| D=128 | 14.7 ns | 68M pairs/sec |
| D=256 | 33.3 ns | 30M pairs/sec |
| D=768 | 128.7 ns | 7.8M pairs/sec |
| D=1536 | 305.1 ns | 3.3M pairs/sec |


| Dimension | Latency | Throughput |
|---|---|---|
| D=32 | 10.4 ns | 96M pairs/sec |
| D=128 | 35.3 ns | 28M pairs/sec |
| D=256 | 87.2 ns | 11M pairs/sec |
| D=768 | 328.6 ns | 3.0M pairs/sec |
| D=1536 | 787.1 ns | 1.3M pairs/sec |


| Dimension | Latency | Throughput |
|---|---|---|
| D=32 | 3.9 ns | 256M pairs/sec |
| D=128 | 11.7 ns | 85M pairs/sec |
| D=256 | 28.0 ns | 36M pairs/sec |
| D=768 | 110.9 ns | 9.0M pairs/sec |
| D=1536 | 266.8 ns | 3.7M pairs/sec |

10K cosine pairs at D=768: 3.39 ms (2.95M pairs/sec batch throughput)
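
The throughput figures follow directly from the measured latencies. The sketch below shows the arithmetic. Both helper names are hypothetical and are not part of the benchmark harness.

```rust
/// Pairs per second implied by a per-pair latency in nanoseconds.
/// Hypothetical helper for checking the tables above.
fn pairs_per_sec_from_latency_ns(latency_ns: f64) -> f64 {
    1.0e9 / latency_ns
}

/// Pairs per second for a batch of `pairs` distance computations
/// that took `elapsed_ms` milliseconds in total.
fn batch_pairs_per_sec(pairs: u64, elapsed_ms: f64) -> f64 {
    pairs as f64 / (elapsed_ms / 1.0e3)
}
```

For example, a 4.5 ns latency corresponds to about 222M pairs/sec, and 10,000 pairs in 3.39 ms corresponds to about 2.95M pairs/sec.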

| Vectors | Memory | Bytes/Vector |
|---|---|---|
| 1,000 | 2,016 B | 2.016 |
| 10,000 | 8,208 B | 0.821 |
| 100,000 | 16,408 B | 0.164 |

Sub-byte per vector at scale — Roaring Bitmap compression becomes more effective with larger cardinality.

| Configuration | Compression Ratio |
|---|---|
| D=768, M=8, K=256 | 384× |

- Original: 3,072 bytes/vector
- PQ code: 8 bytes/vector

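
The 384× figure can be reproduced with a few lines of arithmetic. This sketch assumes `f32` vector components and that each of the M sub-quantizer codes is stored in whole bytes (one byte per code for K=256). The function names are illustrative and not taken from the codebase.

```rust
/// Size of an uncompressed vector, assuming f32 components.
fn original_bytes(dim: usize) -> usize {
    dim * std::mem::size_of::<f32>()
}

/// Size of a PQ code: m sub-quantizer indices, each wide enough to
/// address k centroids and rounded up to whole bytes (an assumption).
fn pq_code_bytes(m: usize, k: usize) -> usize {
    assert!(k >= 2, "need at least two centroids per sub-quantizer");
    // ceil(log2(k)) bits are needed to index k centroids.
    let bits_per_code = (usize::BITS - (k - 1).leading_zeros()) as usize;
    m * ((bits_per_code + 7) / 8)
}

/// Ratio of uncompressed size to PQ code size.
fn pq_compression_ratio(dim: usize, m: usize, k: usize) -> f64 {
    original_bytes(dim) as f64 / pq_code_bytes(m, k) as f64
}
```

For D=768, M=8, K=256 this gives 3,072 bytes against 8 bytes, that is, a ratio of 384.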

| Scenario | Result |
|---|---|
| Stationary (200 pts, D=3) | 0 false positives |
| 1 planted CP (200 pts) | Detected, severity=1.000 |
| 3 planted CPs (200 pts) | Precision=1.0, Recall=1.0, F1=1.0 |

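
One way to score detected change points against planted ones is sketched below. The matching tolerance, the function name `change_point_scores`, and the convention that precision and recall are 1.0 when their denominator is empty are all assumptions made for illustration. The actual test suite may score detections differently.

```rust
/// Precision, recall, and F1 for detected change points against planted ones.
/// A detection matches a planted point if it lies within `tolerance` indices,
/// and each detection can match at most one planted point.
fn change_point_scores(
    detected: &[usize],
    planted: &[usize],
    tolerance: usize,
) -> (f64, f64, f64) {
    let mut used = vec![false; detected.len()];
    let mut true_positives = 0usize;
    for &p in planted {
        // First unused detection close enough to this planted point.
        let hit = (0..detected.len())
            .find(|&i| !used[i] && detected[i].abs_diff(p) <= tolerance);
        if let Some(i) = hit {
            used[i] = true;
            true_positives += 1;
        }
    }
    let tp = true_positives as f64;
    // Assumed convention: an empty denominator yields a perfect score.
    let precision = if detected.is_empty() { 1.0 } else { tp / detected.len() as f64 };
    let recall = if planted.is_empty() { 1.0 } else { tp / planted.len() as f64 };
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    (precision, recall, f1)
}
```

Under this scoring, three detections that each land on one of three planted change points give precision, recall, and F1 of 1.0.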

| Metric | Value |
|---|---|
| Stationary FPR (500 obs) | 0.0000 |
| Detection latency | 0 observations |


| Process | Hurst Exponent | Expected | ADF Statistic |
|---|---|---|---|
| Brownian motion (n=2000) | 0.542 | ≈0.5 | -3.015 |
| OU process (n=2000, θ=0.3) | 0.589 | <0.5 | -22.227 |


| Crate | Tests | Coverage |
|---|---|---|
| cvx-core | 47 | Types, traits, config, errors |
| cvx-index | 81 | HNSW, distance, temporal, concurrent |
| cvx-storage | 71 | Memory, hot, warm, cold, WAL, tiered |
| cvx-analytics | 79 | ODE, PELT, BOCPD, calculus, ML |
| cvx-query | 11 | 8 query types |
| cvx-ingest | 20 | Validation pipeline |
| cvx-api | 18 | REST integration (all endpoints) |
| cvx-python | 22 | PyO3 bindings (pytest) |
| Total | 349 | - |


```bash
# Distance kernel benchmarks (Criterion)
cargo bench -p cvx-index --bench distance_kernels

# HNSW search benchmarks (Criterion)
cargo bench -p cvx-index --bench hnsw_search

# Analytics benchmarks (Criterion)
cargo bench -p cvx-analytics --bench analytics_bench

# Recall & metrics report
cargo test -p cvx-index --test metrics_report -- --nocapture
cargo test -p cvx-analytics --test metrics_report -- --nocapture
```