Struct HnswGraph

Source
pub struct HnswGraph<D: DistanceMetric> {
    config: HnswConfig,
    metric: D,
    nodes: Vec<HnswNode>,
    entry_point: Option<u32>,
    max_level: usize,
    rng: SmallRng,
    sq_codes: Option<Vec<Vec<u8>>>,
    sq_params: (f32, f32),
}
Expand description

HNSW graph index for approximate nearest neighbor search.

Optionally stores scalar-quantized codes (uint8) for accelerated distance computation during construction and search. When enabled, candidate exploration uses fast integer distances on codes, while final neighbor selection uses exact float32 distances for quality. See RFC-005 §3.

Fields§

§config: HnswConfig§metric: D§nodes: Vec<HnswNode>§entry_point: Option<u32>§max_level: usize§rng: SmallRng§sq_codes: Option<Vec<Vec<u8>>>

Scalar-quantized codes: node_id → uint8 code (same dim as vectors). When Some, distance_fast uses integer arithmetic (~4× faster).

§sq_params: (f32, f32)

Quantization parameters: (min_val, scale) for encoding/decoding.

Implementations§

Source§

impl<D: DistanceMetric> HnswGraph<D>

Source

pub fn new(config: HnswConfig, metric: D) -> Self

Create a new empty HNSW graph.

Source

pub fn enable_scalar_quantization(&mut self, min_val: f32, max_val: f32)

Enable scalar quantization for accelerated distance computation.

When enabled, each inserted vector is also encoded as uint8 and candidate distances during search_layer use fast integer arithmetic. Final neighbor selection still uses exact float32 distances.

For L2-normalized embeddings (range [-1, 1]), use default parameters. For unnormalized data, provide the expected min/max range.

Source

pub fn disable_scalar_quantization(&mut self)

Disable scalar quantization (revert to exact distances only).

Source

pub fn is_quantized(&self) -> bool

Whether scalar quantization is active.

Source

fn encode_sq(vector: &[f32], min_val: f32, scale: f32) -> Vec<u8>

Encode a vector to uint8 using scalar quantization.

Source

fn distance_sq(a: &[u8], b: &[u8]) -> f32

Fast L2 distance on uint8 codes (auto-vectorized by LLVM).

Source

pub fn len(&self) -> usize

Number of vectors in the index.

Source

pub fn is_empty(&self) -> bool

Whether the index is empty.

Source

pub fn set_ef_construction(&mut self, ef: usize)

Set ef_construction at runtime (e.g., lower for bulk load, higher for incremental).

Set ef_search at runtime.

Source

pub(crate) fn random_level(&mut self) -> usize

Generate a random level for a new node.

Capped at 32 (supports up to M^32 ≈ 10^38 nodes). See RFC-002-07.

Source

fn max_neighbors(&self, level: usize) -> usize

Max neighbors allowed at a given level.

Source

pub(crate) fn push_node(&mut self, vector: &[f32], level: usize)

Allocate a node without connecting it (used by bulk_insert_parallel).

Source

pub(crate) fn connect_node( &mut self, id: u32, candidates: &[(u32, f32)], level: usize, )

Connect a pre-allocated node using pre-computed neighbor candidates.

Source

pub fn insert(&mut self, id: u32, vector: &[f32])

Insert a vector into the index.

The id should be a unique sequential identifier starting from 0. Panics if id != self.len() (must insert in order).

Source

fn greedy_closest(&self, start: u32, query: &[f32], level: usize) -> u32

Greedy search for the single closest node at a given level.

Source

fn search_layer( &self, entry: u32, query: &[f32], ef: usize, level: usize, ) -> Vec<(u32, f32)>

Beam search at a single level. Returns candidates sorted by distance (ascending).

When scalar quantization is enabled, candidate exploration uses fast uint8 distances. Final results are re-ranked with exact float32 distances.

Source

fn prune_neighbors(&mut self, node_id: u32, level: usize, max_n: usize)

Prune a node’s neighbor list to keep only the best max_n.

Uses heuristic selection when enabled (diverse directions), otherwise keeps the closest max_n by distance.

Source

pub fn search(&self, query: &[f32], k: usize) -> Vec<(u32, f32)>

Search for the k nearest neighbors of query.

Returns a Vec of (node_id, distance) sorted by distance ascending.

Source

pub fn search_filtered( &self, query: &[f32], k: usize, filter: impl Fn(u32) -> bool, ) -> Vec<(u32, f32)>

Search with a predicate filter.

Like search, but only returns nodes where filter(node_id) is true. The HNSW graph is still traversed through filtered-out nodes (they act as bridges), but only matching nodes appear in the results.

Source

pub fn vector(&self, node_id: u32) -> &[f32]

Get the stored vector for a node.

Source

pub fn config(&self) -> &HnswConfig

Get the configuration.

Source

pub fn entry_point(&self) -> Option<u32>

Get the entry point node ID.

Source

pub fn max_level(&self) -> usize

Get the maximum level in the graph.

Source

pub fn neighbors_at_level(&self, node_id: u32, level: usize) -> &[u32]

Get the neighbor list for a node at a specific level (public accessor).

Source

pub fn distance_to(&self, node_id: u32, query: &[f32]) -> f32

Compute distance between a stored node and a query vector (public accessor).

Source

pub fn all_neighbors(&self, node_id: u32) -> Vec<Vec<u32>>

Get all neighbor lists for a node (one per level).

Source

pub fn nodes_at_level(&self, level: usize) -> Vec<u32>

Return node IDs present at the given HNSW level (RFC-004).

These are the natural “hub” nodes of the graph hierarchy. Level 0 = all nodes, higher levels = fewer, more connected hubs. Count follows geometric distribution: ~N/M^level.

Source

pub fn assign_region(&self, vector: &[f32], level: usize) -> Option<u32>

Assign a vector to its nearest hub at the given level (RFC-004).

Uses greedy descent from the entry point — O(log N). Returns the node_id of the nearest hub at that level.

Source

fn distance(&self, node_id: u32, query: &[f32]) -> f32

Compute distance between a stored node and a query vector.

When scalar quantization is enabled, uses fast uint8 distances for candidate exploration (~4× faster). Falls back to exact float32 distance when SQ is disabled.

Source

fn distance_fast( &self, node_id: u32, query_code: Option<&[u8]>, query: &[f32], ) -> f32

Fast approximate distance using scalar-quantized codes.

Returns the exact distance if SQ is not enabled.

Source

fn neighbors_at(&self, node_id: u32, level: usize) -> &[u32]

Get the neighbor list for a node at a given level.

Source

pub fn count_reachable(&self) -> usize

Check graph invariant: all nodes are reachable from entry point at level 0.

Returns the number of reachable nodes. Should equal self.len().

Source

pub fn brute_force_knn(&self, query: &[f32], k: usize) -> Vec<(u32, f32)>

Brute-force kNN for ground truth comparison.

Returns (node_id, distance) sorted ascending. O(N) per query.

Source§

impl<D: DistanceMetric> HnswGraph<D>

Source

pub(crate) fn to_snapshot(&self) -> HnswSnapshot

Create a serializable snapshot (excludes metric and RNG).

Source

pub(crate) fn from_snapshot(snapshot: HnswSnapshot, metric: D) -> Self

Restore from a snapshot, providing the metric.

Auto Trait Implementations§

§

impl<D> Freeze for HnswGraph<D>
where D: Freeze,

§

impl<D> RefUnwindSafe for HnswGraph<D>
where D: RefUnwindSafe,

§

impl<D> Send for HnswGraph<D>

§

impl<D> Sync for HnswGraph<D>

§

impl<D> Unpin for HnswGraph<D>
where D: Unpin,

§

impl<D> UnwindSafe for HnswGraph<D>
where D: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
§

impl<T> Pointable for T

§

const ALIGN: usize

The alignment of pointer.
§

type Init = T

The type for initializers.
§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

§

fn vzip(self) -> V