Semantic Regions

HNSW builds a multi-level graph in which each successive level has fewer, more “central” nodes. These hubs emerge naturally and act as unsupervised cluster centroids. CVX assigns every node to its nearest hub via greedy descent, which gives $\sim N/M^L$ regions at level $L$.
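To see where the $\sim N/M^L$ estimate comes from, the sketch below simulates level sampling and counts how many nodes reach a given level. It assumes the standard HNSW rule (top level drawn as $\lfloor -\ln(u) \cdot m_L \rfloor$ with $m_L = 1/\ln M$); CVX's internal level assignment is not shown in this page and may differ. The names `sample_level` and `count_hubs` are hypothetical helpers, not part of `chronos_vector`.

```python
import math
import random


def sample_level(rng, m):
    """Draw a node's top level with the standard HNSW rule (assumed here)."""
    m_l = 1.0 / math.log(m)   # level normalization factor m_L = 1 / ln(M)
    u = 1.0 - rng.random()    # uniform in (0, 1], avoids log(0)
    return int(-math.log(u) * m_l)


def count_hubs(n, m, level, seed=0):
    """Count simulated nodes whose top level is at least `level`."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n) if sample_level(rng, m) >= level)


# Compare the simulated hub count with the N / M^L estimate.
n, m = 100_000, 4
for level in range(1, 4):
    simulated = count_hubs(n, m, level)
    estimate = n / m**level
    print(f"level {level}: simulated={simulated}, N/M^L={estimate:.0f}")
```

Under this rule a node reaches level $L$ with probability $M^{-L}$, so the expected number of hubs (and therefore regions) at level $L$ is $N/M^L$.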

💡 No training required
Unlike k-means or DBSCAN, HNSW regions emerge from index construction. No hyperparameter tuning, no iterative optimization. O(N) single-pass assignment.
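The assignment step can be illustrated with a toy, single-layer greedy search. This is a minimal sketch under stated assumptions, not CVX's implementation: `greedy_nearest_hub`, `hub_vectors`, and `hub_neighbors` are hypothetical names, the hub graph is hand-built, and distance is plain Euclidean.

```python
import math


def greedy_nearest_hub(query, hub_vectors, hub_neighbors, entry):
    """Walk the hub graph, always moving to the neighbor closest to `query`.

    Stops when no neighbor of the current hub is closer than the hub itself.
    """
    current = entry
    current_dist = math.dist(query, hub_vectors[current])
    while True:
        best, best_dist = current, current_dist
        for nb in hub_neighbors[current]:
            d = math.dist(query, hub_vectors[nb])
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:
            return current
        current, current_dist = best, best_dist


# Toy layer: four hubs on a line, each linked to its immediate neighbors.
hub_vectors = {0: (0.0, 0.0), 1: (10.0, 0.0), 2: (20.0, 0.0), 3: (30.0, 0.0)}
hub_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Single pass: each point is assigned to the hub the greedy walk ends on.
points = [(1.0, 0.5), (19.0, -1.0), (31.0, 0.0)]
regions = [greedy_nearest_hub(p, hub_vectors, hub_neighbors, entry=0) for p in points]
print(regions)
```

Greedy search returns a local optimum of the graph. On a well-connected layer this is usually the true nearest hub, but that is not guaranteed.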

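```python
import chronos_vector as cvx
import numpy as np

np.random.seed(42)
index = cvx.TemporalIndex(m=4, ef_construction=50, ef_search=50)

# Three synthetic clusters, 10 entities each, 20 daily observations per entity.
for cluster in range(3):
    center = np.random.randn(16).astype(np.float32) * 2
    for entity in range(10):
        for t in range(20):
            # Drift grows with time; the last term adds per-point noise.
            drift = np.random.randn(16).astype(np.float32) * 0.1 * t
            vec = center + drift + np.random.randn(16).astype(np.float32) * 0.3
            # Entity IDs encode the cluster (cluster * 100 + entity); timestamps are in seconds.
            index.insert(cluster * 100 + entity, t * 86400, vec.tolist())

print(f"Index: {len(index)} points")
```

```
Index: 600 points
```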
```python
assignments = index.region_assignments(level=1)

# Report the three largest regions and how cleanly each maps to one synthetic cluster.
for hub_id, members in sorted(assignments.items(), key=lambda x: -len(x[1]))[:3]:
    clusters = {}
    for eid, ts in members:
        # eid // 100 recovers the cluster encoded in the entity ID.
        clusters[eid // 100] = clusters.get(eid // 100, 0) + 1
    dominant = max(clusters.values())
    print(f" Hub {hub_id}: {len(members)} members, purity={dominant/len(members):.0%}")
```

```
 Hub 38: 32 members, purity=100%
 Hub 161: 28 members, purity=100%
 Hub 301: 26 members, purity=100%
```

Track membership evolution with exponential smoothing: $\mathbf{s}(t) = \alpha \cdot \mathbf{c}(t) + (1-\alpha) \cdot \mathbf{s}(t-1)$
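The recurrence can be sketched in a few lines of plain Python. This is an illustration only: it assumes $\mathbf{c}(t)$ is a vector of per-region membership counts (or proportions) at step $t$ and initializes $\mathbf{s}(0) = \mathbf{c}(0)$, neither of which is specified above. `smooth_memberships` is a hypothetical helper, not a `chronos_vector` function.

```python
def smooth_memberships(counts, alpha):
    """Apply s(t) = alpha * c(t) + (1 - alpha) * s(t-1) to a sequence of vectors.

    counts: list of equal-length lists, one per time step (assumed to be c(t)).
    alpha:  smoothing factor in (0, 1]; higher values react faster to change.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not counts:
        return []
    state = list(counts[0])  # assumption: s(0) = c(0)
    smoothed = [state]
    for c in counts[1:]:
        state = [alpha * ci + (1.0 - alpha) * si for ci, si in zip(c, state)]
        smoothed.append(state)
    return smoothed


# Two regions: all members start in region 0, then move to region 1.
history = [[10, 0], [0, 10], [0, 10]]
for step, s in enumerate(smooth_memberships(history, alpha=0.5)):
    print(step, s)
```

With `alpha=0.5` the smoothed vector moves halfway toward the new observation at each step, so an abrupt switch between regions shows up as a gradual transition.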

Compare region distributions with geometry-aware metrics:

| Metric | Formula | Range |
|---|---|---|
| Fisher-Rao | $2\arccos\left(\sum_i \sqrt{p_i q_i}\right)$ | $[0, \pi]$ |
| Hellinger | $\frac{1}{\sqrt{2}}\sqrt{\sum_i (\sqrt{p_i}-\sqrt{q_i})^2}$ | $[0, 1]$ |
| Wasserstein | Optimal transport with region centroids | $[0, \infty)$ |
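The two closed-form metrics can be computed directly from their formulas. The sketch below is a plain-Python illustration with hypothetical function names; it is not the `chronos_vector` API. It assumes `p` and `q` are region distributions of equal length that each sum to 1. Wasserstein is left out because it needs the region centroids and an optimal-transport solver.

```python
import math


def normalize(counts):
    """Turn raw region counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def fisher_rao(p, q):
    """Fisher-Rao distance: 2 * arccos(sum_i sqrt(p_i * q_i)), in [0, pi]."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to [0, 1] so rounding error cannot push acos out of its domain.
    return 2.0 * math.acos(min(1.0, max(0.0, bc)))


def hellinger(p, q):
    """Hellinger distance: sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2) / sqrt(2), in [0, 1]."""
    sq = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(sq) / math.sqrt(2.0)


# Region counts for one entity over two time windows (made-up numbers).
p = normalize([8, 2, 0])
q = normalize([2, 2, 6])
print(f"Fisher-Rao: {fisher_rao(p, q):.3f}")
print(f"Hellinger:  {hellinger(p, q):.3f}")
```

Both metrics are 0 for identical distributions and reach their maximum ($\pi$ and 1 respectively) when the two distributions share no regions.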