RFC-005: Query Capabilities & Ingestion Performance

See full RFC: design/CVX_RFC_005_Query_Capabilities.md

Summary

Foundational improvements to make CVX a practical temporal vector database:

Batch ingestion (P0) — bulk_insert() with NumPy arrays, target ≥5,000 pts/sec
Region members query (P1) — “which points belong to region R in time range T1-T2?”
Temporal neighbors (P2) — “who were entity X’s neighbors at each timestep?”
Trajectory similarity search (P2) — “find entities with similar temporal evolution”
Configurable ef_construction (P0) — tune build-time quality vs speed
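
The three query types above can be illustrated with a toy, in-memory sketch. This is not the CVX API: the record layout and the function names `region_members`, `temporal_neighbors`, and `most_similar_trajectory` are hypothetical, and the brute-force scans stand in for whatever index-backed implementation the full RFC specifies. Trajectory similarity is approximated here as the mean pointwise distance over shared timesteps, which is only one of several possible definitions.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy records: (entity_id, timestamp, region_id, vector).
RECORDS = [
    ("a", 1, "r1", (0.0, 0.0)),
    ("b", 1, "r1", (0.1, 0.0)),
    ("c", 1, "r2", (5.0, 5.0)),
    ("a", 2, "r1", (0.2, 0.0)),
    ("b", 2, "r2", (4.0, 4.0)),
    ("c", 2, "r1", (0.5, 0.0)),
]

def region_members(records, region, t_start, t_end):
    """Entities assigned to `region` at any timestamp in [t_start, t_end]."""
    return sorted({e for e, t, r, _ in records
                   if r == region and t_start <= t <= t_end})

def _by_time(records):
    """Group vectors as {timestamp: {entity: vector}}."""
    grouped = defaultdict(dict)
    for e, t, _, v in records:
        grouped[t][e] = v
    return grouped

def temporal_neighbors(records, entity, k=1):
    """For each timestep, the k nearest other entities to `entity`."""
    result = {}
    for t, points in sorted(_by_time(records).items()):
        if entity not in points:
            continue
        ranked = sorted((dist(points[entity], v), e)
                        for e, v in points.items() if e != entity)
        result[t] = [e for _, e in ranked[:k]]
    return result

def most_similar_trajectory(records, entity):
    """Entity with the smallest mean pointwise distance to `entity`
    over the timesteps both entities share."""
    scores = defaultdict(list)
    for points in _by_time(records).values():
        if entity not in points:
            continue
        for e, v in points.items():
            if e != entity:
                scores[e].append(dist(points[entity], v))
    return min(scores, key=lambda e: sum(scores[e]) / len(scores[e]))

members = region_members(RECORDS, "r1", 1, 2)
neighbors = temporal_neighbors(RECORDS, "a", k=1)
closest = most_similar_trajectory(RECORDS, "a")
```

In this toy data, entity `a`'s nearest neighbor changes between timesteps, which is the kind of answer a snapshot-only nearest-neighbor query cannot give.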

Why

Current gap	Impact
450 pts/sec ingestion	About 37 min to ingest 1M records, too slow for practical use
No region query	Regions are opaque, can’t inspect cluster contents
No trajectory comparison	Can’t find “entities that evolved like X”
Fixed ef_construction=200	Wastes computation during bulk load
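
The ingestion figures in the table follow from simple arithmetic. The short sketch below reproduces the 37-minute number and shows what the ≥5,000 pts/sec target would mean for the same 1M-record load; the helper name `ingest_minutes` is illustrative only.

```python
def ingest_minutes(n_records: int, pts_per_sec: float) -> float:
    """Wall-clock minutes to ingest n_records at a steady throughput."""
    return n_records / pts_per_sec / 60

current = ingest_minutes(1_000_000, 450)    # roughly 37 minutes
target = ingest_minutes(1_000_000, 5_000)   # roughly 3.3 minutes
speedup = 5_000 / 450                       # roughly 11x

print(f"current: {current:.1f} min, target: {target:.1f} min, speedup: {speedup:.1f}x")
```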

None of Qdrant, Milvus, or Weaviate offers temporal-native operations. That is CVX's differentiator, but only if basic ingestion and querying are fast enough to be usable.