RFC-005: Query Capabilities & Ingestion Performance
See full RFC: design/CVX_RFC_005_Query_Capabilities.md
Summary
Foundational improvements to make CVX a practical temporal vector database:
- Batch ingestion (P0) — bulk_insert() with NumPy arrays, target ≥5,000 pts/sec
- Region members query (P1) — “which points belong to region R in time range T1-T2?”
- Temporal neighbors (P2) — “who were entity X’s neighbors at each timestep?”
- Trajectory similarity search (P2) — “find entities with similar temporal evolution”
- Configurable ef_construction (P0) — tune build-time quality vs speed
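
To make the intended semantics of the region members and temporal neighbors queries concrete, here is a brute-force NumPy sketch over a toy in-memory dataset. It is not the CVX API: the function names `region_members` and `temporal_neighbors`, the array layout, and the sample data are all assumptions chosen for illustration, and a real implementation would use the index rather than linear scans.

```python
import numpy as np

# Hypothetical toy data: 3 entities observed at 2 timesteps.
# Each point has an entity id, a timestamp, a region id, and a 2-D vector.
entity = np.array([0, 1, 2, 0, 1, 2])
ts = np.array([10, 10, 10, 20, 20, 20])
region = np.array([0, 0, 1, 0, 1, 1])
vecs = np.array([
    [0.0, 0.0], [0.1, 0.0], [5.0, 5.0],   # t = 10
    [0.0, 0.1], [4.9, 5.0], [5.0, 5.1],   # t = 20
])

def region_members(region_id, t1, t2):
    """Entity ids with at least one point in `region_id` during [t1, t2]."""
    mask = (region == region_id) & (ts >= t1) & (ts <= t2)
    return np.unique(entity[mask]).tolist()

def temporal_neighbors(entity_id, k=1):
    """For each timestep, the k nearest other entities to `entity_id`."""
    result = {}
    for t in np.unique(ts):
        at_t = ts == t
        query = vecs[at_t & (entity == entity_id)]
        if len(query) == 0:
            continue  # entity not observed at this timestep
        others = at_t & (entity != entity_id)
        dists = np.linalg.norm(vecs[others] - query[0], axis=1)
        nearest = np.argsort(dists)[:k]
        result[int(t)] = entity[others][nearest].tolist()
    return result

members = region_members(region_id=1, t1=20, t2=20)
neighbors = temporal_neighbors(entity_id=1, k=1)
print(members)    # entities in region 1 at t = 20
print(neighbors)  # entity 1's nearest neighbor at each timestep
```

In this toy data, entity 1 starts next to entity 0 and later moves next to entity 2, so its nearest neighbor changes between timesteps. That per-timestep change is the kind of answer the temporal neighbors query is meant to return.
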
| Current gap | Impact |
|---|---|
| 450 pts/sec ingestion | 37 min for 1M records — unusable |
| No region query | Regions are opaque, can’t inspect cluster contents |
| No trajectory comparison | Can’t find “entities that evolved like X” |
| Fixed ef_construction=200 | Wastes computation during bulk load |
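
The ingestion row can be checked with simple arithmetic. The sketch below reproduces the 37-minute figure from the current 450 pts/sec rate and shows what the ≥5,000 pts/sec target would mean for the same 1M-record load. The helper name `load_minutes` is an assumption for illustration, and the calculation assumes a constant ingestion rate.

```python
def load_minutes(n_records, pts_per_sec):
    """Minutes needed to ingest n_records at a constant rate."""
    return n_records / pts_per_sec / 60

current = load_minutes(1_000_000, 450)    # about 37 minutes
target = load_minutes(1_000_000, 5_000)   # about 3.3 minutes
speedup = 5_000 / 450                     # about 11x

print(f"current: {current:.1f} min, target: {target:.1f} min, speedup: {speedup:.1f}x")
```
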
None of Qdrant, Milvus, or Weaviate have temporal-native operations. That’s our differentiator — but only if the basics work fast.