
Technical Product Requirements

Version: 1.0
Author: Manuel Couto Pintos
Date: March 2026
Status: Draft


ChronosVector is a temporal vector database written in Rust that treats time as a geometric dimension of the embedding space, enabling spatiotemporal search, semantic drift analysis, and prediction of vector trajectories.

| User Persona | Use Case | Primary Value |
| --- | --- | --- |
| ML Engineer | Monitor embedding drift in production | Alerts when a model degrades due to a shift in the data distribution |
| NLP Researcher | Study diachronic semantic evolution | Trajectory queries, temporal analogy, rate-of-change analysis |
| Recommender System Developer | Predict where user interests are heading | Extrapolation of user vectors via Neural ODE |
| Knowledge Graph Engineer | Temporal knowledge graph completion | Quadruples (entity, relation, entity, time) indexed natively |
| Data Scientist | Exploratory analysis of embedding series | Change-point detection, drift quantification, cohort divergence |
  • Not a general-purpose vector database. CVX does not compete with Qdrant/Milvus/Pinecone on pure kNN without a temporal component. Users who do not need time should use Qdrant.
  • Not a streaming platform. CVX receives vectors; it does not produce them. It does not include embedding models.
  • Not a model training framework. The Neural ODE is trained externally or with an auxiliary module; CVX is primarily serving infrastructure (inference + storage).
  • Not a distributed database (initially). Phases 1-4 are single-node. Distribution is Phase 5.

| Attribute | Specification |
| --- | --- |
| Input | (entity_id: u64, timestamp: i64, vector: [f32; D], metadata: Map<String, Value>) |
| Protocols | REST (batch), gRPC (bidirectional stream) |
| Throughput target | ≥ 50,000 vectors/second (single node, D=768) |
| Latency target | p99 < 5 ms per vector (streaming mode) |
| Durability | Write-ahead log fsynced before ack. No data loss on crash. |
| Validation | Dimension consistency per entity, timestamps strictly increasing per entity, vector norm finite |
| Delta encoding | Automatic. Keyframe every K = 10 updates. Configurable threshold ε. |
| Idempotency | Re-inserting the same (entity_id, timestamp) is a no-op if the vector hash matches; an update otherwise. |
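The keyframe-plus-delta scheme above can be sketched as follows. This is a minimal illustration, assuming element-wise deltas against the previous point; the `Stored` type and function names are hypothetical, not the shipped storage layout.

```rust
/// Keyframe interval from the spec.
const K: usize = 10;

/// Hypothetical on-disk representation for one entity's history:
/// every K-th update is a full vector, the rest are deltas.
enum Stored {
    Keyframe(Vec<f32>),
    Delta(Vec<f32>), // v[t] - v[t-1]
}

fn encode(history: &[Vec<f32>]) -> Vec<Stored> {
    let mut out = Vec::with_capacity(history.len());
    for (i, v) in history.iter().enumerate() {
        if i % K == 0 {
            out.push(Stored::Keyframe(v.clone()));
        } else {
            let prev = &history[i - 1];
            let delta: Vec<f32> = v.iter().zip(prev).map(|(a, b)| a - b).collect();
            out.push(Stored::Delta(delta));
        }
    }
    out
}

/// Transparent reconstruction: replay deltas forward from the keyframe,
/// so callers always receive full vectors.
fn decode(stored: &[Stored]) -> Vec<Vec<f32>> {
    let mut out: Vec<Vec<f32>> = Vec::with_capacity(stored.len());
    for s in stored {
        match s {
            Stored::Keyframe(v) => out.push(v.clone()),
            Stored::Delta(d) => {
                let prev = out.last().expect("delta before any keyframe");
                let full: Vec<f32> = prev.iter().zip(d).map(|(a, b)| a + b).collect();
                out.push(full);
            }
        }
    }
    out
}
```

The configurable threshold ε would additionally force a keyframe when a delta grows too large, bounding reconstruction error under lossy compression.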
| Attribute | Specification |
| --- | --- |
| Input | (query_vector, k, timestamp, metric, alpha) |
| Output | Top-k results sorted by combined spatiotemporal distance |
| Latency target | p99 < 10 ms (1M vectors, D=768) |
| Recall target | recall@10 ≥ 95% |
| Metrics supported | Cosine, L2, Dot Product, Poincaré (hyperbolic) |
| Alpha range | [0.0, 1.0], where 0.0 = pure temporal and 1.0 = pure semantic |
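One way to read the alpha parameter is as a convex blend of the semantic metric with a normalized temporal distance. This is a sketch under that assumption; the exact weighting and the `time_scale` normalization constant are hypothetical, not the specified formula.

```rust
/// Illustrative combined spatiotemporal distance.
/// alpha = 1.0 -> pure semantic, alpha = 0.0 -> pure temporal.
fn combined_distance(
    semantic: f32,   // e.g. cosine distance between query and candidate
    dt_seconds: f32, // |query_time - candidate_time|
    time_scale: f32, // hypothetical normalization constant (tuning knob)
    alpha: f32,      // blend weight in [0.0, 1.0]
) -> f32 {
    let temporal = dt_seconds / time_scale;
    alpha * semantic + (1.0 - alpha) * temporal
}
```

At alpha = 1.0 the temporal term vanishes and the result reduces to the plain semantic metric, which is exactly the baseline used in the risk-mitigation benchmark (vanilla HNSW behavior).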
| Attribute | Specification |
| --- | --- |
| Input | (query_vector, k, time_range: [t1, t2], metric, alpha) |
| Output | Top-k results within time range |
| Behavior | Pre-filter by Roaring Bitmap, then search within the valid set |
| Attribute | Specification |
| --- | --- |
| Input | (entity_id, time_range: [t1, t2]) |
| Output | Ordered sequence of TemporalPoint for that entity within range |
| Reconstruction | Transparent delta decoding. Caller receives full vectors. |
| Latency | Proportional to trajectory length. Target: < 1 ms + 0.1 ms per point. |
| Attribute | Specification |
| --- | --- |
| Input | (entity_id, timestamp) |
| Output | Velocity vector ∂v/∂t and optionally acceleration ∂²v/∂t² |
| Method | Finite differences over stored deltas (first order: central difference; second order: second central difference) |
| Edge case | If the entity has < 2 points: return an InsufficientData error |
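The central-difference method can be sketched per dimension as below. This is a minimal illustration, assuming one-sided differences at the trajectory endpoints; the function name and signature are hypothetical.

```rust
/// First-order velocity at index i of a stored trajectory:
/// central difference in the interior, one-sided at the ends.
/// (Fewer than two points would be the spec's InsufficientData error.)
fn velocity(times: &[f64], points: &[Vec<f64>], i: usize) -> Vec<f64> {
    let (lo, hi) = if i == 0 {
        (0, 1)            // forward difference at the start
    } else if i == points.len() - 1 {
        (i - 1, i)        // backward difference at the end
    } else {
        (i - 1, i + 1)    // central difference: (v[i+1] - v[i-1]) / (t[i+1] - t[i-1])
    };
    let dt = times[hi] - times[lo];
    points[hi]
        .iter()
        .zip(&points[lo])
        .map(|(a, b)| (a - b) / dt)
        .collect()
}
```

Acceleration would apply the second central difference over the same stored points.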

FR-06: Trajectory Prediction (Extrapolation)

| Attribute | Specification |
| --- | --- |
| Input | (entity_id, target_timestamp, confidence_level) |
| Output | PredictedPoint { vector, confidence_interval, uncertainty_per_dimension } |
| Method | Neural ODE solver (Dormand-Prince RK45) with learned f_θ |
| Fallback | If the Neural ODE is not available/trained: linear extrapolation from the last two points |
| Cold start | An entity needs ≥ 5 historical points for the Neural ODE. Below that: linear only. |
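The linear fallback is simple enough to sketch directly: continue the segment defined by the last two points. Function name and signature here are illustrative assumptions.

```rust
/// Linear extrapolation from the last two observed points
/// (the fallback when no Neural ODE model is loaded, or the
/// entity has fewer than 5 historical points).
fn extrapolate_linear(
    (t1, v1): (f64, &[f64]), // second-to-last point
    (t2, v2): (f64, &[f64]), // last point
    target_t: f64,
) -> Vec<f64> {
    // How many "last segments" forward (or backward) the target lies.
    let scale = (target_t - t2) / (t2 - t1);
    v2.iter()
        .zip(v1)
        .map(|(b, a)| b + (b - a) * scale)
        .collect()
}
```

Unlike the Neural ODE path, this produces no per-dimension uncertainty, so a PredictedPoint built from it would carry a degenerate confidence interval.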
| Attribute | Specification |
| --- | --- |
| Input (offline) | (entity_id, time_range, method: PELT, sensitivity) |
| Input (online) | Automatic during ingestion. Configurable per entity or globally. |
| Output | Vec<ChangePoint { timestamp, severity, drift_vector, method }> |
| PELT specifics | Penalty: BIC (default) or AIC. Minimum segment length configurable. |
| BOCPD specifics | Hazard function: constant (default). Prior: Normal-Inverse-Gamma. Threshold configurable. |
| Attribute | Specification |
| --- | --- |
| Input | (reference_entity, t_reference, t_target, k) |
| Semantics | "What entities at t_target occupied the same semantic role as reference at t_reference?" |
| Method | Compute displacement, project query, run snapshot kNN at t_target |
| Attribute | Specification |
| --- | --- |
| Input | (entity_id, t1, t2) |
| Output | DriftReport { magnitude, direction_cosines, affected_dimensions, rate_per_unit_time } |
| Metrics | Cosine distance, L2 distance, angular displacement |
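The three listed metrics can be computed over the two endpoint vectors as sketched below. The packaging into a tuple is an illustrative assumption; the actual DriftReport field layout is defined elsewhere.

```rust
/// Drift metrics between vector a (at t1) and vector b (at t2):
/// returns (cosine distance, L2 distance, angular displacement in radians).
fn drift_metrics(a: &[f32], b: &[f32]) -> (f32, f32, f32) {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    // Clamp guards against tiny floating-point overshoot before acos.
    let cos = (dot / (na * nb)).clamp(-1.0, 1.0);
    let cosine_distance = 1.0 - cos;
    let l2 = a
        .iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt();
    let angular = cos.acos();
    (cosine_distance, l2, angular)
}
```

Dividing any of these by (t2 - t1) gives the rate_per_unit_time field of the report.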
| Attribute | Specification |
| --- | --- |
| Input | (entity_a, entity_b, time_range) |
| Output | Pairwise distance time series + detected divergence points (via PELT on the distance series) |
  • Manual trigger for compaction: POST /v1/admin/compact
  • View tier statistics: GET /v1/admin/stats
  • Configure tier thresholds at runtime: PUT /v1/admin/config
  • Rebuild index from storage: POST /v1/admin/reindex
  • Get index statistics (node count, edge count, layer distribution): GET /v1/admin/index/stats
  • Health check: GET /v1/health → { status: "ok", uptime, version }
  • Readiness probe: GET /v1/ready → 200 when WAL recovery complete and index loaded
| Attribute | Specification |
| --- | --- |
| Protocol | gRPC server streaming |
| Input | WatchRequest { entity_filter: Option<Vec<u64>>, min_severity: f64 } |
| Output | Stream of DriftEvent { entity_id, timestamp, severity, drift_vector } |
| Behavior | The client subscribes; the server pushes events as BOCPD detects them |

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Ingest throughput | ≥ 50K vectors/sec (D=768, single node) | Benchmark with synthetic stream |
| Snapshot kNN latency p50 | < 2 ms (1M vectors) | Benchmark with random queries |
| Snapshot kNN latency p99 | < 10 ms (1M vectors) | Same as above |
| Trajectory retrieval | < 1 ms + 0.1 ms/point | Benchmark with varying trajectory lengths |
| Prediction latency | < 50 ms (single entity) | Benchmark with trained Neural ODE |
| CPD (PELT) offline | < 1 s for a 100K-point trajectory | Benchmark on synthetic + real data |
| Cold start (empty → serving) | < 5 s (1M vectors pre-loaded) | Measure from process start to first query |
| Dimension | Phase 1-4 Target | Phase 5 Target |
| --- | --- | --- |
| Total vectors | 100M (single node) | 10B (distributed) |
| Entities | 10M | 1B |
| Vector dimensions | Up to 4096 | Same |
| Concurrent queries | 1,000 QPS | 10,000 QPS (cluster) |
| Concurrent ingest streams | 100 | 1,000 (cluster) |
  • Durability: WAL fsync before ack. Crash recovery replays WAL from last committed offset.
  • Consistency model: Single-node: linearizable (single writer to index + storage). Distributed: eventual consistency for reads from followers; linearizable for writes via Raft leader.
  • Data loss window: Zero (WAL is authoritative).
  • Single node: Process crash → automatic restart via supervisor (systemd). Recovery time < cold start time.
  • Distributed (Phase 5): Raft-based replication. Tolerate 1 node failure per shard (3-replica minimum).
  • Configuration: Single TOML file + environment variable overrides.
  • Observability: Prometheus metrics endpoint, OpenTelemetry traces, structured JSON logging.
  • Upgrade: Graceful shutdown drains in-flight requests. Storage format versioned for backward compatibility.
  • Backup: Hot backup via RocksDB checkpoint. Cold tier already on object store (inherently backed up).

| Type | Storage Size (D=768) | Use Case |
| --- | --- | --- |
| FP32 | 3072 bytes | Default. Full precision for the hot tier. |
| FP16 | 1536 bytes | Warm tier option: 2x compression with <1% recall loss. |
| INT8 (scalar quantized) | 768 bytes | 4x compression. Good for large scale with moderate accuracy needs. |
| PQ (product quantized) | 8-64 bytes (configurable) | Cold tier. 50-400x compression. Lossy. |
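The INT8 tier can be illustrated with a per-vector symmetric scalar quantizer, sketched below under the assumption of a single max-abs scale per vector (the actual quantization scheme may differ, e.g. per-dimension scales).

```rust
/// Hypothetical INT8 scalar quantization: map each component
/// onto [-127, 127] using one per-vector scale factor.
fn quantize_i8(v: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = v.iter().map(|x| (x / scale).round() as i8).collect();
    (q, scale)
}

/// Approximate reconstruction from the quantized codes.
fn dequantize(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&x| x as f32 * scale).collect()
}
```

Each D=768 vector then needs 768 bytes of codes plus one f32 scale, matching the 4x compression figure in the table (to within the scale overhead).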

Metadata is schemaless (arbitrary string-to-value map). Common expected fields:

```json
{
  "source": "model_v3",
  "domain": "medical",
  "language": "en",
  "confidence": 0.95,
  "tags": ["cardiology", "ecg"]
}
```

Metadata is stored but not indexed in Phase 1-4. Metadata-based filtering (e.g., “kNN where domain=medical”) is deferred.

  • Timestamps are i64 representing microseconds since Unix epoch.
  • Negative timestamps are valid (pre-1970 data for historical corpora).
  • Timestamp resolution: microsecond. Two events for the same entity_id within the same microsecond are rejected (collision).
  • Timestamps must be strictly increasing per entity. Out-of-order ingestion returns an error, with an option to force (overwrite).
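The per-entity timestamp rules above amount to a small validation step at ingest time, sketched here (function name and error strings are illustrative assumptions):

```rust
/// Validate a new i64 microsecond timestamp against the entity's
/// last accepted timestamp. Negative values (pre-1970) are valid.
fn validate_timestamp(last_ts: Option<i64>, new_ts: i64) -> Result<(), &'static str> {
    match last_ts {
        Some(prev) if new_ts == prev => Err("collision: same microsecond for this entity"),
        Some(prev) if new_ts < prev => Err("out of order: use force to overwrite"),
        _ => Ok(()), // first point for the entity, or strictly increasing
    }
}
```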

| Method | Path | Description | Request Body / Params | Response |
| --- | --- | --- | --- | --- |
| POST | /v1/ingest | Batch insert | { points: [TemporalPoint] } | { receipts: [WriteReceipt] } |
| POST | /v1/query | Execute query | QueryRequest | QueryResponse |
| GET | /v1/entities/{id} | Entity timeline info | (none) | EntityTimeline |
| GET | /v1/entities/{id}/trajectory | Fetch trajectory | ?t1=&t2= | [TemporalPoint] |
| GET | /v1/entities/{id}/velocity | Vector velocity | ?t= | { velocity: [f32], magnitude: f32 } |
| GET | /v1/entities/{id}/changepoints | List change points | ?t1=&t2=&method= | [ChangePoint] |
| GET | /v1/health | Health check | (none) | { status, uptime, version } |
| GET | /v1/ready | Readiness probe | (none) | 200 or 503 |
| POST | /v1/admin/compact | Trigger compaction | { tier: "hot_to_warm" } | { status: "started" } |
| GET | /v1/admin/stats | System statistics | (none) | SystemStats |
```proto
service ChronosVector {
  rpc IngestStream (stream TemporalPoint) returns (stream WriteReceipt);
  rpc Query (QueryRequest) returns (QueryResponse);
  rpc QueryStream (QueryRequest) returns (stream ScoredResult);
  rpc WatchDrift (WatchRequest) returns (stream DriftEvent);
}
```
| HTTP | gRPC | Meaning |
| --- | --- | --- |
| 400 | INVALID_ARGUMENT | Malformed request, dimension mismatch, invalid timestamp |
| 404 | NOT_FOUND | Entity not found |
| 409 | ALREADY_EXISTS | Duplicate (entity_id, timestamp) with different vector hash |
| 422 | FAILED_PRECONDITION | Insufficient data for requested operation (e.g., prediction with <5 points) |
| 429 | RESOURCE_EXHAUSTED | Rate limit exceeded |
| 500 | INTERNAL | Unexpected error |
| 503 | UNAVAILABLE | Not ready (WAL recovery in progress, index loading) |

  1. Single developer initially. Architecture must be implementable incrementally by one person.
  2. No cloud vendor lock-in. Object store interface via object_store crate abstracts S3/GCS/Azure/local.
  3. Rust stable only (no nightly). Except for explicit SIMD if std::simd stabilizes.
  4. No Python dependency in runtime. burn and candle are pure Rust. No PyTorch/ONNX runtime.
  1. Embedding dimensions are fixed per entity after first insertion. Changing dimensions requires re-ingestion.
  2. Timestamps are provider-assigned (not server-assigned). The system trusts the producer’s clock.
  3. Vectors are normalized or unnormalized depending on the metric. CVX does not auto-normalize.
  4. The Neural ODE model (f_θ) is trained offline and loaded at startup. Online training is out of scope for Phase 1-4.

| Criterion | Evidence |
| --- | --- |
| Demonstrates advanced Rust proficiency | Unsafe SIMD kernels, async concurrency, trait-based architecture, zero-copy serialization |
| Solves a real, unmet need | No existing VDB treats time as a geometric dimension with drift analysis |
| Publishable as technical work | Sufficient novelty for a systems paper (ST-HNSW + delta encoding + Neural ODE integration) |
| Usable by others | Clean API, documentation, runnable benchmarks |
| Milestone | Definition of Done |
| --- | --- |
| M1: First kNN query | Ingest 1M vectors, execute snapshot kNN, recall ≥ 90% |
| M2: Temporal queries work | Range kNN, trajectory retrieval, velocity computation all passing integration tests |
| M3: Delta encoding saves storage | Measurable ≥ 3x storage reduction on a real embedding dataset (e.g., Wikipedia temporal) |
| M4: PELT detects known change points | On synthetic data with planted change points, F1 ≥ 0.85 |
| M5: Neural ODE predicts | On held-out trajectory data, prediction error below the linear extrapolation baseline |
| M6: API is production-ready | REST + gRPC serving, health checks, graceful shutdown, structured logging |

| Risk | Probability | Impact | Mitigation |
| --- | --- | --- | --- |
| ST-HNSW composite distance hurts recall | Medium | High | Benchmark α=1.0 (pure semantic) against vanilla HNSW as a baseline. If recall drops >5%, revisit ADR-002. |
| Delta encoding reconstruction too slow | Low | Medium | Tune keyframe interval K. Worst case: disable deltas and store full vectors. |
| Neural ODE training data insufficient for useful predictions | Medium | Low | The linear extrapolation fallback is always available. Neural ODE is a bonus, not a requirement. |
| RocksDB write amplification causes SSD wear | Low | Medium | Monitor the write-amplification ratio. Tune compaction. Use leveled compaction for read-heavy workloads. |
| Scope creep into distributed mode too early | High | High | Strict phase gating. No distributed code until single-node milestones M1-M6 are met. |