# Vision & Motivation
## Beyond Static Vectors

Standard vector databases treat embeddings as frozen snapshots. But in every domain where embeddings matter — NLP, clinical research, political analysis, AI agents — the interesting signal is in how vectors change over time.
CVX was built around one insight: time is not a filter, it’s a dimension.
## Three Motivating Problems

### 1. Clinical Signal in Temporal Drift

A single social media post tells you almost nothing about a user’s mental state. But the trajectory of their language over weeks — gradual vocabulary shifts, increasing negativity, changing topic patterns — reveals clinically meaningful signals.
CVX enables this by projecting user trajectories onto DSM-5 symptom anchors (depressed mood, anhedonia, worthlessness, etc.) and tracking proximity over time. On the eRisk dataset (1.36M posts, 2,285 users), anchor-projected temporal features achieve F1=0.744 — a 24% improvement over static embeddings.
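The anchor-projection idea can be sketched in a few lines. This is an illustrative helper, not the CVX API: `anchor_projection` is a hypothetical name, and the projection here is plain cosine similarity between each trajectory point and each anchor vector.

```python
import numpy as np

def anchor_projection(trajectory, anchors):
    """Project each step of a trajectory onto reference anchor vectors.

    trajectory: (T, d) array of per-post embeddings, time-ordered.
    anchors:    (k, d) array of anchor embeddings (e.g. DSM-5 symptom texts).
    Returns a (T, k) array of cosine similarities: one interpretable
    time series per anchor.
    """
    t = trajectory / np.linalg.norm(trajectory, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return t @ a.T

# Toy example: a trajectory gradually drifting toward one anchor direction.
anchor = np.array([[1.0, 0.0, 0.0]])
steps = np.stack([[0.1 * i, 1.0, 0.0] for i in range(1, 6)])
sims = anchor_projection(steps, anchor)[:, 0]
# Proximity to the anchor increases monotonically over time.
assert np.all(np.diff(sims) > 0)
```

The per-anchor similarity series is what temporal features (drift, slope, change points) are then computed on.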
### 2. Episodic Memory for AI Agents

Standard RAG gives LLM agents access to facts — but not to how a problem was solved before. This maps to the cognitive science distinction between semantic memory (what you know) and episodic memory (what you’ve done and how it went).
CVX provides the temporal structure that standard vector stores lack:
- Episode identity: `entity_id = episode_id << 16 | step_index` — groups steps into episodes
- Temporal ordering: Timestamps define step order within episodes
- Causal continuation: “Find where someone was in my situation, show me what they did next”
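The bit-packing scheme above can be sketched directly; the helper names here are hypothetical, but the layout follows the stated formula: the low 16 bits hold the step index, the high bits the episode id.

```python
def pack_entity_id(episode_id: int, step_index: int) -> int:
    """Pack an episode id and step index into one entity id."""
    assert 0 <= step_index < (1 << 16), "step index must fit in 16 bits"
    return (episode_id << 16) | step_index

def unpack_entity_id(entity_id: int) -> tuple:
    """Recover (episode_id, step_index) from a packed entity id."""
    return entity_id >> 16, entity_id & 0xFFFF

# "What did they do next?" becomes entity_id + 1, as long as the next
# step stays within the same episode.
eid = pack_entity_id(episode_id=7, step_index=3)
assert unpack_entity_id(eid) == (7, 3)
assert unpack_entity_id(eid + 1) == (7, 4)
```

A consequence of this layout is that sorting by entity id sorts steps within an episode in temporal order for free.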
Validated experimentally: an ALFWorld agent with CVX episodic memory achieves 6× improvement in task completion over zero-shot (3.3% → 20.0%).
### 3. Rhetorical Evolution in Political Discourse

Parliamentary speech evolves over time — influenced by elections, crises, and social movements. CVX tracks rhetorical trajectories of individual MPs and political groups, using anchor projection onto rhetorical dimensions (emotional appeals, policy topics, personal attacks).
On ParlaMint-ES (32K speeches, 841 MPs), CVX rhetorical profiling predicts speaker gender with F1=0.94 and reveals that party affiliation drives rhetorical similarity more than gender.
## The Scientific Foundation

### Embedding Anisotropy

Modern contextual embeddings occupy a narrow cone in high-dimensional space (Ethayarajh 2019). All vectors share a dominant “average text” direction, compressing the discriminative signal into a small residual. CVX addresses this natively via mean-centering.
### Path Signatures

From rough path theory (Lyons 1998), path signatures provide universal, reparametrization-invariant descriptors of trajectories. CVX implements truncated path signatures as trajectory fingerprints — two trajectories with the same shape (regardless of speed) will have similar signatures.
### Change Point Detection

PELT (Killick et al. 2012) provides exact offline change point detection in O(N). BOCPD (Adams & MacKay 2007) provides online streaming detection in O(1) amortized. CVX implements both for identifying regime transitions in embedding trajectories.
### Temporal Vector Calculus

CVX computes velocity (dx/dt), acceleration (d²x/dt²), drift magnitude, and Hurst exponents on embedding trajectories — treating vector time series as differentiable curves in high-dimensional space.
## Design Principles

| Principle | Description |
|---|---|
| Time as Geometry | Time is a dimension of the search space, not a filter |
| Vectors as Trajectories | Entities are curves in embedding space, not points |
| Anchors as Coordinates | Domain-specific reference vectors create interpretable projections |
| Episodes as Sequences | Temporal structure enables “what happened next?” queries |
| Analytics over Storage | The value is in the 27+ temporal operations, not just kNN |