RFC-007: Advanced Temporal Primitives
See full RFC: design/CVX_RFC_007_Advanced_Temporal_Primitives.md
Summary
Five new temporal primitives that exploit CVX’s unique structural advantage: co-located temporal, entity, and vector data within a single index. These operations are impossible or prohibitively expensive in generic vector databases because they require cross-entity temporal reasoning.
| Primitive | What it answers |
|---|---|
| Temporal Join | “When were entities A and B semantically close simultaneously?” |
| Granger Causality | “Do A’s movements in embedding space predict B’s?” |
| Temporal Motifs | “Does entity A exhibit recurring semantic patterns?” |
| Cohort Drift | “How did a group of entities evolve collectively?” |
| Counterfactual Trajectories | “What would have happened if the trajectory diverged at change point C?” |
Current CVX analytics operate on single entities in isolation. Real-world questions are inherently multi-entity and temporal:
- Clinical NLP: “Did the patient’s language shift before or after their social network’s?” (contagion vs. isolation)
- Finance: “Do sector leaders’ embeddings Granger-cause followers?” (information flow)
- Social media: “How did the collective discourse of a community change after event X?” (cohort drift)
In Qdrant, cross-entity temporal analysis requires N+1 queries, client-side alignment, and external statistics. CVX’s entity_index enables O(log N) server-side cross-entity operations under a single read lock.
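To make the co-location argument concrete, here is a minimal sketch of a per-entity, time-sorted index. It is not CVX's actual `entity_index`: the class `EntityIndex` and its methods `insert`, `window`, and `cross_entity_window` are hypothetical names chosen for illustration. The point it shows is that once each entity's points are kept sorted by time, the boundaries of a time window can be located by binary search, and several entities can be read in one call without client-side alignment.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class EntityIndex:
    """Toy index (hypothetical): entity id -> time-sorted (timestamp, vector) points."""

    def __init__(self):
        self._times = defaultdict(list)    # entity -> sorted timestamps
        self._vectors = defaultdict(list)  # entity -> vectors, same order as timestamps

    def insert(self, entity, timestamp, vector):
        # Keep each entity's points sorted by time at insert.
        pos = bisect_right(self._times[entity], timestamp)
        self._times[entity].insert(pos, timestamp)
        self._vectors[entity].insert(pos, vector)

    def window(self, entity, start, end):
        """Points with start <= timestamp <= end; boundaries found by binary search."""
        times = self._times.get(entity, [])
        vectors = self._vectors.get(entity, [])
        lo = bisect_left(times, start)
        hi = bisect_right(times, end)
        return list(zip(times[lo:hi], vectors[lo:hi]))

    def cross_entity_window(self, entities, start, end):
        """Fetch the same time window for several entities in a single call."""
        return {entity: self.window(entity, start, end) for entity in entities}
```

A real implementation would presumably wrap `cross_entity_window` in the single read lock mentioned above; the sketch omits locking entirely.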
Key Algorithms
- Temporal Join: Merge-scan of sorted trajectories + sliding window distance
- Granger Causality: VAR(L) fitting on dimensionality-reduced trajectories (region distributions or anchor projections) + Fisher’s method for combining per-dimension p-values
- Temporal Motifs: Matrix Profile (STOMP) on region trajectories — O(N²×K) where K≈80
- Cohort Drift: Centroid drift + dispersion change + convergence score + outlier detection
- Counterfactual: Linear extrapolation from pre-change trajectory + divergence curve integration
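As an illustration of the first item in the list, the following is a minimal sketch of a merge-scan temporal join, under several assumptions not stated in the RFC summary: trajectories are lists of `(timestamp, vector)` pairs sorted by timestamp, distance is Euclidean, and points are paired greedily when their timestamps differ by at most `max_gap`. The function name `temporal_join` and the parameters `epsilon`, `max_gap`, and `min_points` are hypothetical.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def temporal_join(traj_a, traj_b, epsilon, max_gap, min_points=1):
    """Return (start, end) windows in which A and B stay within `epsilon`.

    traj_a, traj_b: lists of (timestamp, vector), sorted by timestamp.
    max_gap: largest timestamp difference for two points to count as simultaneous.
    min_points: minimum number of consecutive close pairs to report a window.
    """
    # Merge-scan: advance two cursors, pairing points that are close in time.
    i = j = 0
    matched = []  # (timestamp, distance) for each aligned pair
    while i < len(traj_a) and j < len(traj_b):
        (ta, va), (tb, vb) = traj_a[i], traj_b[j]
        if abs(ta - tb) <= max_gap:
            matched.append((max(ta, tb), euclidean(va, vb)))
            i += 1
            j += 1
        elif ta < tb:
            i += 1
        else:
            j += 1

    # Slide over the aligned pairs and collect maximal runs under the threshold.
    windows, run = [], []
    for t, d in matched:
        if d <= epsilon:
            run.append(t)
        else:
            if len(run) >= min_points:
                windows.append((run[0], run[-1]))
            run = []
    if len(run) >= min_points:
        windows.append((run[0], run[-1]))
    return windows
```

Because both inputs are already sorted, the scan touches each point once, so the cost is linear in the combined trajectory length. The choice of `epsilon` matters in practice: distances in a low-dimensional projection can be small, so a threshold tuned for one domain may match everything or nothing in another.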
Implementation Plan
Five phases, each a separate feature branch:
- feat/cohort-drift (P0): cohort-level drift with all statistics (20 tests)
- feat/temporal-join (P0): pairwise and group convergence windows (14 tests)
- feat/temporal-motifs (P1): Matrix Profile motif and discord discovery (15 tests)
- feat/granger-embeddings (P1): VAR fitting + F-test + direction detection (13 tests)
- feat/counterfactual (P2): linear extrapolation + Neural ODE (later) (11 tests)
Status: All 5 phases implemented and merged. 73 total tests.
Real-World Validation (eRisk, 466 users)
| Feature | Result | Notebook |
|---|---|---|
| Cohort Drift | Depression convergence score 4× higher than control (0.207 vs 0.051) | B4 |
| Temporal Motifs | Depressed users show 61% more motif occurrences (33 vs 20.5) | B5 |
| Granger Causality | Synthetic Trump→Market: bidirectional, F=398.5, p<1e-6, lag=1 day | B5 |
| Temporal Join | Needs per-domain epsilon tuning — anchor space distances are small | B4 |
| Counterfactual | Requires lower PELT penalty for low-dimensional projections (K=10) | B4 |