RFC-007: Advanced Temporal Primitives

See full RFC: design/CVX_RFC_007_Advanced_Temporal_Primitives.md

Summary

Five new temporal primitives that exploit CVX’s unique structural advantage: co-located temporal, entity, and vector data within a single index. These operations are impossible or prohibitively expensive in generic vector databases because they require cross-entity temporal reasoning.

Primitive	What it answers
Temporal Join	“When were entities A and B semantically close simultaneously?”
Granger Causality	“Do A’s movements in embedding space predict B’s?”
Temporal Motifs	“Does entity A exhibit recurring semantic patterns?”
Cohort Drift	“How did a group of entities evolve collectively?”
Counterfactual Trajectories	“What would have happened if the trajectory diverged at change point C?”

Why

Current CVX analytics operate on single entities in isolation. Real-world questions are inherently multi-entity and temporal:

Clinical NLP: “Did the patient’s language shift before or after their social network’s?” (contagion vs. isolation)
Finance: “Do sector leaders’ embeddings Granger-cause followers?” (information flow)
Social media: “How did the collective discourse of a community change after event X?” (cohort drift)

In Qdrant, cross-entity temporal analysis requires N+1 queries, client-side alignment, and external statistics. CVX’s entity_index enables O(log N) server-side cross-entity operations under a single read lock.
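
To make the co-location argument concrete, here is a toy sketch of an entity-keyed, time-sorted index. It is an illustration only: the `EntityIndex` class and its `insert` and `trajectory` methods are hypothetical names, not CVX's actual `entity_index` API. It shows why a time-window lookup for one entity can locate its boundaries with two binary searches.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class EntityIndex:
    """Toy index: entity_id -> time-sorted points (hypothetical, not CVX's real structure)."""

    def __init__(self):
        self._times = defaultdict(list)    # entity_id -> sorted timestamps
        self._vectors = defaultdict(list)  # entity_id -> vectors, parallel to _times

    def insert(self, entity_id, timestamp, vector):
        # Keep each trajectory sorted by timestamp so range lookups can binary-search.
        pos = bisect_right(self._times[entity_id], timestamp)
        self._times[entity_id].insert(pos, timestamp)
        self._vectors[entity_id].insert(pos, vector)

    def trajectory(self, entity_id, t_start, t_end):
        """Return [(timestamp, vector)] with t_start <= timestamp <= t_end."""
        times = self._times.get(entity_id, [])
        vectors = self._vectors.get(entity_id, [])
        lo = bisect_left(times, t_start)   # first point at or after t_start
        hi = bisect_right(times, t_end)    # one past the last point at or before t_end
        return list(zip(times[lo:hi], vectors[lo:hi]))


index = EntityIndex()
index.insert("a", 3, [0.3, 0.0])
index.insert("a", 1, [0.1, 0.0])
index.insert("a", 2, [0.2, 0.0])
index.insert("b", 2, [0.9, 0.1])
window = index.trajectory("a", 2, 3)
```

Because every entity's points are already sorted and held in one structure, a multi-entity operation can read several trajectories in one pass without client-side alignment.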

Key Algorithms

Temporal Join: Merge-scan of sorted trajectories + sliding window distance
Granger Causality: VAR(L) fitting on dimensionality-reduced trajectories (region distributions or anchor projections) + Fisher’s method for combining per-dimension p-values
Temporal Motifs: Matrix Profile (STOMP) on region trajectories — O(N²×K) where K≈80
Cohort Drift: Centroid drift + dispersion change + convergence score + outlier detection
Counterfactual: Linear extrapolation from pre-change trajectory + divergence curve integration
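
As an illustration of the Temporal Join entry above, the following is a minimal sketch of a merge-scan over two time-sorted trajectories. The function name `temporal_join` and the parameters `epsilon`, `max_gap`, and `min_points` are assumptions made for this example, not the CVX signature. The sliding-window distance step is simplified here to a scan for runs of consecutive aligned pairs whose Euclidean distance stays within `epsilon`.

```python
import math


def temporal_join(traj_a, traj_b, epsilon, max_gap=0, min_points=2):
    """Find time windows where two trajectories stay within epsilon of each other.

    traj_a, traj_b: lists of (timestamp, vector), sorted by timestamp.
    Returns a list of (start_timestamp, end_timestamp) windows.
    """
    # Merge-scan: advance two cursors, pairing points whose timestamps
    # differ by at most max_gap.
    i = j = 0
    aligned = []  # (timestamp, is_close)
    while i < len(traj_a) and j < len(traj_b):
        ta, va = traj_a[i]
        tb, vb = traj_b[j]
        if abs(ta - tb) <= max_gap:
            aligned.append((max(ta, tb), math.dist(va, vb) <= epsilon))
            i += 1
            j += 1
        elif ta < tb:
            i += 1
        else:
            j += 1

    # Collect maximal runs of close pairs that contain at least min_points pairs.
    windows = []
    run_start, run_len, last_t = None, 0, None
    for t, is_close in aligned:
        if is_close:
            if run_start is None:
                run_start, run_len = t, 0
            run_len += 1
            last_t = t
        else:
            if run_start is not None and run_len >= min_points:
                windows.append((run_start, last_t))
            run_start = None
    if run_start is not None and run_len >= min_points:
        windows.append((run_start, last_t))
    return windows


a = [(1, [0.0, 0.0]), (2, [0.0, 0.0]), (3, [0.0, 0.0]), (4, [0.0, 0.0]), (5, [0.0, 0.0])]
b = [(1, [1.0, 0.0]), (2, [0.1, 0.0]), (3, [0.05, 0.0]), (4, [2.0, 0.0]), (5, [0.1, 0.0])]
windows = temporal_join(a, b, epsilon=0.2)  # [(2, 3)]
```

The single close pair at timestamp 5 is dropped because it is shorter than `min_points`. The merge-scan touches each point once, so the cost is linear in the combined trajectory length.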

Implementation Plan

5 phases, each a separate feature branch:

feat/cohort-drift (P0) — cohort-level drift with all statistics — 20 tests
feat/temporal-join (P0) — pairwise and group convergence windows — 14 tests
feat/temporal-motifs (P1) — Matrix Profile motif and discord discovery — 15 tests
feat/granger-embeddings (P1) — VAR fitting + F-test + direction detection — 13 tests
feat/counterfactual (P2) — linear extrapolation + Neural ODE (later) — 11 tests

Status: All 5 phases implemented and merged. 73 total tests.

Real-World Validation (eRisk, 466 users)

Feature	Result	Notebook
Cohort Drift	Depression convergence score 4× higher than control (0.207 vs 0.051)	B4
Temporal Motifs	Depressed users show 61% more motif occurrences (33 vs 20.5)	B5
Granger Causality	Synthetic Trump→Market: bidirectional, F=398.5, p<1e-6, lag=1 day	B5
Temporal Join	Needs per-domain epsilon tuning — anchor space distances are small	B4
Counterfactual	Requires lower PELT penalty for low-dimensional projections (K=10)	B4
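
For reference, the counterfactual computation behind the last row (linear extrapolation from the pre-change trajectory, then integration of the divergence curve) can be sketched as follows. This is a simplified illustration under stated assumptions: the change point is passed in rather than detected with PELT, timestamps before the change point are distinct, and the names `fit_line` and `counterfactual_divergence` are hypothetical rather than CVX functions.

```python
import math


def fit_line(ts, ys):
    """Ordinary least-squares slope and intercept for one dimension."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    var = sum((t - t_mean) ** 2 for t in ts)  # nonzero when timestamps are distinct
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys)) / var
    return slope, y_mean - slope * t_mean


def counterfactual_divergence(trajectory, change_t):
    """Extrapolate the pre-change trend and measure how far the actual path departs.

    trajectory: list of (timestamp, vector), sorted by timestamp.
    Returns (curve, area): curve is [(timestamp, distance)] for post-change points,
    area is the trapezoidal integral of that curve over time.
    """
    pre = [(t, v) for t, v in trajectory if t < change_t]
    post = [(t, v) for t, v in trajectory if t >= change_t]
    if len(pre) < 2 or not post:
        raise ValueError("need at least two pre-change points and one post-change point")

    ts = [t for t, _ in pre]
    dims = len(pre[0][1])
    # One independent linear fit per embedding dimension.
    lines = [fit_line(ts, [v[d] for _, v in pre]) for d in range(dims)]

    curve = []
    for t, actual in post:
        predicted = [slope * t + intercept for slope, intercept in lines]
        curve.append((t, math.dist(actual, predicted)))

    # Trapezoid rule over consecutive post-change samples.
    area = sum((t1 - t0) * (d0 + d1) / 2
               for (t0, d0), (t1, d1) in zip(curve, curve[1:]))
    return curve, area


# 1-D toy trajectory: slope 1 before t=4, slope 3 afterwards.
traj = [(0, [0.0]), (1, [1.0]), (2, [2.0]), (3, [3.0]),
        (4, [4.0]), (5, [7.0]), (6, [10.0])]
curve, area = counterfactual_divergence(traj, change_t=4)
```

In this toy case the divergence grows from 0 at the change point to 4 two steps later, giving an integrated divergence of 4. On low-dimensional projections a change-point detector may need a lower penalty before it yields a usable `change_t`, which is the tuning issue noted in the table.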