Unified Theory: Temporal Vector Analytics for Intelligent Memory
1. The Core Thesis
Time converts vectors into trajectories. Trajectories have mathematical structure that static vectors cannot capture. Exploiting this structure produces better decisions than similarity-based retrieval alone.
ChronosVector (CVX) implements this thesis across seven layers, from raw storage to structural knowledge. Each layer builds on the previous, and no layer alone is sufficient.
2. Layer 0: Vectors in Time
The Fundamental Object
A temporal point $(t_i, \mathbf{v}_i)$ captures an entity observed at time $t_i$ with embedding $\mathbf{v}_i \in \mathbb{R}^d$. A trajectory is a time-ordered sequence:

$$T = \big((t_1, \mathbf{v}_1), (t_2, \mathbf{v}_2), \dots, (t_n, \mathbf{v}_n)\big), \qquad t_1 < t_2 < \cdots < t_n$$
Standard vector databases store only $\mathbf{v}_i$ — they discard $t_i$ and the sequence order. CVX preserves all three — embedding, timestamp, and position — enabling every subsequent layer.
Storage
Section titled “Storage”- HNSW index: approximate nearest neighbor search with SIMD distance kernels
- Temporal filtering: RoaringBitmap pre-filter by time range (< 1 byte per vector)
- Episode encoding: `entity_id = episode_id × 10000 + step_index` groups steps into episodes
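The episode encoding above can be sketched in a few lines; `EPISODE_BASE`, `encode`, and `decode` are illustrative names, not the CVX API:

```python
# Pack (episode_id, step_index) into one integer id, as described above.
# EPISODE_BASE = 10000 is the factor from the text.

EPISODE_BASE = 10_000

def encode(episode_id: int, step_index: int) -> int:
    assert 0 <= step_index < EPISODE_BASE, "step index must fit below the base"
    return episode_id * EPISODE_BASE + step_index

def decode(entity_id: int) -> tuple[int, int]:
    # divmod inverts the packing: quotient = episode, remainder = step
    return divmod(entity_id, EPISODE_BASE)

eid = encode(42, 7)
print(eid)          # → 420007
print(decode(eid))  # → (42, 7)
```

One consequence of this scheme is that all steps of an episode are contiguous in id space, so an episode can be scanned as a range.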
The Anisotropy Problem
Modern embedding models produce vectors in a narrow cone — all pairwise cosine similarities are high, crowding the discriminative signal into a small band. Centering (subtracting the global mean $\bar{\mathbf{v}}$) amplifies the discriminative signal 30×:

$$\tilde{\mathbf{v}}_i = \mathbf{v}_i - \bar{\mathbf{v}}, \qquad \bar{\mathbf{v}} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{v}_i$$
This is not optional — without centering, all downstream analytics operate on noise (Ethayarajh, EMNLP 2019; Su et al., ACL 2021).
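A toy illustration of why centering matters: two vectors that share a large mean offset look nearly identical under cosine similarity until the mean is removed. The 30× figure is from the text; this sketch only shows the direction of the effect.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Two embeddings sharing a large common offset (the "narrow cone"):
mean = [10.0, 10.0]
a = [10.0 + 1.0, 10.0]   # mean + unit deviation along x
b = [10.0, 10.0 + 1.0]   # mean + unit deviation along y

print(cosine(a, b))      # ≈ 0.995 despite orthogonal deviations

# Centering subtracts the shared mean and recovers the signal:
a_c = [x - m for x, m in zip(a, mean)]
b_c = [x - m for x, m in zip(b, mean)]
print(cosine(a_c, b_c))  # → 0.0
```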
3. Layer 1: Differential Calculus on Trajectories
Treating trajectories as differentiable curves in $\mathbb{R}^d$ enables kinematic analysis:
Velocity
$$\mathbf{u}_i = \frac{\mathbf{v}_{i+1} - \mathbf{v}_i}{t_{i+1} - t_i}$$

The magnitude $\lVert \mathbf{u}_i \rVert$ measures the rate of semantic change. High velocity = rapid behavioral shift.
Drift
Cumulative displacement from an initial state, $\mathbf{v}_i - \mathbf{v}_1$. When projected onto anchor vectors, drift measures proximity to domain-specific concepts (e.g., DSM-5 symptoms).
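Velocity and drift can be sketched as plain finite differences; the function names and the toy 2-D trajectory are illustrative, not the CVX implementation:

```python
def velocity(traj):
    """traj: list of (t, vector). Per-step finite-difference velocities."""
    out = []
    for (t0, v0), (t1, v1) in zip(traj, traj[1:]):
        dt = t1 - t0
        out.append([(b - a) / dt for a, b in zip(v0, v1)])
    return out

def drift(traj):
    """Displacement of each point from the initial state."""
    _, v_first = traj[0]
    return [[b - a for a, b in zip(v_first, v)] for _, v in traj]

traj = [(0.0, [0.0, 0.0]), (1.0, [1.0, 0.0]), (3.0, [1.0, 2.0])]
print(velocity(traj))   # → [[1.0, 0.0], [0.0, 1.0]]
print(drift(traj)[-1])  # → [1.0, 2.0]
```

Note the second velocity is halved relative to the raw increment because the step spans two time units — the division by $\Delta t$ is what makes this a rate, not just a difference.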
Hurst Exponent
$$\mathbb{E}\!\left[\frac{R(n)}{S(n)}\right] \sim C\, n^{H} \quad \text{as } n \to \infty$$

$H > 0.5$: trajectory is persistent (trends continue). $H < 0.5$: anti-persistent (trends reverse). $H = 0.5$: random walk.
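A minimal rescaled-range (R/S) estimator for the Hurst exponent; the window sizes and the least-squares fit are illustrative choices, not CVX's implementation:

```python
import math, random

def rescaled_range(xs):
    """R/S statistic of one window of increments xs."""
    n = len(xs)
    mean = sum(xs) / n
    dev, cum, mn, mx = 0.0, 0.0, 0.0, 0.0
    for x in xs:
        cum += x - mean                 # cumulative deviation from the mean
        mn, mx = min(mn, cum), max(mx, cum)
        dev += (x - mean) ** 2
    s = math.sqrt(dev / n)
    return (mx - mn) / s if s > 0 else 0.0

def hurst(increments, sizes=(8, 16, 32, 64)):
    """Slope of log(R/S) vs log(n), averaged over non-overlapping windows."""
    pts = []
    for n in sizes:
        rs = [rescaled_range(increments[i:i + n])
              for i in range(0, len(increments) - n + 1, n)]
        rs = [r for r in rs if r > 0]
        if rs:
            pts.append((math.log(n), math.log(sum(rs) / len(rs))))
    k = len(pts)                        # least-squares slope through pts
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    return (k * sxy - sx * sy) / (k * sxx - sx * sx)

random.seed(0)
steps = [random.gauss(0, 1) for _ in range(4096)]  # uncorrelated increments
print(hurst(steps))  # near 0.5 for a random walk (small-sample bias pushes it slightly up)
```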
Change Point Detection (PELT)
$$\min_{m,\; \tau_1 < \cdots < \tau_m} \; \sum_{k=0}^{m} \mathcal{C}\big(y_{\tau_k + 1 : \tau_{k+1}}\big) + \beta m$$
Detects moments when the statistical properties of the trajectory change abruptly — regime transitions, onset events, behavioral shifts (Killick et al., 2012).
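PELT minimizes total segment cost over many change points, with pruning that keeps the search linear-time; the sketch below illustrates only the cost-minimization idea, for a single change point in a mean-shift series:

```python
def best_single_changepoint(xs):
    """Choose the split minimizing summed within-segment squared deviation.
    PELT generalizes this to many change points with a per-point penalty
    and pruning; this sketch shows only the cost intuition."""
    def cost(seg):
        m = sum(seg) / len(seg)
        return sum((x - m) ** 2 for x in seg)
    best_k, best_c = None, float("inf")
    for k in range(1, len(xs)):
        c = cost(xs[:k]) + cost(xs[k:])
        if c < best_c:
            best_k, best_c = k, c
    return best_k

# A clean regime transition at index 20:
series = [0.0] * 20 + [5.0] * 20
print(best_single_changepoint(series))  # → 20
```

In practice one would use a library implementation of PELT rather than this quadratic scan.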
4. Layer 2: Algebraic Invariants
Section titled “4. Layer 2: Algebraic Invariants”Path Signatures (Lyons, 1998)
The depth-$N$ truncated signature of a path $X : [0, T] \to \mathbb{R}^d$ is:

$$S^{(N)}(X) = \left( \int_{0<s_1<\cdots<s_k<T} \mathrm{d}X^{i_1}_{s_1} \cdots \mathrm{d}X^{i_k}_{s_k} \right)_{1 \le k \le N,\; i_1,\dots,i_k \in \{1,\dots,d\}}$$
This is a universal, reparametrization-invariant descriptor: two trajectories with the same shape (regardless of speed) produce similar signatures. At depth 2, the features capture both displacement and signed area (rotation/order).
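For piecewise-linear paths the depth-2 signature can be computed directly from the iterated-integral definition; this standalone sketch is not tied to any signature library:

```python
def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path in R^d.
    Level 1: total increment. Level 2: iterated integrals S^{ij},
    accumulated with the standard update for each linear piece."""
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for p, q in zip(path, path[1:]):
        dx = [b - a for a, b in zip(p, q)]
        for i in range(d):
            for j in range(d):
                # integral of S1_i dX_j over this linear piece
                s2[i][j] += s1[i] * dx[j] + 0.5 * dx[i] * dx[j]
        for i in range(d):
            s1[i] += dx[i]
    return s1, s2

# L-shaped path: move right, then up
path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
s1, s2 = signature_depth2(path)
print(s1)                   # → [1.0, 1.0]  (total displacement)
print(s2[0][1] - s2[1][0])  # → 1.0  (twice the signed Lévy area; order-sensitive)
```

Traversing the same two segments in the opposite order flips the sign of the antisymmetric part while leaving level 1 unchanged — exactly the displacement-vs-order split described above.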
Topological Features
Persistent homology tracks the connectivity structure of trajectory point clouds across a growing family of Vietoris–Rips complexes:

$$\mathrm{VR}_{\epsilon_1}(X) \subseteq \mathrm{VR}_{\epsilon_2}(X) \subseteq \cdots \quad \text{for } \epsilon_1 < \epsilon_2 < \cdots$$
The persistence diagram encodes the birth/death of topological features across scales — more robust than single-scale clustering.
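Zero-dimensional persistence (connected components) reduces to minimum-spanning-tree edge lengths, which makes a self-contained sketch possible; higher-dimensional features (loops, voids) would require a proper TDA library:

```python
import math

def h0_persistence(points):
    """0-dimensional persistence: every component is born at scale 0 and
    dies when it merges into another — the death scales are exactly the
    edge lengths of a minimum spanning tree (single-linkage view)."""
    n = len(points)
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x
    deaths = []
    for w, i, j in edges:                  # Kruskal's MST scan
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(w)               # one component dies at this scale
    return deaths

# Two well-separated clusters → one long-lived feature in the diagram
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
print(sorted(h0_persistence(pts)))  # → [1.0, 1.0, 10.0]
```

The single large death value (10.0) is the multi-scale signal: the two-cluster structure persists across a wide range of scales, which a single-scale clustering threshold could miss.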
5. Layer 3: Distributional Geometry
When trajectories are projected onto regions or anchors, they become probability distributions over discrete states. The geometry of distributions requires specialized metrics:
Fisher-Rao Distance
The geodesic on the statistical manifold of categorical distributions:

$$d_{\mathrm{FR}}(p, q) = 2 \arccos\!\Big( \sum_i \sqrt{p_i q_i} \Big)$$
Unlike KL divergence, this is a true metric (symmetric, satisfies the triangle inequality). Range: $[0, \pi]$.
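The distance reduces to the arccosine of the Bhattacharyya coefficient, so the implementation is a one-liner plus a clamp for floating-point round-off:

```python
import math

def fisher_rao(p, q):
    """Fisher-Rao geodesic distance between categorical distributions:
    2 * arccos of the Bhattacharyya coefficient sum(sqrt(p_i * q_i))."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))  # clamp: bc can exceed 1 by epsilon

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(fisher_rao(p, p))  # → 0.0
print(fisher_rao(p, q))  # 2*acos(0.5) ≈ 2.094
```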
Wasserstein (Earth Mover’s) Distance
The optimal transport cost, respecting the geometry of the underlying space:

$$W_1(p, q) = \min_{\gamma \in \Gamma(p, q)} \sum_{i,j} \gamma_{ij}\, d(i, j)$$

where $\Gamma(p, q)$ is the set of couplings with marginals $p$ and $q$.
Unlike Fisher-Rao, Wasserstein accounts for how far apart the categories are — moving mass between nearby regions costs less than between distant ones.
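For distributions over ordered bins with unit spacing, the 1-D Wasserstein distance is simply the summed absolute gap between CDFs, which makes the contrast with Fisher-Rao easy to demonstrate:

```python
def wasserstein_1d(p, q):
    """W1 between distributions on ordered bins 0..n-1 with unit spacing:
    sum of absolute differences of the running CDFs."""
    total, cdf_gap = 0.0, 0.0
    for a, b in zip(p, q):
        cdf_gap += a - b
        total += abs(cdf_gap)
    return total

near = [1.0, 0.0, 0.0]
shift1 = [0.0, 1.0, 0.0]
shift2 = [0.0, 0.0, 1.0]
print(wasserstein_1d(near, shift1))  # → 1.0  (mass moved one bin)
print(wasserstein_1d(near, shift2))  # → 2.0  (same mass, twice as far)
```

Fisher-Rao assigns the same (maximal) distance to both comparisons, since the supports are equally disjoint; only Wasserstein distinguishes the near shift from the far one.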
6. Layer 4: Temporal Causality
Section titled “6. Layer 4: Temporal Causality”Temporal Edges
Each entity’s trajectory has an intrinsic order — step $i$ precedes step $i+1$. Temporal edges (`TemporalEdgeLayer`) encode this:
- `successor(node)` → what happened next for this entity
- `predecessor(node)` → what happened before
- `causal_search(query, k, temporal_context)` → find similar states + walk forward/backward
This enables the continuation pattern: “find where someone was in my situation, show me what they did next.”
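A toy sketch of the continuation pattern; `successor` and `causal_search` follow the names in the text, but the flat dictionaries stand in for the HNSW index and `TemporalEdgeLayer`:

```python
# node id → embedding (toy 1-D "states" for readability)
states = {0: 0.1, 1: 0.5, 2: 0.9, 3: 0.45}
# temporal edges: node → next node for the same entity
successor = {0: 1, 1: 2}

def causal_search(query, k=1):
    """Find the k most similar stored states, then walk one step forward."""
    ranked = sorted(states, key=lambda n: abs(states[n] - query))
    return [(n, successor.get(n)) for n in ranked[:k]]

# "find where someone was in my situation (≈0.49), show what they did next"
print(causal_search(0.49))  # → [(1, 2)]  (nearest state 1; its successor is 2)
```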
Typed Edges (RFC-013)
Beyond temporal succession, typed edges encode relational structure:
| Edge Type | Meaning | Example |
|---|---|---|
| `CausedSuccess` | This action contributed to a win | retrieve + follow → win → edge |
| `CausedFailure` | This action was present during failure | retrieve + follow → fail → edge |
| `SameActionType` | Same abstract action in different contexts | “navigate” in episode A ↔ “navigate” in episode B |
| `RegionTransition` | Observed movement between semantic clusters | Region 5 → Region 12 with probability 0.7 |
Granger Causality
For cross-entity causal discovery:
“Does entity A’s trajectory history improve prediction of entity B’s future?” — implemented as F-test on lagged regression residuals.
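A minimal sketch of the lagged-regression F-test on synthetic data — one lag, no intercept, hand-rolled OLS. Real implementations add intercepts, multiple lags, and proper p-values; this only shows the restricted-vs-unrestricted residual comparison:

```python
import random

def ols_rss(y, X):
    """Residual sum of squares of OLS y ~ X (no intercept).
    X is a list of columns; normal equations solved for 1 or 2 columns."""
    k = len(X)
    G = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(k)] for i in range(k)]
    c = [sum(a * b for a, b in zip(X[i], y)) for i in range(k)]
    if k == 1:
        beta = [c[0] / G[0][0]]
    else:  # Cramer's rule for the 2x2 system
        det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
        beta = [(c[0] * G[1][1] - c[1] * G[0][1]) / det,
                (G[0][0] * c[1] - G[1][0] * c[0]) / det]
    resid = [yi - sum(b * X[j][i] for j, b in enumerate(beta))
             for i, yi in enumerate(y)]
    return sum(r * r for r in resid)

def granger_f(a, b, lag=1):
    """F-statistic: does a's past improve prediction of b beyond b's own past?"""
    y, b_lag, a_lag = b[lag:], b[:-lag], a[:-lag]
    rss_r = ols_rss(y, [b_lag])           # restricted: b's own history only
    rss_u = ols_rss(y, [b_lag, a_lag])    # unrestricted: + a's history
    n, q, p = len(y), 1, 2                # q extra params, p total params
    return ((rss_r - rss_u) / q) / (rss_u / (n - p)) if rss_u > 0 else float("inf")

random.seed(1)
a = [random.gauss(0, 1) for _ in range(200)]
b = [0.0] + [0.8 * a[t - 1] + random.gauss(0, 0.1) for t in range(1, 200)]
print(granger_f(a, b) > 10)  # → True: a strongly Granger-causes b
```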
7. Layer 5: Probabilistic Reasoning
Section titled “7. Layer 5: Probabilistic Reasoning”Region MDP (RFC-013 Part A)
HNSW regions define a discrete state space. Observed trajectories define transitions. The result is a Markov Decision Process whose transition model is estimated from counts:

$$\hat{P}(s' \mid s, a) = \frac{\#\{s \xrightarrow{a} s'\}}{\#\{(s, a)\}}$$
This answers: “in states like mine, which action type has the highest success probability?”
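A toy Region-MDP lookup: empirical success counts per (region, action) with Laplace smoothing. The regions and action names are invented for illustration, not the CVX schema:

```python
from collections import defaultdict

counts = defaultdict(lambda: [0, 0])   # (region, action) → [wins, trials]

observations = [
    (5, "retrieve", True), (5, "retrieve", True), (5, "retrieve", False),
    (5, "explore", False), (5, "explore", True),
]
for region, action, win in observations:
    wins, trials = counts[(region, action)]
    counts[(region, action)] = [wins + int(win), trials + 1]

def best_action(region):
    """Action with the highest smoothed empirical success probability.
    Laplace smoothing keeps rarely tried actions from being ruled out."""
    cands = {a: (w + 1) / (t + 2)
             for (r, a), (w, t) in counts.items() if r == region}
    return max(cands, key=cands.get)

print(best_action(5))  # → retrieve  ((2+1)/(3+2)=0.6 beats (1+1)/(2+2)=0.5)
```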
Bayesian Network (cvx-bayes)
When variables have conditional dependencies that a linear scorer cannot capture:
The network factorizes the joint distribution via the DAG structure:

$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{Pa}(X_i)\big)$$
Each CPT is learned from observations with Laplace smoothing. Inference computes posteriors via variable elimination.
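A two-node sketch of CPT learning with Laplace smoothing (Context → Success). On a network this small exact inference is a table lookup, so variable elimination is not shown; contexts and counts are invented for illustration:

```python
from collections import Counter

data = [("kitchen", True), ("kitchen", True), ("kitchen", False),
        ("hall", False), ("hall", False)]

joint = Counter(data)                      # (context, outcome) counts
ctx_count = Counter(c for c, _ in data)    # context marginal counts

def p_success_given(ctx, alpha=1.0):
    """CPT entry P(success | ctx) with Laplace smoothing (2 outcomes)."""
    return (joint[(ctx, True)] + alpha) / (ctx_count[ctx] + 2 * alpha)

print(p_success_given("kitchen"))  # → 0.6   ((2+1)/(3+2))
print(p_success_given("hall"))     # → 0.25  ((0+1)/(2+2))
```

The smoothing term is what keeps the "hall" estimate strictly positive despite zero observed successes — the same reason it appears in the MDP and decay machinery.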
Context-Aware Decay
The discovery from E7c/E7d/E7e: blind reward decay destroys useful experts. Context-aware decay only penalizes experts when:
- Task type matches the failed game
- The agent actually followed the expert’s action
- The expert is in a low-quality region (informed by MDP)
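The three conditions above can be sketched as a single guard. Field names are illustrative, not the CVX schema, and the sketch assumes all three conditions must jointly hold, as the text implies:

```python
def maybe_decay(expert, failure, low_quality_regions, rate=0.9):
    """Return the expert's new weight after a failed episode."""
    if (expert["task_type"] == failure["task_type"]        # same game type
            and failure["followed_expert"]                 # advice was used
            and expert["region"] in low_quality_regions):  # MDP says risky
        return expert["weight"] * rate
    return expert["weight"]                                # otherwise untouched

expert = {"task_type": "heat", "region": 5, "weight": 1.0}
fail_followed = {"task_type": "heat", "followed_expert": True}
fail_ignored = {"task_type": "heat", "followed_expert": False}

print(maybe_decay(expert, fail_followed, {5}))  # → 0.9  (all conditions met)
print(maybe_decay(expert, fail_ignored, {5}))   # → 1.0  (advice wasn't followed)
```

The second call is the key difference from blind decay: an expert whose advice was not actually followed keeps its weight, so one unlucky episode cannot destroy it.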
8. Layer 6: Structural Knowledge
Section titled “8. Layer 6: Structural Knowledge”Knowledge Graph (cvx-graph)
Encodes compositional structure that neither vectors nor probabilities capture:
- Task plans: `heat_then_place` requires find → take → heat → take → put
- Shared sub-plans: `heat` and `clean` both start with find → take
- Constraints: after `take`, valid next actions are `go`/`use`/`put`, not `take` again
- Transfer: if I know how to find → take for cleaning, I can reuse it for heating
The graph enables structural guidance during retrieval: the agent knows what step comes next from the graph, and uses CVX to find the best concrete realization.
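Structural guidance can be sketched as a plan lookup plus a constraint check. The plan contents mirror the examples above, while the function and dictionary names are invented for illustration:

```python
plans = {
    "heat_then_place": ["find", "take", "heat", "take", "put"],
    "clean": ["find", "take", "clean", "put"],
}
# constraint from the text: after `take`, don't `take` again
forbidden_after = {"take": {"take"}}

def next_step(task, done):
    """Next abstract action in the plan, checked against constraints.
    Retrieval would then pick the best concrete realization of this step."""
    step = plans[task][len(done)]
    if done and step in forbidden_after.get(done[-1], set()):
        raise ValueError(f"constraint violated: {done[-1]} -> {step}")
    return step

print(next_step("heat_then_place", ["find", "take"]))  # → heat
print(next_step("clean", []))                          # → find
```

The shared `find → take` prefix of both plans is what the transfer bullet refers to: a realization learned under one task remains a valid candidate for the other.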
9. The Integrated System
These layers compose into a closed-loop active memory:
```
OBSERVATION → embed → HNSW search → candidates
      ↓
Bayesian scoring (Layer 5)
  ├── Similarity (Layer 0)
  ├── Recency (Layer 1)
  ├── Reward (Layer 4 — typed edges)
  ├── P(success | context) (Layer 5 — BN/MDP)
  └── Task plan step (Layer 6 — KG)
      ↓
LLM chooses action
      ↓
OUTCOME → update
  ├── Win → add to index (Layer 0)
  ├── Win → add CausedSuccess edges (Layer 4)
  ├── Fail → context-aware decay (Layer 5)
  ├── Update MDP transitions (Layer 5)
  ├── Update BN posteriors (Layer 5)
  └── Update KG if new structure learned (Layer 6)
```
Why No Layer Alone Is Sufficient
| Layer | What it provides | What it cannot do |
|---|---|---|
| 0 (HNSW) | Find similar states | Distinguish success from failure |
| 1 (Calculus) | Measure change speed | Predict next action |
| 2 (Signatures) | Compare trajectory shapes | Reason about task structure |
| 3 (Distributions) | Compare population-level patterns | Make individual decisions |
| 4 (Causality) | Attribute outcomes to actions | Estimate probabilities |
| 5 (Bayesian) | Compute conditional probabilities | Represent compositional knowledge |
| 6 (Knowledge) | Encode task structure | Score candidates numerically |
Each layer addresses a specific limitation of the layers below it. The full system is more than the sum of its parts.
10. Empirical Validation
| Experiment | What it tested | Result |
|---|---|---|
| B2 (clinical anchoring) | Layer 0+1: centered drift detection | F1=0.744 on eRisk depression |
| B8 (ParlaMint) | Layer 0+3: rhetorical profiling | F1=0.94 predicting speaker gender |
| E5 (ALFWorld GPT-4o) | Layer 0+4: causal retrieval | 20% → 43.3% task completion |
| E7b (online learning) | Layer 0+4+5: reward decay | 6.7% → 26.7% across 3 rounds |
| E7e (context-aware) | Layer 5: conditional decay | Peak 30%, plateau 19.5% (vs 14.8%) |
11. References
Temporal Embeddings & Anisotropy
- Ethayarajh (2019). “How Contextual are Contextualized Word Representations?” EMNLP
- Su et al. (2021). “Whitening Sentence Representations for Better Semantics and Faster Retrieval.” ACL
Path Signatures & Topology
- Lyons (1998). Differential Equations Driven by Rough Signals
- Carlsson (2009). “Topology and Data.” Bulletin of the AMS
Change Point Detection
- Killick et al. (2012). “Optimal detection of changepoints with a linear computational cost.” JASA
Information Geometry
- Amari & Nagaoka (2000). Methods of Information Geometry
Bayesian Networks
- Pearl (1988). Probabilistic Reasoning in Intelligent Systems
- Koller & Friedman (2009). Probabilistic Graphical Models
Knowledge Graphs
- Hogan et al. (2021). “Knowledge Graphs.” ACM Computing Surveys
Agent Memory
- Park et al. (2023). “Generative Agents.” UIST
- Shinn et al. (2023). “Reflexion.” NeurIPS
- Chen et al. (2021). “Decision Transformer.” NeurIPS
- Hafner et al. (2023). “DreamerV3.” arXiv
Causal Inference
- Bareinboim et al. (2022–2024). Causal Reinforcement Learning
Optimal Transport
- Villani (2008). Optimal Transport: Old and New