Episodic Memory for AI Agents
Semantic vs Episodic Memory
Section titled “Semantic vs Episodic Memory”Standard RAG gives agents access to facts (semantic memory). CVX provides episodic memory — the ability to recall how a problem was solved before, step by step.
| Capability | Standard Vector Store | CVX |
|---|---|---|
| Find similar states | Cosine search | Cosine + temporal distance |
| Episode identity | Not native | entity_id = episode_id * 10000 + step |
| ”What happened next?” | Not possible | causal_search() — successor edges |
| Filter by success | Post-hoc | search_with_reward(min_reward=0.5) — bitmap pre-filter |
Episode Data Model
Section titled “Episode Data Model”Each step is a TemporalPoint with encoded episode identity:
import chronos_vector as cvximport numpy as npnp.random.seed(42)
index = cvx.TemporalIndex(m=16, ef_construction=100)
# 20 episodes, 5 steps each, 60% success rateepisodes = []for ep in range(20): success = np.random.random() > 0.4 reward = 1.0 if success else 0.0 state = np.random.randn(32).astype(np.float32) * 0.5 nodes = [] for step in range(5): state = state + np.random.randn(32).astype(np.float32) * 0.1 eid = ep * 10000 + step nid = index.insert(eid, eid, state.tolist()) nodes.append(nid) for nid in nodes: index.set_reward(nid, reward) episodes.append({'id': ep, 'success': success, 'nodes': nodes})
successes = sum(1 for e in episodes if e['success'])print(f"{len(episodes)} episodes ({successes} successful), {len(index)} points")20 episodes (13 successful), 100 pointsReward-Filtered Search
Section titled “Reward-Filtered Search”Only retrieve experiences that worked:
query = np.random.randn(32).astype(np.float32) * 0.5results = index.search_with_reward(query.tolist(), k=3, min_reward=0.5)for eid, ts, score in results: ep_id, step = eid // 10000, eid % 10000 print(f" Episode {ep_id} step {step}: score={score:.3f}") Episode 12 step 3: score=12.441 Episode 7 step 2: score=13.205 Episode 15 step 4: score=14.112💡 Bitmap pre-filtering
search_with_rewarduses a RoaringBitmap to pre-filter nodes during HNSW traversal — not post-filtering. This is O(1) per node, same cost as temporal filtering.
Causal Search: “What Happened Next?”
Section titled “Causal Search: “What Happened Next?””The most powerful pattern — find similar states and return the continuation:
results = index.causal_search( vector=query.tolist(), k=3, temporal_context=3, # walk 3 steps forward/backward)for r in results: ep_id = r['entity_id'] // 10000 step = r['entity_id'] % 10000 print(f"Match: Episode {ep_id} step {step} (score={r['score']:.3f})") print(f" {len(r['successors'])} successors, {len(r['predecessors'])} predecessors")Match: Episode 12 step 3 (score=11.283) 1 successors, 3 predecessorsMatch: Episode 7 step 2 (score=12.910) 2 successors, 2 predecessorsMatch: Episode 15 step 4 (score=13.557) 0 successors, 3 predecessorsEach match shows the step that was found (0) plus the steps that came before (-) and after (+) in the same episode.
Hybrid Search
Section titled “Hybrid Search”Beam search exploring both semantic neighbors (HNSW graph edges) and temporal neighbors (predecessor/successor edges):
With : pure semantic. With : aggressive temporal exploration.
results = index.hybrid_search(query.tolist(), k=5, beta=0.3)for eid, ts, score in results: print(f" Episode {eid//10000} step {eid%10000}: score={score:.3f}")Agent Loop (Pseudocode)
Section titled “Agent Loop (Pseudocode)”def agent_step(observation, index, embed_fn, llm_fn): embedding = embed_fn(observation)
# Find successful past states similar to mine memories = index.causal_search( vector=embedding, k=3, temporal_context=5 )
# Extract continuations as LLM context context = [{ "similarity": m["score"], "next_steps": [vec for _, _, vec in m["successors"]], } for m in memories]
# LLM decides action action = llm_fn(observation, context) result, reward = env.step(action)
# Store experience for future retrieval node_id = index.insert( entity_id=current_episode * 10000 + current_step, timestamp=current_episode * 10000 + current_step, vector=embedding, reward=reward, ) return action, resultValidated Results
Section titled “Validated Results”| Experiment | Task | Baseline | CVX Memory |
|---|---|---|---|
| E1 | Code generation (HumanEval) | — | 77.8% pass@1 |
| E3 | Interactive RL (ALFWorld) | 3.3% | 20.0% (6x) |
| E4 | Iterative debugging (APPS) | 28.0% | 31.0% |