RFC-009: LLM Tool Integration
See full RFC: design/CVX_RFC_009_LLM_Tool_Integration.md
Summary
Transform CVX into a first-class tool for Large Language Models. LLMs can reason about time, entities, and change, but they cannot compute temporal analytics. CVX provides the computational substrate that turns an LLM from a static knowledge system into a temporal reasoning engine.
| Layer | What | Who uses it |
|---|---|---|
| MCP Server | Model Context Protocol adapter (JSON-RPC over stdio) | Claude Code, Claude Desktop, any MCP client |
| LLM-Optimized API | Endpoints returning structured, interpretable summaries | Any LLM via function calling / tool use |
| Inline Embedding | Accept text, embed internally — no vector passing | Reduces tool calls from 3 to 1 |
| Agentic Workflows | Multi-step reasoning patterns combining CVX tools | Autonomous monitoring agents |
The Temporal Reasoning Gap
| LLMs excel at | LLMs cannot |
|---|---|
| Understanding descriptions of change | Compute whether drift actually occurred |
| Reasoning about causality | Detect change points in 768-dim trajectories |
| Generating hypotheses | Quantify drift magnitude or direction |
| Natural language synthesis | Compare trajectories across entities |
Standard RAG answers “what documents are similar?” Temporal RAG (CVX) answers “how has the discourse changed since then?”
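To make the gap concrete, here is a minimal sketch of the kind of computation the right-hand column refers to: the L2 drift between an entity's first and last embeddings, and its percentile rank among a group of entities. The function names and the toy 3-dimensional vectors are illustrative assumptions, not CVX's API.

```python
import math

def l2_drift(start, end):
    """Euclidean distance between an entity's first and last embedding."""
    return math.sqrt(sum((e - s) ** 2 for s, e in zip(start, end)))

def percentile_rank(value, population):
    """Percentage of population values strictly below `value`."""
    below = sum(1 for v in population if v < value)
    return 100.0 * below / len(population)

# Toy trajectories (hypothetical data): entity -> (first embedding, last embedding).
trajectories = {
    "a": ([0.0, 0.0, 0.0], [3.0, 4.0, 0.0]),
    "b": ([1.0, 1.0, 1.0], [1.0, 1.0, 2.0]),
    "c": ([0.0, 2.0, 0.0], [0.0, 2.0, 0.0]),
    "d": ([0.0, 0.0, 0.0], [0.0, 0.0, 2.0]),
}
drifts = {name: l2_drift(s, e) for name, (s, e) in trajectories.items()}
rank_a = percentile_rank(drifts["a"], list(drifts.values()))
```

An LLM can interpret a result such as "entity a drifted more than 75% of the group", but it needs a tool to produce the number in the first place.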
Why MCP?
- Anthropic-native: Claude Code and Claude Desktop support MCP
- Zero HTTP overhead: direct stdio communication
- Tool discovery: LLM sees all tools with schemas at conversation start
- Streaming: results stream back as computed
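As an illustration of the stdio transport, the sketch below frames a single tool invocation. MCP is built on JSON-RPC 2.0 and invokes tools through the `tools/call` method; over stdio, each message is one line of JSON. The argument names (`query`, `start`, `end`) are assumptions for illustration; the full RFC defines the actual schemas.

```python
import json

def frame_tool_call(request_id, tool_name, arguments):
    """Serialize an MCP `tools/call` request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps escapes any newline inside string values, so the
    # serialized message always occupies exactly one line.
    return json.dumps(message) + "\n"

line = frame_tool_call(
    1,
    "cvx_search",
    # Hypothetical arguments, not taken from the RFC.
    {"query": "sleep problems", "start": "2024-01-01", "end": "2024-03-31"},
)
```

A client would write `line` to the server's stdin and read the matching response, identified by the same `id`, from its stdout.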
Tool Definitions
8 tools designed for LLM consumption:
| Tool | Purpose |
|---|---|
| `cvx_search` | Temporal RAG: find similar content within time windows |
| `cvx_entity_summary` | High-level temporal overview of an entity |
| `cvx_drift_report` | Quantify semantic change with anchor-projected interpretation |
| `cvx_detect_anomalies` | Proactive scan for unusual changes across entities |
| `cvx_compare_entities` | Cross-entity temporal analysis (convergence, Granger, correlation) |
| `cvx_cohort_analysis` | Group-level drift, convergence, and treatment effect evaluation |
| `cvx_forecast` | Trajectory prediction with interpretable output |
| `cvx_ingest` | Add new temporal data (accepts text or vectors) |
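For orientation, an MCP tool is advertised to the client as a `name`, a `description`, and a JSON Schema under `inputSchema`. The sketch below shows what one such definition might look like, together with a small check for missing required arguments. The parameters (`entity_id`, `start`, `end`, `anchors`) are assumptions, not the schema from the RFC.

```python
# Hypothetical definition for one tool. The keys `name`, `description`, and
# `inputSchema` follow MCP; the parameters themselves are illustrative.
CVX_DRIFT_REPORT = {
    "name": "cvx_drift_report",
    "description": "Quantify semantic change with anchor-projected interpretation",
    "inputSchema": {
        "type": "object",
        "properties": {
            "entity_id": {"type": "string"},
            "start": {"type": "string", "description": "ISO 8601 timestamp"},
            "end": {"type": "string", "description": "ISO 8601 timestamp"},
            "anchors": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["entity_id", "start", "end"],
    },
}

def missing_arguments(tool, arguments):
    """Return the required parameters absent from `arguments`, in schema order."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]

missing = missing_arguments(CVX_DRIFT_REPORT, {"entity_id": "user_42"})
```

Because the schema travels with the tool, the LLM can see which arguments it still needs to supply before making the call.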
Design Principle: Structured Narratives, Not Raw Data
Tools return interpretable summaries that fit in a few hundred tokens:
```json
{
  "drift": {
    "l2_magnitude": 0.87,
    "percentile": 92,
    "interpretation": "Drifted more than 92% of entities"
  },
  "anchor_projection": {
    "depression": {"direction": "closer", "change": -0.15},
    "recovery": {"direction": "farther", "change": 0.22}
  },
  "suggested_next": [
    {"tool": "cvx_detect_anomalies", "description": "Check for change points"}
  ]
}
```

The `suggested_next` field guides the LLM's reasoning chain.
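A client-side sketch of how `suggested_next` might be consumed: it reads the suggestions out of a response and skips tools that have already been called. The helper name `next_tools` and the `already_called` parameter are hypothetical.

```python
import json

# The response shown above, abbreviated to the fields this sketch reads.
response = json.loads("""
{
  "drift": {"l2_magnitude": 0.87, "percentile": 92,
            "interpretation": "Drifted more than 92% of entities"},
  "suggested_next": [
    {"tool": "cvx_detect_anomalies", "description": "Check for change points"}
  ]
}
""")

def next_tools(report, already_called=()):
    """Names of suggested follow-up tools that have not been called yet."""
    return [
        step["tool"]
        for step in report.get("suggested_next", [])
        if step["tool"] not in already_called
    ]

pending = next_tools(response)
done = next_tools(response, already_called={"cvx_detect_anomalies"})
```

Tracking the tools already called keeps an agent from looping when two tools suggest each other.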
Agentic Workflow Patterns
4 documented multi-step patterns:
- Longitudinal Monitor — periodic anomaly scan → investigate → report
- Comparative Investigation — entity summaries → cross-entity analysis → explain relationships
- Cohort Treatment Evaluation — cohort drift → outlier identification → forecast
- Temporal RAG Chain — search now → search then → drift report → synthesize narrative
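The last pattern can be sketched as a plain function over a caller-supplied `call_tool(name, arguments)` callable. That callable, the argument names, and the returned shape are all assumptions for illustration; only the tool names and the order of steps come from the list above.

```python
def temporal_rag_chain(call_tool, query, entity_id, then, now):
    """Run the Temporal RAG Chain: search now, search then, drift report.

    `call_tool(name, arguments)` invokes a CVX tool and returns its parsed
    result. All argument names here are hypothetical.
    """
    recent = call_tool("cvx_search", {"query": query, "at": now})
    past = call_tool("cvx_search", {"query": query, "at": then})
    drift = call_tool(
        "cvx_drift_report", {"entity_id": entity_id, "start": then, "end": now}
    )
    # The final step, synthesizing a narrative, is left to the LLM, which
    # receives these three structured results as context.
    return {"recent": recent, "past": past, "drift": drift}

# Stand-in for a real MCP client: records each call and echoes its input.
calls = []

def fake_call_tool(name, arguments):
    calls.append(name)
    return {"tool": name, "arguments": arguments}

context = temporal_rag_chain(
    fake_call_tool, "sleep problems", "user_42", then="2024-01-01", now="2024-06-01"
)
```

Passing `call_tool` in as a parameter keeps the workflow independent of the transport, so the same chain can run against an MCP client, a REST client, or a test double.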
Implementation Plan
4 phases:
- `feat/mcp-server` (P0): new `cvx-mcp` crate, JSON-RPC over stdio, 8 tool definitions; 14 tests
- `feat/llm-api-layer` (P0): composite endpoints (`entity_summary`, `anomaly_scan`); 2 REST endpoints
- `feat/inline-embeddings` (P1): `Embedder` trait, `MockEmbedder`, ONNX/API stubs; 9 tests
- `feat/agentic-patterns` (P2): workflow documentation, LLM integration guide
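To illustrate the inline-embedding phase, here is a Python analogue of an embedder interface with a deterministic mock. The `embed` method name, the hash-based mock, and the `ingest` helper are assumptions; the actual `Embedder` trait and `MockEmbedder` are defined in the CVX codebase and may differ.

```python
import hashlib
from typing import List, Protocol

class Embedder(Protocol):
    """Python analogue of the `Embedder` trait; the method name is an assumption."""

    def embed(self, text: str) -> List[float]: ...

class MockEmbedder:
    """Deterministic stand-in for tests: derives a vector from a SHA-256 digest."""

    def __init__(self, dim: int = 8):
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
        self.dim = dim

    def embed(self, text: str) -> List[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Map the first `dim` bytes to floats in [0, 1].
        return [b / 255.0 for b in digest[: self.dim]]

def ingest(embedder: Embedder, entity_id: str, timestamp: str, text: str) -> dict:
    """Accept text and embed it internally, so the caller never handles a vector."""
    return {"entity_id": entity_id, "timestamp": timestamp, "vector": embedder.embed(text)}

record = ingest(MockEmbedder(), "user_42", "2024-06-01", "slept badly again")
```

With embedding handled behind the interface, a single ingest call carries text from the LLM directly into the store.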
Status: All 4 phases implemented and merged. 23 tests + 2 new endpoints + docs.
Python bindings for RFC-007 added (6 new PyO3 functions). Validated in notebooks B4 and B5.