
RFC-009: LLM Tool Integration

See full RFC: design/CVX_RFC_009_LLM_Tool_Integration.md

Transform CVX into a first-class tool for Large Language Models. LLMs can reason about time, entities, and change — but they cannot compute temporal analytics. CVX provides the computational substrate that turns an LLM from a static knowledge system into a temporal reasoning engine.

| Layer | What | Who uses it |
| --- | --- | --- |
| MCP Server | Model Context Protocol adapter (JSON-RPC over stdio) | Claude Code, Claude Desktop, any MCP client |
| LLM-Optimized API | Endpoints returning structured, interpretable summaries | Any LLM via function calling / tool use |
| Inline Embedding | Accept text, embed internally — no vector passing | Reduces tool calls from 3 to 1 |
| Agentic Workflows | Multi-step reasoning patterns combining CVX tools | Autonomous monitoring agents |
| LLMs excel at | LLMs cannot |
| --- | --- |
| Understanding descriptions of change | Compute whether drift actually occurred |
| Reasoning about causality | Detect change points in 768-dim trajectories |
| Generating hypotheses | Quantify drift magnitude or direction |
| Natural language synthesis | Compare trajectories across entities |

Standard RAG answers “what documents are similar?” Temporal RAG (CVX) answers “how has the discourse changed since then?”

  • Anthropic-native: Claude Code and Claude Desktop support MCP
  • Zero HTTP overhead: direct stdio communication
  • Tool discovery: LLM sees all tools with schemas at conversation start
  • Streaming: results stream back as computed
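
As a rough illustration of the stdio transport, an MCP `tools/call` request is a JSON-RPC 2.0 message written to the server's stdin. A minimal sketch (the `entity_id` argument name is a hypothetical example, not the actual tool schema):

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request (JSON-RPC 2.0, sent over stdio)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical invocation; argument names are illustrative only.
msg = make_tool_call(1, "cvx_entity_summary", {"entity_id": "user-42"})
```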

8 tools designed for LLM consumption:

| Tool | Purpose |
| --- | --- |
| cvx_search | Temporal RAG — find similar content within time windows |
| cvx_entity_summary | High-level temporal overview of an entity |
| cvx_drift_report | Quantify semantic change with anchor-projected interpretation |
| cvx_detect_anomalies | Proactive scan for unusual changes across entities |
| cvx_compare_entities | Cross-entity temporal analysis (convergence, Granger, correlation) |
| cvx_cohort_analysis | Group-level drift, convergence, and treatment effect evaluation |
| cvx_forecast | Trajectory prediction with interpretable output |
| cvx_ingest | Add new temporal data (accepts text or vectors) |
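
For clients outside MCP, the same tools can be exposed through function calling. A sketch of what a `cvx_search` schema might look like — the parameter names (`query`, `start`, `end`, `top_k`) are assumptions for illustration, not the documented API:

```python
# Hypothetical function-calling schema for cvx_search; parameter names
# are illustrative stand-ins, not the actual CVX API surface.
cvx_search_schema = {
    "name": "cvx_search",
    "description": "Temporal RAG: find similar content within a time window.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Text to embed and search for"},
            "start": {"type": "string", "description": "Window start (ISO 8601)"},
            "end": {"type": "string", "description": "Window end (ISO 8601)"},
            "top_k": {"type": "integer", "description": "Number of results"},
        },
        "required": ["query"],
    },
}
```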

Design Principle: Structured Narratives, Not Raw Data


Tools return interpretable summaries that fit in a few hundred tokens:

```json
{
  "drift": {
    "l2_magnitude": 0.87,
    "percentile": 92,
    "interpretation": "Drifted more than 92% of entities"
  },
  "anchor_projection": {
    "depression": {"direction": "closer", "change": -0.15},
    "recovery": {"direction": "farther", "change": 0.22}
  },
  "suggested_next": [
    {"tool": "cvx_detect_anomalies", "description": "Check for change points"}
  ]
}
```

The suggested_next field guides the LLM’s reasoning chain.
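
A minimal sketch of how an agent could follow that chain — `call_tool` is a stand-in for whatever dispatch the agent uses (MCP client, HTTP, etc.), and the `args` hint field is an assumption:

```python
def follow_chain(call_tool, first_tool, args, max_steps=3):
    """Follow a chain of suggested_next hints returned by CVX tools.

    call_tool(name, args) is a placeholder for the agent's actual
    tool-dispatch mechanism.
    """
    results = []
    tool, tool_args = first_tool, args
    for _ in range(max_steps):
        result = call_tool(tool, tool_args)
        results.append(result)
        hints = result.get("suggested_next", [])
        if not hints:
            break  # no further suggestion: chain ends
        tool, tool_args = hints[0]["tool"], hints[0].get("args", {})
    return results
```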

4 documented multi-step patterns:

  1. Longitudinal Monitor — periodic anomaly scan → investigate → report
  2. Comparative Investigation — entity summaries → cross-entity analysis → explain relationships
  3. Cohort Treatment Evaluation — cohort drift → outlier identification → forecast
  4. Temporal RAG Chain — search now → search then → drift report → synthesize narrative
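
Pattern 4 can be sketched as three tool calls whose structured results the LLM then narrates. All argument names here (`query`, `start`, `end`, `entity`) are illustrative stand-ins, not the real CVX API:

```python
def temporal_rag_chain(call_tool, query, then, now, entity):
    """Sketch of the Temporal RAG Chain pattern:
    search in a past window, search in the current window,
    then quantify the change with a drift report.
    """
    past = call_tool("cvx_search",
                     {"query": query, "start": then["start"], "end": then["end"]})
    present = call_tool("cvx_search",
                        {"query": query, "start": now["start"], "end": now["end"]})
    drift = call_tool("cvx_drift_report", {"entity": entity})
    # The LLM synthesizes a narrative from these structured results.
    return {"past": past, "present": present, "drift": drift}
```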

4 phases:

  1. feat/mcp-server (P0) — new cvx-mcp crate, JSON-RPC over stdio, 8 tool definitions — 14 tests
  2. feat/llm-api-layer (P0) — composite endpoints (entity_summary, anomaly_scan) — 2 REST endpoints
  3. feat/inline-embeddings (P1) — Embedder trait, MockEmbedder, ONNX/API stubs — 9 tests
  4. feat/agentic-patterns (P2) — workflow documentation, LLM integration guide
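
The Embedder trait and MockEmbedder of phase 3 live in the Rust codebase; as a rough Python analog of the idea only — a swappable embedding interface with a deterministic mock, not the actual implementation:

```python
import hashlib

class MockEmbedder:
    """Deterministic stand-in embedder, analogous in spirit to the
    MockEmbedder behind the Embedder trait: hashes text into a
    fixed-size vector. Purely illustrative."""

    def __init__(self, dim=8):
        self.dim = dim

    def embed(self, text):
        digest = hashlib.sha256(text.encode()).digest()
        # Map digest bytes to floats in [0, 1], cycling if dim > 32.
        return [digest[i % len(digest)] / 255.0 for i in range(self.dim)]

emb = MockEmbedder(dim=4)
vec = emb.embed("hello")
```

The mock is deterministic, so tests that exercise the inline-embedding path get stable vectors without loading an ONNX model or calling an API.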

Status: All 4 phases implemented and merged. 23 tests + 2 new endpoints + docs.

Python bindings for RFC-007 added (6 new PyO3 functions). Validated in notebooks B4 and B5.