RFC-006: Anchor Projection
This RFC proposes Anchor Projection — a coordinate system transformation that re-expresses trajectories relative to user-defined reference points (anchors), turning opaque high-dimensional drift into interpretable, clinically meaningful signals.
Problem
All CVX analytics operate in absolute embedding space ($\mathbb{R}^D$, where $D$ is the embedding dimension). High-dimensional embeddings mask clinically meaningful drift:
| Issue | Impact |
|---|---|
| Curse of dimensionality | Distances concentrate — all pairs look equally far |
| No interpretability | "dim_127 changed by 0.3" conveys nothing clinical |
| Domain-agnostic axes | Embedding dimensions have no semantic meaning |
A user drifting toward depressive language and a user drifting toward sports talk may produce identical velocity magnitudes. Without a reference frame, the direction of drift is unreadable.
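A toy illustration of this point, using made-up 4-dimensional vectors rather than real embeddings: two users move the same distance from a shared starting point, so their velocity magnitudes match, yet only one of them gets closer to a reference point. The vectors, the `l2` helper, and the choice of L2 distance are all assumptions made for the sketch.

```python
import numpy as np

def l2(a, b):
    """Euclidean distance between two vectors (hypothetical helper)."""
    return float(np.linalg.norm(a - b))

# Made-up 4-D vectors; real embeddings have hundreds of dimensions.
anchor = np.array([1.0, 0.0, 0.0, 0.0])   # assumed reference point
start = np.array([0.0, 1.0, 0.0, 0.0])    # shared starting position

user_a = start + np.array([0.3, 0.0, 0.0, 0.0])  # moves along the anchor's axis
user_b = start + np.array([0.0, 0.0, 0.3, 0.0])  # moves along an unrelated axis

# Absolute-space view: both users moved equally far in one step.
speed_a = l2(user_a, start)
speed_b = l2(user_b, start)

# Anchor-relative view: change in distance to the reference point.
delta_a = l2(user_a, anchor) - l2(start, anchor)
delta_b = l2(user_b, anchor) - l2(start, anchor)

print(f"speeds: {speed_a:.3f} vs {speed_b:.3f}")
print(f"change in anchor distance: {delta_a:+.3f} vs {delta_b:+.3f}")
```

The speeds are indistinguishable, while the sign of the change in anchor distance separates the two users.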
Key Insight
The projected trajectory is itself a trajectory, so `velocity()`, `hurst_exponent()`, `detect_changepoints()`, and `path_signature()` all work on it natively. Anchor Projection is a coordinate system change, not a new analytics paradigm. It composes with the entire existing CVX toolkit.
Solution
`project_to_anchors(trajectory, anchors, metric)` projects from $\mathbb{R}^D$ to $\mathbb{R}^K$, where $K$ = number of anchors. Each dimension $k$ of the output equals the distance $d$ (cosine or L2) from the trajectory point $x_t$ to anchor $a_k$:

$$\tilde{x}_t = \big(d(x_t, a_1),\ d(x_t, a_2),\ \dots,\ d(x_t, a_K)\big) \in \mathbb{R}^K$$

The output is a new trajectory of the same length but in $\mathbb{R}^K$, which feeds directly into all existing CVX functions.
Example: With 3 anchors (depression centroid, anxiety centroid, neutral centroid), a 50-post trajectory in $\mathbb{R}^D$ becomes a 50-step trajectory in $\mathbb{R}^3$ where each dimension is the distance to a clinically interpretable reference point.
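The projection can be sketched in a few lines of NumPy. This is an illustrative stand-in, not the CVX implementation: it assumes cosine distance means one minus cosine similarity, that no input row is the zero vector, and that inputs are dense arrays. The random data below only stands in for real embeddings.

```python
import numpy as np

def project_to_anchors(trajectory, anchors, metric="cosine"):
    """Hypothetical NumPy stand-in for the proposed projection.

    trajectory: (T, D) array, anchors: (K, D) array -> (T, K) distances.
    """
    X = np.asarray(trajectory, dtype=float)
    A = np.asarray(anchors, dtype=float)
    if X.ndim != 2 or A.ndim != 2 or X.shape[1] != A.shape[1]:
        raise ValueError("expected (T, D) trajectory and (K, D) anchors")

    if metric == "l2":
        # Broadcast to (T, K, D), then take the norm over the embedding axis.
        return np.linalg.norm(X[:, None, :] - A[None, :, :], axis=-1)
    if metric == "cosine":
        # Assumption: cosine distance = 1 - cosine similarity.
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        An = A / np.linalg.norm(A, axis=1, keepdims=True)
        return 1.0 - Xn @ An.T
    raise ValueError(f"unknown metric: {metric!r}")

# 50 posts in an 8-D toy space, 3 anchors -> 50 steps in 3-D anchor space.
rng = np.random.default_rng(0)
toy_trajectory = rng.normal(size=(50, 8))
toy_anchors = rng.normal(size=(3, 8))
projected = project_to_anchors(toy_trajectory, toy_anchors)
print(projected.shape)
```

Column `k` of the result is the distance-to-anchor-`k` time series, so it can be read or plotted directly.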
New API
    # Project trajectory into anchor-relative coordinates
    projected = cvx.project_to_anchors(
        trajectory,        # (T, D) array: original trajectory
        anchors,           # (K, D) array: reference embeddings
        metric='cosine'    # 'cosine' | 'l2'
    )
    # Returns: (T, K) array, the trajectory in ℝᴷ

    # Summarize per-anchor dynamics
    summary = cvx.anchor_summary(projected)
    # Returns: dict per anchor with {mean, min, trend, last}

Composition with existing analytics
    # All existing CVX functions work on the projected trajectory:
    vel = cvx.velocity(projected, timestamps)          # velocity in ℝᴷ
    cps = cvx.detect_changepoints("user", projected)   # regime changes in anchor space
    H = cvx.hurst_exponent(projected)                  # persistence of anchor-relative drift
    sig = cvx.path_signature(projected, depth=2)       # signature over anchor distances

Implementation
- Module: `cvx-analytics::anchor`
- Dependencies: Uses existing `drift_magnitude_cosine` / `drift_magnitude_l2` from `cvx-analytics::drift`
- Core logic: For each time step $t$ and anchor $k$, compute the chosen distance metric between the trajectory point $x_t$ and the anchor $a_k$. Assemble into a $T \times K$ matrix.
- Complexity: $O(T \cdot K \cdot D)$, linear in all dimensions, no graph lookups needed
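The per-anchor aggregation listed under New API could look roughly like the sketch below. The RFC only names the fields `mean`, `min`, `trend`, and `last`. Two things here are assumptions: that `trend` is the least-squares slope of distance against step index, and that the dict is keyed by anchor index.

```python
import numpy as np

def anchor_summary(projected):
    """Hypothetical sketch: per-anchor statistics over a (T, K) projection."""
    P = np.asarray(projected, dtype=float)
    if P.ndim != 2 or P.shape[0] < 2:
        raise ValueError("expected a (T, K) array with at least two time steps")

    steps = np.arange(P.shape[0])
    # np.polyfit accepts a (T, K) y-array and fits each column independently;
    # row 0 of the result holds the slopes (highest-degree coefficient first).
    slopes = np.polyfit(steps, P, deg=1)[0]

    return {
        k: {
            "mean": float(P[:, k].mean()),
            "min": float(P[:, k].min()),
            "trend": float(slopes[k]),   # assumed definition: linear slope per step
            "last": float(P[-1, k]),
        }
        for k in range(P.shape[1])
    }

# Anchor 0 is approached steadily; anchor 1 stays at a constant distance.
toy_projection = np.array([[1.0, 3.0],
                           [0.5, 3.0],
                           [0.0, 3.0]])
summary = anchor_summary(toy_projection)
print(summary[0]["trend"], summary[1]["trend"])
```

Under this definition a negative `trend` means the distance is shrinking, so the trajectory is drifting toward that anchor.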
Clinical Validation
On the eRisk 2018 depression detection task, anchor-relative features substantially improved classification:
| Configuration | Precision | Recall | F1 | AUC |
|---|---|---|---|---|
| B1 baseline (absolute space) | 0.667 | 0.545 | 0.600 | 0.639 |
| B2 combined (with anchor projection) | 0.714 | 0.857 | 0.781 | 0.863 |
| Improvement | +0.047 | +0.312 | +0.181 | +0.224 |
Anchor projection makes drift toward or away from known clinical poles directly measurable, providing the reference frame that raw embeddings lack.
Phases
| Phase | Scope | Effort |
|---|---|---|
| 1 | project_to_anchors in Rust (cvx-analytics::anchor) | Low |
| 2 | anchor_summary aggregation | Low |
| 3 | Python bindings via PyO3 | Low |
| 4 | Integration tests with eRisk data | Medium |
| 5 | Tutorial B2 with anchor-projected analytics | Medium |