
RFC-006: Anchor Projection

This RFC proposes Anchor Projection — a coordinate system transformation that re-expresses trajectories relative to user-defined reference points (anchors), turning opaque high-dimensional drift into interpretable, clinically meaningful signals.

All CVX analytics operate in absolute embedding space ($\mathbb{R}^D$, with $D = 384$ or $768$). High-dimensional embeddings mask clinically meaningful drift:

| Issue | Impact |
| --- | --- |
| Curse of dimensionality | Distances concentrate: all pairs look equally far |
| No interpretability | "dim_127 changed by 0.3" conveys nothing clinical |
| Domain-agnostic axes | Embedding dimensions have no semantic meaning |
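The distance-concentration row can be illustrated with a small experiment. The sketch below is not part of CVX; it assumes standard-normal random points and uses the relative contrast (max − min) / min of L2 distances from one query point, a common informal measure of how distinguishable near and far neighbours are. The function name `relative_contrast` is hypothetical.

```python
import numpy as np

def relative_contrast(dim, n_points=500, seed=0):
    """(max - min) / min of L2 distances from one query to n_points random points."""
    rng = np.random.default_rng(seed)
    points = rng.standard_normal((n_points, dim))
    query = rng.standard_normal(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# Contrast shrinks as the dimension grows: near and far points become
# hard to tell apart at embedding-sized dimensions.
for dim in (2, 384, 768):
    print(dim, round(relative_contrast(dim), 3))
```

Real embeddings are not isotropic Gaussians, so the exact numbers will differ, but the qualitative collapse of contrast at $D = 384$ or $768$ is the effect the table refers to.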

A user drifting toward depressive language and a user drifting toward sports talk may produce identical velocity magnitudes. Without a reference frame, the direction of drift is unreadable.
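A toy construction makes this concrete. In the sketch below (synthetic data, not CVX code), two hypothetical users start at the same point and take steps of identical length: one straight toward an anchor, the other in a direction orthogonal to it. Their per-step speeds are indistinguishable, while their distances to the anchor evolve very differently. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, T, step = 384, 10, 0.5

start = rng.standard_normal(D)
anchor = rng.standard_normal(D)  # stand-in for, e.g., a depression centroid

# User A moves straight toward the anchor.
dir_a = (anchor - start) / np.linalg.norm(anchor - start)
# User B moves in a direction orthogonal to dir_a (Gram-Schmidt on a random vector).
raw = rng.standard_normal(D)
dir_b = raw - np.dot(raw, dir_a) * dir_a
dir_b /= np.linalg.norm(dir_b)

steps = np.arange(T)[:, None]
traj_a = start + step * steps * dir_a
traj_b = start + step * steps * dir_b

# Per-step speed in absolute space: identical for both users.
speed_a = np.linalg.norm(np.diff(traj_a, axis=0), axis=1)
speed_b = np.linalg.norm(np.diff(traj_b, axis=0), axis=1)

# Distance to the anchor: shrinks for A, does not shrink for B.
dist_a = np.linalg.norm(traj_a - anchor, axis=1)
dist_b = np.linalg.norm(traj_b - anchor, axis=1)

print(np.allclose(speed_a, speed_b))
print(dist_a[0] - dist_a[-1], dist_b[0] - dist_b[-1])
```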

The projected trajectory is a trajectory — so velocity(), hurst_exponent(), detect_changepoints(), path_signature() all work on it natively. Anchor Projection is a coordinate system change, not a new analytics paradigm. It composes with the entire existing CVX toolkit.
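The composition claim rests on a simple observation: a function written for a (T, D) array does not care whether D is 384 or a handful of anchors. The sketch below uses a plain finite-difference velocity as a stand-in; it is an assumption for illustration, since this RFC does not define how `cvx.velocity` is computed. The anchor distances are synthetic.

```python
import numpy as np

def finite_difference_velocity(trajectory, timestamps):
    """Per-step rate of change of a (T, dims) trajectory; returns (T - 1, dims)."""
    trajectory = np.asarray(trajectory, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    dt = np.diff(timestamps)[:, None]  # (T - 1, 1), broadcast across columns
    return np.diff(trajectory, axis=0) / dt

timestamps = np.array([0.0, 1.0, 2.0, 4.0])
# Synthetic anchor distances, T = 4 steps and K = 2 anchors:
# distance to anchor 0 shrinks, distance to anchor 1 grows.
projected = np.array([
    [0.9, 0.2],
    [0.7, 0.3],
    [0.5, 0.4],
    [0.1, 0.6],
])

vel = finite_difference_velocity(projected, timestamps)
approaching = vel < 0  # negative rate means the distance to that anchor is falling
print(vel)
print(approaching)
```

In anchor space the sign of each velocity component is directly readable: negative means moving toward that anchor, positive means moving away.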

project_to_anchors(trajectory, anchors, metric) projects from $\mathbb{R}^D \to \mathbb{R}^K$, where $K$ is the number of anchors. Each dimension $k$ of the output equals the distance (cosine or L2) from the trajectory point to anchor $k$:

$$\text{projected}_t[k] = d(\mathbf{x}_t, \mathbf{a}_k), \quad k = 1, \ldots, K$$

The output is a new trajectory of the same length but in $\mathbb{R}^K$, which feeds directly into all existing CVX functions.

Example: With 3 anchors (depression centroid, anxiety centroid, neutral centroid), a 50-post trajectory in $\mathbb{R}^{384}$ becomes a 50-step trajectory in $\mathbb{R}^3$ where each dimension is the distance to a clinically interpretable reference point.
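The projection in this example can be sketched in NumPy. This is a reference sketch under stated assumptions, not the proposed Rust implementation: the name `project_to_anchors_sketch` is hypothetical, cosine distance is assumed to be 1 minus cosine similarity (the RFC does not spell out the formula), zero-norm vectors are not handled, and the embeddings are random stand-ins for real centroids.

```python
import numpy as np

def project_to_anchors_sketch(trajectory, anchors, metric="cosine"):
    """Map a (T, D) trajectory to a (T, K) matrix of distances to K anchors."""
    trajectory = np.asarray(trajectory, dtype=float)
    anchors = np.asarray(anchors, dtype=float)
    if trajectory.ndim != 2 or anchors.ndim != 2:
        raise ValueError("trajectory must be (T, D) and anchors must be (K, D)")
    if trajectory.shape[1] != anchors.shape[1]:
        raise ValueError("trajectory and anchors must share the embedding dimension D")

    if metric == "cosine":
        # Assumed definition: cosine distance = 1 - cosine similarity.
        t_unit = trajectory / np.linalg.norm(trajectory, axis=1, keepdims=True)
        a_unit = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
        return 1.0 - t_unit @ a_unit.T
    if metric == "l2":
        # Broadcast to (T, K, D), then take the Euclidean norm over D.
        diff = trajectory[:, None, :] - anchors[None, :, :]
        return np.linalg.norm(diff, axis=2)
    raise ValueError(f"unknown metric: {metric!r}")

# 50 posts in R^384 and 3 anchors, all random placeholders.
rng = np.random.default_rng(0)
trajectory = rng.standard_normal((50, 384))
anchors = rng.standard_normal((3, 384))

projected = project_to_anchors_sketch(trajectory, anchors, metric="cosine")
print(projected.shape)  # (50, 3)
```

Row $t$ of the result holds the distances from post $t$ to each of the three anchors, so column 0 alone traces how far the user is from the first centroid over time.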

```python
# Project trajectory into anchor-relative coordinates
projected = cvx.project_to_anchors(
    trajectory,       # (T, D) array: original trajectory
    anchors,          # (K, D) array: reference embeddings
    metric='cosine'   # 'cosine' | 'l2'
)
# Returns: (T, K) array, the trajectory in ℝᴷ

# Summarize per-anchor dynamics
summary = cvx.anchor_summary(projected)
# Returns: dict per anchor with {mean, min, trend, last}

# All existing CVX functions work on the projected trajectory:
vel = cvx.velocity(projected, timestamps)          # velocity in ℝᴷ
cps = cvx.detect_changepoints("user", projected)   # regime changes in anchor space
H = cvx.hurst_exponent(projected)                  # persistence of anchor-relative drift
sig = cvx.path_signature(projected, depth=2)       # signature over anchor distances
```
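The per-anchor summary could look roughly like the sketch below. The four fields come from the listing above; everything else is an assumption made for illustration: the name `anchor_summary_sketch`, keying the result by anchor index, and defining trend as the least-squares slope of distance against step index.

```python
import numpy as np

def anchor_summary_sketch(projected):
    """Summarize each column of a (T, K) anchor-distance matrix."""
    projected = np.asarray(projected, dtype=float)
    steps = np.arange(projected.shape[0])
    summary = {}
    for k in range(projected.shape[1]):
        column = projected[:, k]
        # Assumed trend definition: slope of a degree-1 least-squares fit.
        slope = np.polyfit(steps, column, 1)[0] if len(steps) > 1 else 0.0
        summary[k] = {
            "mean": float(column.mean()),
            "min": float(column.min()),
            "trend": float(slope),
            "last": float(column[-1]),
        }
    return summary

# Synthetic example: distance to anchor 0 falls steadily, anchor 1 stays flat.
demo = np.column_stack([np.linspace(1.0, 0.1, 10), np.full(10, 0.5)])
for k, stats in anchor_summary_sketch(demo).items():
    print(k, stats)
```

Under this definition a negative trend means the user is closing in on that anchor, which is the signal a clinician would look for on a depression-centroid axis.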
  • Module: cvx-analytics::anchor
  • Dependencies: Uses existing drift_magnitude_cosine / drift_magnitude_l2 from cvx-analytics::drift
  • Core logic: For each time step $t$ and anchor $k$, compute the chosen distance metric between $\mathbf{x}_t$ and $\mathbf{a}_k$. Assemble into a $(T, K)$ matrix.
  • Complexity: $O(T \cdot K \cdot D)$, linear in each of $T$, $K$, and $D$; no graph lookups needed

On the eRisk 2018 depression detection task, the B2 configuration, which adds anchor projection, raised F1 from 0.600 to 0.781 and AUC from 0.639 to 0.863 over the B1 absolute-space baseline:

| Configuration | Precision | Recall | F1 | AUC |
| --- | --- | --- | --- | --- |
| B1 baseline (absolute space) | 0.667 | 0.545 | 0.600 | 0.639 |
| B2 combined (with anchor projection) | 0.714 | 0.857 | 0.781 | 0.863 |
| Improvement | +0.047 | +0.312 | +0.181 | +0.224 |

Anchor projection makes drift toward or away from known clinical poles directly measurable, providing the reference frame that raw embeddings lack.

| Phase | Scope | Effort |
| --- | --- | --- |
| 1 | project_to_anchors in Rust (cvx-analytics::anchor) | Low |
| 2 | anchor_summary aggregation | Low |
| 3 | Python bindings via PyO3 | Low |
| 4 | Integration tests with eRisk data | Medium |
| 5 | Tutorial B2 with anchor-projected analytics | Medium |