Clinical Dashboard (B3)

This notebook applies CVX’s temporal analytics to real eRisk depression data (1.36M Reddit posts, 2,285 users) using centered DSM-5 anchor projections. Centering is the key technique: it turns raw cosine distances, which are useless for separating the groups, into clinically discriminative symptom profiles.

Dataset: eRisk 2017+2018+2022 (233 depression + 233 control users, 225K posts, MentalRoBERTa D=768)

CVX Features: project_to_anchors, anchor_summary, compute_centroid, regions, region_assignments, velocity, detect_changepoints, drift


1. DSM-5 Symptom Proximity — Population Average

After centering (subtracting the global mean embedding), we project each user’s trajectory onto 9 DSM-5 symptom anchors + 1 healthy baseline. The radar chart shows the mean proximity of depression vs control users across all symptoms.

Depression users show higher mean proximity to every symptom anchor, most notably depressed_mood, worthlessness, and anhedonia. Control users cluster near the healthy baseline.
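The centering and projection step described above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the CVX `project_to_anchors` implementation: the function names, the choice to center the anchors with the same global mean, and the synthetic data are all hypothetical. The synthetic embeddings share a large common offset, which is one common explanation for why raw cosine distances bunch together and carry little signal.

```python
import numpy as np

def centered_anchor_distances(embeddings, anchors, global_mean):
    """Cosine distance from every post to every anchor after mean-centering.

    embeddings: (n_posts, D), anchors: (n_anchors, D), global_mean: (D,)
    Returns an (n_posts, n_anchors) matrix with values in [0, 2].
    """
    emb_c = embeddings - global_mean
    anc_c = anchors - global_mean  # assumption: anchors use the same global mean
    emb_n = emb_c / np.linalg.norm(emb_c, axis=1, keepdims=True)
    anc_n = anc_c / np.linalg.norm(anc_c, axis=1, keepdims=True)
    return 1.0 - emb_n @ anc_n.T

def raw_anchor_distances(embeddings, anchors):
    """Same computation without centering (a zero mean is subtracted)."""
    zero_mean = np.zeros(embeddings.shape[1])
    return centered_anchor_distances(embeddings, anchors, zero_mean)

# Synthetic stand-in for real embeddings: every vector shares a large offset.
rng = np.random.default_rng(0)
shared_offset = 5.0 * np.ones(8)
corpus = shared_offset + rng.normal(size=(200, 8))
anchors = shared_offset + rng.normal(size=(3, 8))
global_mean = corpus.mean(axis=0)

user_posts = corpus[:5]
raw = raw_anchor_distances(user_posts, anchors)
centered = centered_anchor_distances(user_posts, anchors, global_mean)

print("raw distance range:     ", raw.min(), raw.max())
print("centered distance range:", centered.min(), centered.max())
```

Averaging each column of the returned matrix over a user's posts gives one value per anchor, which is the kind of per-user profile a radar chart can display.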


2. Symptom Drift Direction — Who Is Approaching Symptoms?

Beyond static proximity, we measure the trend of each user’s symptom distances: the slope of a linear fit over normalized time. A negative slope means the distance is shrinking, i.e. the user is approaching the symptom over time.

This reveals which symptoms show active deterioration vs static elevation — critical for early intervention.
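A minimal sketch of the trend computation, assuming time is normalized to [0, 1] from post timestamps and the slope comes from an ordinary least-squares line fit. The function name `symptom_trends` and the toy data are hypothetical and only illustrate the idea.

```python
import numpy as np

def symptom_trends(timestamps, distances):
    """Least-squares slope of each symptom distance over normalized time.

    timestamps: (n_posts,) increasing values in any unit
    distances:  (n_posts, n_symptoms)
    Returns (n_symptoms,) slopes; negative means the user approaches the symptom.
    """
    t = np.asarray(timestamps, dtype=float)
    t_norm = (t - t.min()) / (t.max() - t.min())  # map to [0, 1]
    # With a 2-D y, np.polyfit fits every column; row 0 holds the slopes.
    return np.polyfit(t_norm, distances, deg=1)[0]

timestamps = np.array([0.0, 10.0, 20.0, 40.0, 100.0])
t_norm = timestamps / 100.0
distances = np.column_stack([
    0.9 - 0.3 * t_norm,          # steadily approaching this symptom
    np.full_like(t_norm, 0.5),   # static distance, no drift
])

slopes = symptom_trends(timestamps, distances)
print(slopes)
```

A user with a low but flat distance and a user with a high but falling distance would look similar on a static profile at some point in time; the slope is what separates them.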


Small multiples showing how proximity to each DSM-5 symptom evolves from beginning (0%) to end (100%) of each user’s post history. Red = depression, blue = control.

The separation between groups is visible across all symptoms, with the most dramatic divergence in depressed_mood and worthlessness.
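Plotting users on a shared 0% to 100% axis requires putting post histories of different lengths onto one grid. The sketch below does this with linear interpolation, assuming progress is measured by post index rather than by calendar time; the helper name `resample_to_progress` is hypothetical.

```python
import numpy as np

def resample_to_progress(values, n_points=11):
    """Linearly interpolate a per-post series onto an evenly spaced 0..1 grid."""
    values = np.asarray(values, dtype=float)
    src = np.linspace(0.0, 1.0, len(values))  # position of each post in the history
    dst = np.linspace(0.0, 1.0, n_points)     # shared grid for all users
    return np.interp(dst, src, values)

short_user = [0.2, 0.4, 0.6]            # 3 posts
long_user = np.linspace(0.5, 0.1, 50)   # 50 posts

grid_short = resample_to_progress(short_user, n_points=5)
grid_long = resample_to_progress(long_user, n_points=5)
print(grid_short)
print(grid_long)
```

Once every user sits on the same grid, group curves are simple means over the user axis.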


The HNSW hierarchy provides unsupervised semantic clustering. Each bubble is a region hub — hover to see the depression ratio, member count, and clinical keywords of posts assigned to that region.

Regions naturally specialize: some have depression ratios >80% (posts about hopelessness, isolation), others <20% (social activities, hobbies). No labels were used during construction.
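The per-region depression ratio can be computed after the fact from region assignments and group labels, without the labels ever influencing the clustering. The sketch assumes each post carries an integer region id and a boolean flag inherited from its author's group; both input arrays and the function name are hypothetical.

```python
import numpy as np

def region_depression_ratio(region_ids, is_depression):
    """Fraction of posts in each region that come from depression users.

    region_ids:    (n_posts,) non-negative integer region id per post
    is_depression: (n_posts,) boolean flag per post
    Returns (ratio, count); ratio is NaN for regions with no posts.
    """
    region_ids = np.asarray(region_ids)
    flags = np.asarray(is_depression, dtype=float)
    n_regions = int(region_ids.max()) + 1
    count = np.bincount(region_ids, minlength=n_regions)
    dep_count = np.bincount(region_ids, weights=flags, minlength=n_regions)
    ratio = np.full(n_regions, np.nan)
    nonempty = count > 0
    ratio[nonempty] = dep_count[nonempty] / count[nonempty]
    return ratio, count

region_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
is_depression = np.array([1, 1, 1, 0, 0, 1, 0, 0, 0, 0], dtype=bool)

ratio, count = region_depression_ratio(region_ids, is_depression)
print(ratio, count)
```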


A 4-panel aligned view for a single depression user showing how symptoms, velocity, change points, and text content evolve together:

Clinical Timeline — Control User (comparison)

The control user shows stable symptom distances and lower velocity throughout — no approaching behavior, no change points.
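The notebook does not spell out how velocity is defined. One plausible reading, used in this sketch as an explicit assumption, is the size of the embedding displacement between consecutive posts divided by the elapsed time. The function name `velocity_magnitude` is hypothetical and this is not the CVX `velocity` API.

```python
import numpy as np

def velocity_magnitude(trajectory, timestamps):
    """Speed between consecutive posts: ||x[i+1] - x[i]|| / (t[i+1] - t[i])."""
    trajectory = np.asarray(trajectory, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    step = np.diff(trajectory, axis=0)   # displacement between consecutive posts
    dt = np.diff(timestamps)             # elapsed time, assumed strictly positive
    return np.linalg.norm(step, axis=1) / dt

trajectory = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]
timestamps = [0.0, 1.0, 3.0]

speed = velocity_magnitude(trajectory, timestamps)
print(speed)
```

Under this reading, a stable user like the control above would produce a flat, low speed series.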


Polarization measures whether a user’s semantic space is contracting (becoming obsessively focused) or expanding (maintaining diverse topics).

The dispersion ratio is computed as std(second_half) / std(first_half) of the trajectory vectors. A ratio below 1.0 means the user’s semantic world is shrinking; a ratio above 1.0 means it is expanding.
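A small sketch of the dispersion ratio follows. The source does not say how the standard deviation of a set of vectors is reduced to one number, so this version averages the per-dimension standard deviations; that reduction, the split at the midpoint of the post sequence, and the function name are assumptions.

```python
import numpy as np

def dispersion_ratio(trajectory):
    """std(second half) / std(first half), averaging std over embedding dimensions."""
    trajectory = np.asarray(trajectory, dtype=float)
    mid = len(trajectory) // 2
    first, second = trajectory[:mid], trajectory[mid:]
    spread_first = first.std(axis=0).mean()
    spread_second = second.std(axis=0).mean()
    return spread_second / spread_first

rng = np.random.default_rng(1)
base = rng.normal(size=(20, 4))

# Second half is the first half shrunk by 0.5, so the ratio is 0.5.
contracting = np.vstack([base, 0.5 * base])
# Second half is the first half stretched by 2, so the ratio is 2.
expanding = np.vstack([base, 2.0 * base])

ratio_contracting = dispersion_ratio(contracting)
ratio_expanding = dispersion_ratio(expanding)
print(ratio_contracting, ratio_expanding)
```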


Heatmap showing depression_mean - control_mean for each symptom at each time point. Red = depression users are closer to the symptom than controls.
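The matrix behind such a heatmap can be sketched as a difference of group means, assuming every user's proximities have already been resampled onto a shared time grid. The array layout, the function names, and the ranking rule (mean absolute difference over time) are hypothetical; the source does not state how "most discriminative" is scored.

```python
import numpy as np

def group_difference(dep, ctl):
    """depression_mean - control_mean per symptom and time point.

    dep, ctl: (n_users, n_time, n_symptoms) proximity arrays on a shared grid.
    Returns an (n_symptoms, n_time) matrix, one heatmap row per symptom.
    """
    return (dep.mean(axis=0) - ctl.mean(axis=0)).T

def top_k_symptoms(diff, names, k=4):
    """Rank symptoms by mean absolute group difference (assumed criterion)."""
    score = np.abs(diff).mean(axis=1)
    order = np.argsort(score)[::-1][:k]
    return [names[i] for i in order]

names = ["depressed_mood", "worthlessness", "anhedonia"]
# Toy data: 2 users per group, 3 time points, 3 symptoms, constant over time.
dep = np.ones((2, 3, 3)) * np.array([0.7, 0.5, 0.35])
ctl = np.ones((2, 3, 3)) * 0.3

diff = group_difference(dep, ctl)
top = top_k_symptoms(diff, names, k=2)
print(diff)
print(top)
```

If the stored quantity is a distance rather than a proximity, the sign flips: closer depression users would then appear as negative values.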

The top 4 most discriminative symptoms over time:


Combined view for a single user: symptom radar + drift + polarization + timeline.