Stochastic Processes for Embeddings
Why Stochastic Processes?
ChronosVector already computes first-order temporal analytics: velocity, acceleration, change points, and cohort divergence. These answer what changed and when. But they are deterministic and descriptive: they cannot distinguish real signal from noise, quantify uncertainty, or capture complex phenomena such as volatility clustering or long-range dependence.
The key insight is that an embedding trajectory can be modeled as a stochastic process. In the most general diffusion formulation:
$$dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t$$
where:
- $\mu(X_t, t)$ is the drift function: the systematic, directional component of change. This is the “signal” in the trajectory.
- $\sigma(X_t, t)$ is the diffusion/volatility function: the stochastic fluctuation, the “noise” in the trajectory.
- $W_t$ is a $d$-dimensional Wiener process (standard Brownian motion).
This is not merely an analogy. Embedding trajectories exhibit many of the same statistical properties as financial time series: periods of stability and turbulence, volatility clustering, mean reversion toward equilibria, and increments that are rarely i.i.d. Gaussian. The quantitative finance literature provides decades of battle-tested tools for exactly these phenomena (Bamler & Mandt, 2017; Hamilton, 1989).
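To make the diffusion formulation concrete, the following minimal sketch simulates a trajectory with the Euler-Maruyama scheme. It is an illustration only: the function name `euler_maruyama`, the scalar (isotropic) diffusion, and the example drift and volatility values are all assumptions, not part of CVX.

```python
import math
import random


def euler_maruyama(x0, drift, diffusion, dt, n_steps, rng):
    """Simulate dX = drift(X, t) dt + diffusion(X, t) dW with Euler-Maruyama.

    x0 is a list of floats (one embedding). diffusion returns a scalar, so the
    noise is isotropic in this sketch (an assumption made for brevity).
    """
    path = [list(x0)]
    x = list(x0)
    for step in range(n_steps):
        t = step * dt
        mu = drift(x, t)          # drift vector, same length as x
        sigma = diffusion(x, t)   # scalar volatility
        # Each Wiener increment is Gaussian with variance dt.
        x = [
            xi + mi * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
            for xi, mi in zip(x, mu)
        ]
        path.append(x)
    return path


rng = random.Random(0)
# Hypothetical example: constant drift along the first axis, constant noise.
path = euler_maruyama(
    x0=[0.0, 0.0, 0.0],
    drift=lambda x, t: [0.05, 0.0, 0.0],
    diffusion=lambda x, t: 0.1,
    dt=1.0,
    n_steps=200,
    rng=rng,
)
```

Setting the diffusion to zero recovers a purely deterministic trajectory, which is the case the first-order analytics implicitly assume.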
Drift Significance Test
CVX computes velocity (drift rate) as a first-class analytic. But a critical question remains: is the observed velocity statistically significant, or could it arise from a pure random walk?
An entity with a drift rate of 0.01 per timestep could be undergoing genuine directional change, or simply fluctuating randomly. The distinction has profound implications for interpretation and action.
Method
Under the null hypothesis $H_0$: no drift (pure random walk), the increments have zero mean. The test statistic is:
$$t = \frac{\bar{\delta}}{s / \sqrt{n}}$$
where $\bar{\delta}$ is the mean increment magnitude, $s$ is the sample standard deviation of increments, and $n$ is the number of increments. Under $H_0$, this follows a $t$-distribution with $n - 1$ degrees of freedom.
For multivariate drift, CVX uses the Hotelling $T^2$ test on the vector of mean increments, which reduces to the scalar $t$-test when applied to drift magnitudes.
The result includes both statistical significance ($p$-value) and practical significance (Cohen’s $d$ effect size), because a statistically significant but tiny drift may not be actionable.
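As a rough illustration of the scalar test, the sketch below computes the $t$ statistic and Cohen's $d$ for a list of scalar increments using only the standard library. The function name `drift_significance` and the sample data are hypothetical. The p-value uses a large-sample normal approximation, which is an assumption of this sketch; for small samples an exact $t$-distribution CDF should be used instead.

```python
import math
import statistics


def drift_significance(increments):
    """One-sample t-test of H0: mean increment = 0, plus Cohen's d."""
    n = len(increments)
    mean = statistics.fmean(increments)
    s = statistics.stdev(increments)      # sample std (n - 1 denominator)
    t_stat = mean / (s / math.sqrt(n))
    cohens_d = mean / s                   # effect size: mean in std units
    # Large-sample approximation: the t-distribution is close to N(0, 1)
    # when n is large. For small n, use an exact t CDF instead.
    p_approx = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t_stat)))
    return {"t": t_stat, "dof": n - 1, "p_approx": p_approx, "d": cohens_d}


# Hypothetical scalar increments with a mean of 0.01 per timestep.
result = drift_significance([0.02, -0.01, 0.03, 0.00, 0.01, 0.02, -0.02, 0.03])
```

Note that $t = d\sqrt{n}$: with enough increments even a very small effect size becomes statistically significant, which is why both numbers are reported.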
Realized Volatility
In finance, volatility (the standard deviation of returns) is arguably the most important metric after the return itself. For embedding trajectories, volatility measures the variability of change: not how much the entity changed on average, but how erratic that change was.
Estimators
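CVX provides multiple volatility estimators, each with different properties:
| Estimator | Formula | What it captures |
|---|---|---|
| Scalar realized volatility | $\sigma = \operatorname{std}(\lVert \Delta x_t \rVert)$ | Overall trajectory roughness |
| Per-dimension volatility | $\sigma_j = \operatorname{std}(\Delta x_{t,j})$ for each dimension $j$ | Which dimensions are volatile |
| Annualized/normalized | $\sigma / \sqrt{\Delta t}$ | Comparable across sampling frequencies |
Here $\Delta x_t$ is the increment between consecutive embeddings and $\Delta t$ is the sampling interval.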
The volatility of volatility (vol-of-vol) measures meta-stability: high vol-of-vol indicates that the volatility itself is unstable, suggesting regime transitions or structural breaks in the trajectory’s dynamics.
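The estimators and the vol-of-vol idea can be sketched in a few lines. This is a minimal illustration under the assumption that volatility is the sample standard deviation of increments and that vol-of-vol is computed over rolling windows; the function names and the window scheme are hypothetical, not CVX's API.

```python
import math
import statistics


def increments(trajectory):
    """Differences between consecutive embeddings (lists of floats)."""
    return [
        [b - a for a, b in zip(prev, curr)]
        for prev, curr in zip(trajectory, trajectory[1:])
    ]


def realized_volatility(trajectory):
    """Sample standard deviation of the increment magnitudes."""
    norms = [math.sqrt(sum(v * v for v in inc)) for inc in increments(trajectory)]
    return statistics.stdev(norms)


def per_dimension_volatility(trajectory):
    """Sample standard deviation of the increments in each dimension."""
    incs = increments(trajectory)
    return [statistics.stdev(column) for column in zip(*incs)]


def vol_of_vol(trajectory, window):
    """Standard deviation of realized volatility over rolling windows.

    Each window covers `window` increments, i.e. `window + 1` points.
    """
    vols = [
        realized_volatility(trajectory[i : i + window + 1])
        for i in range(len(trajectory) - window)
    ]
    return statistics.stdev(vols)


# Hypothetical 2-dimensional trajectory with increment norms 1, 2, 3, 4.
trajectory = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [4.0, 2.0], [4.0, 6.0]]
scalar_vol = realized_volatility(trajectory)
dim_vols = per_dimension_volatility(trajectory)
```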
GARCH Volatility Model
Volatility is not constant over time. A well-documented phenomenon in finance, also observable in embedding trajectories, is volatility clustering: periods of high volatility tend to follow periods of high volatility, and calm periods tend to follow calm periods (Engle, 1982; Bollerslev, 1986).
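The GARCH(1,1) model (Generalized Autoregressive Conditional Heteroskedasticity) captures this clustering:
$$\sigma_t^2 = \omega + \alpha\, r_{t-1}^2 + \beta\, \sigma_{t-1}^2$$
where $r_t$ is the scalar increment at time $t$, $\sigma_t^2$ is its conditional variance, and:
- $\omega$ is the long-run variance weight (intercept)
- $\alpha$ measures the reaction to recent shocks (the innovation coefficient)
- $\beta$ measures the persistence of past volatility (the lag coefficient)
- $z_t = r_t / \sigma_t$ are the standardized residuals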
Interpreting GARCH Parameters
The persistence $\alpha + \beta$ is the most informative parameter:
| Persistence $\alpha + \beta$ | Interpretation |
|---|---|
| $\approx 1$ | Integrated GARCH: volatility shocks are nearly permanent |
| High (close to, but below, 1) | High persistence: shocks decay slowly |
| Intermediate | Moderate: shocks decay at medium speed |
| Low (well below 1) | Low persistence: volatility reverts quickly to long-run level |
The half-life of a volatility shock tells you how long a perturbation lasts:
$$t_{1/2} = \frac{\ln 0.5}{\ln(\alpha + \beta)}$$
The long-run (unconditional) volatility is $\sqrt{\omega / (1 - \alpha - \beta)}$, which gives the equilibrium volatility level the process reverts to.
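The recursion and the two derived quantities are easy to compute by hand. The sketch below is an illustration with hypothetical function names and example parameter values; initialising the recursion at the unconditional variance is a common convention assumed here, not something the text specifies.

```python
import math


def garch_variances(returns, omega, alpha, beta):
    """Conditional variances sigma_t^2 from the GARCH(1,1) recursion."""
    # Start from the unconditional variance (requires alpha + beta < 1).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r_prev in returns[:-1]:
        sigma2.append(omega + alpha * r_prev ** 2 + beta * sigma2[-1])
    return sigma2


def shock_half_life(alpha, beta):
    """Number of steps for a volatility shock to decay by half."""
    return math.log(0.5) / math.log(alpha + beta)


def long_run_volatility(omega, alpha, beta):
    """Unconditional volatility the process reverts to."""
    return math.sqrt(omega / (1.0 - alpha - beta))


# Hypothetical parameters with persistence alpha + beta = 0.95.
half_life = shock_half_life(alpha=0.10, beta=0.85)
long_run = long_run_volatility(omega=0.0001, alpha=0.10, beta=0.85)
```

With these example values a shock loses half its effect on the conditional variance after roughly 13.5 steps.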
Estimation
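The model is estimated by maximum likelihood (MLE). Scalar increments are modeled as $r_t = \sigma_t z_t$ with $z_t \sim \mathcal{N}(0, 1)$. The conditional Gaussian log-likelihood:
$$\ell(\omega, \alpha, \beta) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \ln(2\pi) + \ln \sigma_t^2 + \frac{r_t^2}{\sigma_t^2} \right]$$
is optimized with L-BFGS-B subject to constraints $\omega > 0$, $\alpha \ge 0$, $\beta \ge 0$, $\alpha + \beta < 1$.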
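The objective can be written down directly. The sketch below evaluates the negative log-likelihood and, purely for illustration, replaces L-BFGS-B with a coarse grid search combined with variance targeting (choosing $\omega$ so the long-run variance equals the sample variance). Both substitutions are assumptions made to keep the example dependency-free; the function names and the simulated data are hypothetical.

```python
import math
import random


def garch_neg_log_likelihood(returns, omega, alpha, beta):
    """Negative conditional Gaussian log-likelihood of GARCH(1,1)."""
    sigma2 = omega / (1.0 - alpha - beta)   # initialise at long-run variance
    nll = 0.0
    for r in returns:
        nll += 0.5 * (math.log(2.0 * math.pi) + math.log(sigma2) + r * r / sigma2)
        sigma2 = omega + alpha * r * r + beta * sigma2
    return nll


def fit_garch_grid(returns, steps=10):
    """Coarse grid search over the feasible region (illustration only)."""
    sample_var = sum(r * r for r in returns) / len(returns)
    best = None
    for i in range(steps):
        alpha = i / steps * 0.5               # alpha in [0, 0.5)
        for j in range(steps):
            beta = j / steps                  # beta in [0, 1)
            if alpha + beta >= 0.999:         # enforce alpha + beta < 1
                continue
            # Variance targeting: pick omega so the long-run variance
            # matches the sample variance, which keeps omega > 0.
            omega = sample_var * (1.0 - alpha - beta)
            nll = garch_neg_log_likelihood(returns, omega, alpha, beta)
            if best is None or nll < best[0]:
                best = (nll, omega, alpha, beta)
    return best


rng = random.Random(1)
returns = [rng.gauss(0.0, 0.01) for _ in range(300)]   # hypothetical increments
nll, omega, alpha, beta = fit_garch_grid(returns)
```

A gradient-based optimizer would start from a point like this and refine it; the grid only shows how the constraints shape the search region.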
Mean Reversion and the Ornstein-Uhlenbeck Process
A fundamental question about any embedding trajectory: does it revert to an equilibrium, wander freely (random walk), or trend persistently?
| Classification | Meaning | Implication |
|---|---|---|
| Mean-reverting | Current position will revert to an equilibrium | Deviations are temporary — the entity “wants” to return |
| Random walk | No equilibrium — position is unpredictable | Past trajectory does not inform future position |
| Trending | Persistent directional movement | Momentum — current direction likely to continue |
Stationarity Tests
Two complementary tests provide robust classification:
The Augmented Dickey-Fuller (ADF) test checks whether the series has a unit root:
- $H_0$: unit root (random walk)
- Rejection implies mean-reverting (stationary)
- Test regression: $\Delta y_t = c + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t$, where the unit-root null corresponds to $\gamma = 0$
The KPSS test (Kwiatkowski-Phillips-Schmidt-Shin) tests the opposite null:
- $H_0$: stationary (mean-reverting)
- Rejection implies unit root or trend
Using both tests together yields a 2x2 classification matrix:
| ADF rejects? | KPSS rejects? | Classification |
|---|---|---|
| Yes | No | Mean-Reverting (stationary) |
| No | Yes | Random Walk (unit root) |
| Yes | Yes | Trending (trend-stationary) |
| No | No | Inconclusive |
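The matrix above translates directly into a small lookup. In this sketch the function name, the label strings, and the 0.05 significance level are hypothetical choices; the p-values are assumed to come from whichever ADF and KPSS implementations are in use.

```python
def classify_stationarity(adf_p_value, kpss_p_value, level=0.05):
    """Map ADF and KPSS outcomes to the labels in the 2x2 matrix."""
    adf_rejects = adf_p_value < level     # evidence against a unit root
    kpss_rejects = kpss_p_value < level   # evidence against stationarity
    if adf_rejects and not kpss_rejects:
        return "MeanReverting"
    if not adf_rejects and kpss_rejects:
        return "RandomWalk"
    if adf_rejects and kpss_rejects:
        return "Trending"
    return "Inconclusive"


label = classify_stationarity(adf_p_value=0.01, kpss_p_value=0.40)
```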
Ornstein-Uhlenbeck Parameters
When mean reversion is detected, CVX estimates the parameters of the Ornstein-Uhlenbeck (OU) process:
$$dX_t = \theta(\mu - X_t)\,dt + \sigma\,dW_t$$
- $\theta$ = speed of mean reversion (higher means faster reversion)
- $\mu$ = equilibrium position (the attractor)
- $\sigma$ = diffusion coefficient (residual volatility after accounting for reversion)
- Half-life $t_{1/2} = \ln 2 / \theta$: how long it takes to revert halfway to equilibrium
The half-life is particularly actionable: a concept with a half-life of 5 time units reverts quickly to its semantic center, while one with a half-life of 500 is effectively a random walk over practical horizons.
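One common way to estimate OU parameters from evenly sampled scalar data is to regress each observation on the previous one, since the exact discretisation of the OU process is an AR(1) model. The sketch below follows that approach; it is an assumption about method, not a description of CVX's estimator, and `fit_ou` is a hypothetical name.

```python
import math


def fit_ou(series, dt=1.0):
    """Estimate OU parameters from an AR(1) regression x[t+1] = a + b * x[t]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                         # AR(1) coefficient, exp(-theta * dt)
    a = mean_y - b * mean_x
    if not 0.0 < b < 1.0:
        raise ValueError("series does not look mean-reverting")
    theta = -math.log(b) / dt             # speed of mean reversion
    mu = a / (1.0 - b)                    # equilibrium level
    residual_var = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    # Invert Var(eps) = sigma^2 * (1 - b^2) / (2 * theta) for sigma.
    sigma = math.sqrt(residual_var * 2.0 * theta / (1.0 - b * b))
    return {"theta": theta, "mu": mu, "sigma": sigma, "half_life": math.log(2.0) / theta}


# Noise-free example: decays toward 2.0 with AR(1) coefficient 0.9.
series = [2.0 + 8.0 * 0.9 ** k for k in range(50)]
params = fit_ou(series)
```

For this noise-free series the fit recovers $\theta = -\ln 0.9 \approx 0.105$ and a half-life of about 6.6 steps.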
Hurst Exponent
The Hurst exponent $H$ measures the roughness or memory of a trajectory. It is a fundamental quantity that distinguishes three regimes:
| Hurst Value | Classification | Meaning |
|---|---|---|
| $H = 0.5$ | Random (Brownian) | Pure random walk: no memory, increments are i.i.d. |
| $H > 0.5$ | Persistent (trending) | Momentum: past direction predicts future direction |
| $H < 0.5$ | Anti-persistent (rough) | Mean-reverting at small scales: past direction predicts reversal |
A notable finding in finance: realized volatility has $H \approx 0.1$ (very rough), which motivated the rough volatility theory (Gatheral et al., 2018). Embedding trajectories may exhibit similar roughness, with implications for prediction and modeling.
Estimation: Detrended Fluctuation Analysis (DFA)
- Compute the cumulative deviation from the mean: $Y(k) = \sum_{i=1}^{k} (x_i - \bar{x})$
- Divide $Y$ into windows of size $n$
- In each window, fit a polynomial trend and compute the residual variance $F^2(n)$
- The Hurst exponent satisfies the scaling law: $F(n) \propto n^{H}$
- Estimate $H$ from the slope of $\log F(n)$ vs $\log n$
CVX also supports R/S (rescaled range) analysis as a simpler alternative, though DFA is more robust to trends.
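The DFA steps above can be sketched as follows. This is a minimal first-order variant (linear detrending, non-overlapping windows) applied to scalar increments; the function name `dfa_hurst`, the default window sizes, and the choice of a linear rather than higher-order polynomial are assumptions of the example.

```python
import math
import random


def dfa_hurst(increments, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate the Hurst exponent with first-order (linear) DFA."""
    mean = sum(increments) / len(increments)
    profile, total = [], 0.0
    for x in increments:                  # step 1: cumulative deviation
        total += x - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        t = list(range(n))
        t_mean = (n - 1) / 2.0
        t_var = sum((ti - t_mean) ** 2 for ti in t)
        squared_residuals = 0.0
        n_windows = len(profile) // n
        for w in range(n_windows):        # step 2: non-overlapping windows
            seg = profile[w * n : (w + 1) * n]
            seg_mean = sum(seg) / n
            slope = sum((ti - t_mean) * (yi - seg_mean) for ti, yi in zip(t, seg)) / t_var
            # step 3: residual variance around the fitted line
            squared_residuals += sum(
                (yi - seg_mean - slope * (ti - t_mean)) ** 2 for ti, yi in zip(t, seg)
            )
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    # steps 4-5: slope of log F(n) against log n
    mx, my = sum(log_n) / len(log_n), sum(log_f) / len(log_f)
    return sum((a - mx) * (b - my) for a, b in zip(log_n, log_f)) / sum(
        (a - mx) ** 2 for a in log_n
    )


rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]   # memoryless increments
h_noise = dfa_hurst(noise)
```

For memoryless Gaussian increments the estimate should land near 0.5, with some finite-sample bias at the smallest window sizes.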
Unified Stationarity Classification
CVX combines all the above analyses into a unified process classification that categorizes each entity’s trajectory:
| Classification | Conditions | Description |
|---|---|---|
| StableEquilibrium | Mean-reverting, low volatility, $H < 0.5$ | Fluctuates around a stable attractor |
| RandomWalk | No significant drift, no reversion, $H \approx 0.5$ | Unpredictable: past does not inform future |
| TrendingWithMomentum | Significant drift, $H > 0.5$ | Persistent directional movement |
| VolatileCycling | Mean-reverting but high GARCH persistence $\alpha + \beta$ | Cycles with episodic volatility bursts |
| RegimeTransition | Mixed signals across tests | Stochastic character changed during the window |
The classification follows a decision tree that combines drift significance ($p$-value), ADF/KPSS results, Hurst exponent, and GARCH persistence into a single actionable label with a human-readable summary.
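A decision tree of this kind might look like the sketch below. Everything in it is hypothetical: the `ProcessStats` container, the order of the checks, and all thresholds (`p_level`, `hurst_band`, `high_persistence`) are illustrative assumptions, since the exact cutoffs CVX uses are not stated here.

```python
from dataclasses import dataclass


@dataclass
class ProcessStats:
    drift_p_value: float        # from the drift significance test
    adf_rejects: bool           # ADF rejects the unit-root null
    kpss_rejects: bool          # KPSS rejects the stationarity null
    hurst: float                # Hurst exponent estimate
    garch_persistence: float    # alpha + beta
    low_volatility: bool        # realized volatility below some baseline


def classify_process(s, p_level=0.05, hurst_band=0.05, high_persistence=0.9):
    """Hypothetical decision tree; all thresholds are illustrative."""
    mean_reverting = s.adf_rejects and not s.kpss_rejects
    significant_drift = s.drift_p_value < p_level
    if mean_reverting and s.garch_persistence >= high_persistence:
        return "VolatileCycling"
    if mean_reverting and s.low_volatility and s.hurst < 0.5:
        return "StableEquilibrium"
    if significant_drift and s.hurst > 0.5 + hurst_band:
        return "TrendingWithMomentum"
    if not significant_drift and not mean_reverting and abs(s.hurst - 0.5) <= hurst_band:
        return "RandomWalk"
    # Anything that matches none of the patterns is treated as mixed signals.
    return "RegimeTransition"


label = classify_process(ProcessStats(0.50, True, False, 0.40, 0.30, True))
```

Falling through to `RegimeTransition` keeps the tree total: every combination of inputs receives exactly one label.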
Why Financial Tools Apply to Embeddings
The application of quantitative finance tools to embedding trajectories is not a forced analogy. Both domains share structural properties:
- Non-stationarity. Both financial returns and embedding increments exhibit time-varying statistical properties.
- Volatility clustering. Periods of rapid semantic change (e.g., during major events) cluster together, just as market volatility clusters around crises.
- Mean reversion. Many concepts revert to semantic equilibria after perturbations, analogous to mean-reverting assets.
- Heavy tails. Embedding increments, like financial returns, exhibit heavier tails than the Gaussian distribution.
- Long-range dependence. Hurst exponents different from 0.5 indicate that embedding trajectories have memory, just as financial series do.
The key difference is interpretive: in finance, these tools inform trading decisions; in CVX, they inform understanding of semantic evolution, drift detection, and predictive modeling.
References
- Bamler, R. & Mandt, S. (2017). Dynamic Word Embeddings. ICML 2017.
- Rosenfeld, A. & Erk, K. (2018). Deep Neural Models of Semantic Shift. NAACL 2018.
- Hamilton, J.D. (1989). A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle. Econometrica.
- Engle, R.F. (1982). Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation. Econometrica.
- Bollerslev, T. (1986). Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics.
- Gatheral, J., Jaisson, T., & Rosenbaum, M. (2018). Volatility is Rough. Quantitative Finance.
- Dickey, D.A. & Fuller, W.A. (1979). Distribution of the Estimators for Autoregressive Time Series with a Unit Root. JASA.
- Kwiatkowski, D., Phillips, P.C.B., Schmidt, P., & Shin, Y. (1992). Testing the Null Hypothesis of Stationarity. Journal of Econometrics.