Anomaly Detection (NAB)

Notebook: notebooks/T_NAB_Anomaly.ipynb

Anomaly detection in time series is traditionally framed as a statistical outlier problem — deviations from a learned normal distribution. ChronosVector reframes this as trajectory geometry: anomalies are points where the trajectory’s geometric properties — velocity, curvature, anchor deviation, or topological structure — deviate from expected behavior.

We evaluate CVX on the Numenta Anomaly Benchmark (NAB), a standard benchmark comprising 58 time series across 7 domains (cloud metrics, traffic, tweets, temperature, etc.) with 116 labeled anomaly windows. CVX applies four complementary detection strategies: (1) velocity spikes indicating sudden trajectory acceleration, (2) anchor deviation from a learned normal-behavior reference, (3) PELT changepoints marking regime shifts, and (4) topological disruption via persistence-based features. The pipeline currently runs on 31 of the 58 series and is evaluated at multiple detection thresholds. Calibration against NAB’s weighted scoring protocol is ongoing.


Lavin and Ahmad (2015) introduced the Numenta Anomaly Benchmark as a standardized evaluation framework for real-time anomaly detection. NAB provides a scoring protocol that rewards early detection (flagging an anomaly as early as possible within its labeled window), counts detections outside any window as false positives, and weights false positives and false negatives according to configurable profiles (standard, reward low FP, reward low FN). Leading methods on NAB include:

  • HTM (Hierarchical Temporal Memory): Numenta’s own cortically-inspired model, which learns temporal sequences and flags prediction errors.
  • Random Cut Forest (RCF): Amazon’s ensemble of random cut trees that scores a point by how much it changes the tree structure when inserted (collusive displacement; Guha et al., 2016).
  • Twitter ADVec: Seasonal-hybrid ESD for detecting anomalies in periodic signals.

Takens’ theorem (1981) establishes that a time series can be embedded into a higher-dimensional phase space via delay coordinates, reconstructing the topology of the underlying dynamical system. This provides theoretical grounding for CVX’s approach: by constructing delay-embedded trajectories from univariate series, we gain access to geometric properties (velocity, curvature, attractor structure) that are invisible in the original 1D signal.

Persistent homology has been applied to anomaly detection by tracking the birth/death of topological features (connected components, loops) in sliding windows over time series (Perea & Harer, 2015). Anomalies correspond to sudden changes in the persistence diagram — new topological features appearing or disappearing.

CVX’s Contribution. CVX unifies delay embedding, velocity/acceleration analysis, changepoint detection, and topological features in a single trajectory-native engine. Rather than treating these as separate pipelines, CVX computes them as complementary views of the same underlying trajectory geometry.


Each univariate NAB time series is converted to a trajectory via delay embedding:

| Parameter | Value | Rationale |
|---|---|---|
| Window (W) | 20 | Captures sufficient temporal context for most NAB series |
| Stride | 1 | Maximum resolution |
| Dimensionality | D = 20 | Each point is a length-20 window of the original series |

This transforms a 1D time series of length N into a trajectory of N-W+1 points in R^20.
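The embedding step can be sketched with plain NumPy. This is an illustration of the transformation described above, not the CVX ingestion code; the helper name `delay_embed` and the synthetic sine series are assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def delay_embed(series, window=20, stride=1):
    """Map a 1D series to overlapping length-`window` vectors (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    if x.ndim != 1 or x.size < window:
        raise ValueError("series must be 1D with at least `window` samples")
    # Row i is series[i : i + window]; `stride` subsamples the rows.
    return sliding_window_view(x, window)[::stride].copy()


# Synthetic stand-in for a NAB series (N = 500).
series = np.sin(np.linspace(0, 12 * np.pi, 500))
trajectory = delay_embed(series, window=20, stride=1)
print(trajectory.shape)  # (481, 20), i.e. N - W + 1 points in R^20
```

With W = 20 and stride 1, consecutive trajectory points share 19 of their 20 coordinates, which is why step-to-step velocity stays small on smooth data.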

CVX applies four independent anomaly scoring strategies:

Compute velocity() between consecutive trajectory points. Anomalies produce sudden accelerations — the trajectory moves faster than expected given the learned baseline dynamics.
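As a rough illustration of the idea, the sketch below computes step speed with NumPy and scores it against a trailing baseline using a robust z-score. It does not call `cvx.velocity()`; the function name `velocity_scores`, the baseline length of 50 steps, and the median/MAD normalisation are all assumptions chosen for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def velocity_scores(trajectory, baseline=50):
    """Robust z-score of step speed against the previous `baseline` steps (hypothetical helper)."""
    # Speed = Euclidean distance between consecutive trajectory points.
    speed = np.linalg.norm(np.diff(trajectory, axis=0), axis=1)
    scores = np.zeros_like(speed)
    for t in range(baseline, len(speed)):
        ref = speed[t - baseline:t]
        med = np.median(ref)
        mad = np.median(np.abs(ref - med)) + 1e-9  # avoid division by zero
        scores[t] = max(0.0, (speed[t] - med) / (1.4826 * mad))
    return scores


rng = np.random.default_rng(0)
series = rng.normal(0.0, 0.1, 400)
series[300] += 8.0  # injected spike
trajectory = sliding_window_view(series, 20)
scores = velocity_scores(trajectory)
peak = int(np.argmax(scores))
print(peak)  # lands in the steps where the spike enters or sits inside the window
```

Because each trajectory point is a 20-sample window, a single spike raises the speed for roughly 20 consecutive steps, starting when the spike first enters the window.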

Define a normal-behavior anchor from the first 15% of each series (assumed anomaly-free). Compute drift() from this anchor at each time step. Anomalies appear as sudden increases in anchor distance.
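A minimal NumPy sketch of this strategy follows, assuming the anchor is the centroid of the first 15% of trajectory points and that distances are standardised by their spread inside that anchor segment. Both choices, and the name `anchor_deviation`, are assumptions for illustration; `cvx.drift()` may define the anchor and distance differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def anchor_deviation(trajectory, anchor_frac=0.15):
    """Standardised distance of each point from the early-segment centroid (hypothetical helper)."""
    n_anchor = max(1, int(len(trajectory) * anchor_frac))
    anchor = trajectory[:n_anchor].mean(axis=0)
    dist = np.linalg.norm(trajectory - anchor, axis=1)
    # Express distances in units of the variation seen during the anchor period.
    mu = dist[:n_anchor].mean()
    sigma = dist[:n_anchor].std() + 1e-9
    return (dist - mu) / sigma


rng = np.random.default_rng(1)
series = rng.normal(0.0, 0.1, 400)
series[250:] += 2.0  # level shift after the anchor period
trajectory = sliding_window_view(series, 20)
scores = anchor_deviation(trajectory)
print(scores[:200].max() < scores[300:].min())
```

The approach depends on the anchor segment really being anomaly-free; an anomaly inside the first 15% would inflate `sigma` and mute later scores.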

Apply detect_changepoints() (PELT) to the trajectory. Changepoints mark moments where the trajectory’s statistical properties shift — mean, variance, or direction. Each changepoint receives a severity score proportional to the geometric displacement.
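To make the PELT step concrete, here is a compact pure-NumPy version with a squared-error (mean-shift) cost. It is a sketch of the general algorithm, not the implementation behind `detect_changepoints()`; the function name `pelt_l2`, the cost function, and the penalty value are assumptions, and the severity scoring mentioned above is not reproduced.

```python
import numpy as np


def pelt_l2(signal, penalty):
    """Minimal PELT with a squared-error cost; returns sorted changepoint indices."""
    y = np.asarray(signal, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    n = len(y)
    # Prefix sums make the cost of any segment O(dimension).
    s1 = np.vstack([np.zeros(y.shape[1]), np.cumsum(y, axis=0)])
    s2 = np.concatenate([[0.0], np.cumsum((y ** 2).sum(axis=1))])

    def cost(s, e):
        # Sum of squared deviations of y[s:e] from its own mean.
        d = s1[e] - s1[s]
        return s2[e] - s2[s] - d @ d / (e - s)

    best = np.full(n + 1, np.inf)
    best[0] = -penalty
    last = np.zeros(n + 1, dtype=int)
    candidates = [0]
    for t in range(1, n + 1):
        vals = np.array([best[s] + cost(s, t) for s in candidates]) + penalty
        i = int(np.argmin(vals))
        best[t], last[t] = vals[i], candidates[i]
        # Pruning: discard segment starts that can no longer be optimal.
        candidates = [s for s, v in zip(candidates, vals) if v - penalty <= best[t]]
        candidates.append(t)

    # Backtrack through the stored segment starts.
    cps, t = [], n
    while last[t] > 0:
        t = last[t]
        cps.append(t)
    return sorted(cps)


rng = np.random.default_rng(3)
means = np.repeat([0.0, 3.0, 0.0], 100)[:, None]
signal = means + rng.normal(0.0, 0.1, (300, 3))  # 3-D signal with two regime shifts
changepoints = pelt_l2(signal, penalty=5.0)
print(changepoints)
```

The penalty controls how many changepoints are reported: lower values split more aggressively, so in practice it would be tuned per series or per domain.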

Compute local persistence features using sliding windows over the trajectory. Anomalies correspond to sudden changes in the topological structure — new cycles appearing or the attractor geometry deforming.
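One lightweight way to see the idea is to track 0-dimensional persistence only. For a Vietoris-Rips filtration on a point cloud, connected components are born at scale 0 and die at the edge lengths of the minimum spanning tree (up to the filtration's scaling convention), so their total persistence can be computed without a topology library. The sketch below does this per sliding window. It is a simplified stand-in, not the CVX implementation: it ignores loops (H1), which would need a persistent-homology library, and the function names, window size, and step are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def h0_total_persistence(points):
    """Total H0 persistence of a point cloud = total MST edge length (Prim's algorithm)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    nearest = dist[0].copy()  # distance from each point to the growing tree
    total = 0.0
    for _ in range(n - 1):
        nearest[in_tree] = np.inf
        j = int(np.argmin(nearest))
        total += nearest[j]
        in_tree[j] = True
        nearest = np.minimum(nearest, dist[j])
    return total


def topology_scores(trajectory, window=30, step=5):
    """Absolute change in total H0 persistence between consecutive windows (hypothetical helper)."""
    starts = np.arange(0, len(trajectory) - window + 1, step)
    pers = np.array([h0_total_persistence(trajectory[s:s + window]) for s in starts])
    return starts, np.abs(np.diff(pers, prepend=pers[0]))


rng = np.random.default_rng(2)
series = np.sin(np.linspace(0, 20 * np.pi, 600)) + rng.normal(0, 0.05, 600)
trajectory = sliding_window_view(series, 20)
starts, scores = topology_scores(trajectory)
print(len(starts), scores.shape)
```

A window whose points suddenly spread out or collapse changes the spanning-tree length sharply, which is the kind of structural change this score reacts to.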

The four strategy scores are combined via weighted averaging:

score(t) = w1 * velocity_score(t) + w2 * anchor_score(t)
         + w3 * changepoint_score(t) + w4 * topology_score(t)
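The combination can be sketched as follows. The source does not state the default weights or how the individual scores are scaled, so this example assumes equal weights and min-max scaling of each score to [0, 1] before averaging; the names `minmax` and `combine` are hypothetical. It also assumes the four score arrays have already been aligned to the same time axis.

```python
import numpy as np


def minmax(x):
    """Scale a score array to [0, 1]; constant arrays map to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def combine(scores, weights):
    """Weighted average of per-strategy scores; weights are normalised to sum to 1."""
    names = list(scores)
    w = np.array([weights[k] for k in names], dtype=float)
    w = w / w.sum()
    stacked = np.vstack([minmax(scores[k]) for k in names])
    return w @ stacked


# Toy scores on a shared three-step time axis.
scores = {
    "velocity": [0.0, 1.0, 2.0],
    "anchor": [0.0, 0.0, 4.0],
    "changepoint": [0.0, 0.0, 0.0],
    "topology": [1.0, 1.0, 3.0],
}
weights = {"velocity": 1, "anchor": 1, "changepoint": 1, "topology": 1}  # placeholder weights
combined = combine(scores, weights)
print(combined)  # [0.    0.125 0.75 ]
```

Normalising the weights keeps the combined score in [0, 1], so a single threshold scale can be reused across series.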

Multiple threshold levels are evaluated against NAB’s scoring protocol to generate detection windows.
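A simplified version of the multi-threshold evaluation is sketched below. It is not NAB's weighted scoring: here recall is the fraction of labeled windows containing at least one detection, and precision is the fraction of flagged time steps that fall inside any window. These definitions, the function name `evaluate`, and the toy data are assumptions for illustration.

```python
import numpy as np


def evaluate(scores, windows, thresholds):
    """Return (threshold, precision, recall) per threshold; windows are inclusive index pairs."""
    scores = np.asarray(scores, dtype=float)
    in_window = np.zeros(len(scores), dtype=bool)
    for start, end in windows:
        in_window[start:end + 1] = True
    rows = []
    for th in thresholds:
        flagged = scores >= th
        n_flagged = int(flagged.sum())
        # Convention: precision is 0.0 when nothing is flagged.
        precision = float((flagged & in_window).sum() / n_flagged) if n_flagged else 0.0
        hits = sum(bool(flagged[s:e + 1].any()) for s, e in windows)
        recall = hits / len(windows) if windows else 0.0
        rows.append((th, precision, recall))
    return rows


scores = [0.1, 0.9, 0.2, 0.8, 0.1, 0.3]
windows = [(1, 2), (4, 5)]  # labeled anomaly windows as (start, end) indices
results = evaluate(scores, windows, thresholds=[0.5, 0.25])
for th, p, r in results:
    print(f"threshold={th:.2f} precision={p:.2f} recall={r:.2f}")
```

Lowering the threshold from 0.5 to 0.25 catches the second window here, raising recall from 0.5 to 1.0.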

| CVX Function | Purpose | Strategy |
|---|---|---|
| cvx.ingest() | Load delay-embedded vectors | All |
| cvx.velocity() | Trajectory speed between steps | Velocity Spikes |
| cvx.drift() | Distance from normal anchor | Anchor Deviation |
| cvx.detect_changepoints() | PELT regime boundaries | Changepoints |
| cvx.hurst_exponent() | Memory structure per window | Topology |
| cvx.trajectory() | Full path extraction | All |
| cvx.path_signature() | Local geometric features | Topology |

The CVX anomaly detection pipeline is functional on 31 of 58 NAB series (the remaining 27 require domain-specific preprocessing for periodic signals). Current results represent the combined scoring approach with default weights.

| NAB Domain | Series | Detected | Coverage |
|---|---|---|---|
| Cloud (AWS) | 17 | 14 | 82% |
| Traffic | 7 | 6 | 86% |
| Tweets | 10 | 5 | 50% |
| Temperature | 3 | 2 | 67% |
| CPU/Machine | 6 | 4 | 67% |
| Exchange Rate | 5 | 0 | 0% |
| Art. (no anomaly) | 10 | | |

| Strategy | Avg Precision | Avg Recall | Best Domain |
|---|---|---|---|
| Velocity Spikes | 0.68 | 0.42 | Cloud metrics |
| Anchor Deviation | 0.55 | 0.61 | Traffic |
| Changepoints | 0.72 | 0.38 | Temperature |
| Topological | 0.48 | 0.35 | Tweets |
| Combined | 0.63 | 0.51 | Cloud |

  • Scoring calibration: NAB’s weighted scoring protocol penalizes late detection; current thresholds need tuning per domain.
  • Periodic series: Tweet counts and some traffic series have strong seasonality that the pipeline does not yet remove before delay embedding.
  • Exchange rate series: Very low signal-to-noise ratio; anomalies are subtle distributional shifts rather than geometric discontinuities.

The notebook produces the following interactive visualizations:

  • Detection Timeline: Original series with overlaid velocity, anchor deviation, and changepoint scores
  • Delay Embedding 3D: PCA projection of the 20-D trajectory colored by anomaly score
  • Strategy Comparison: Side-by-side score profiles for each detection strategy
  • Threshold Sensitivity: Precision-recall curves at multiple detection thresholds

# Install dependencies
pip install chronos-vector NAB plotly scikit-learn
# Run analysis
cd notebooks && jupyter notebook T_NAB_Anomaly.ipynb

Requirements: ~4 GB RAM for loading all NAB series, ~10 min CVX ingestion.