MAP-Elites Archive

MAP-Elites: The Dimension of Change

MAP-Elites maintains an archive of diverse, high-performing solutions indexed by behavioral descriptors. Traditional implementations use a pre-discretized grid and linear scans for novelty scoring:

Operation	Grid Archive	CVX Archive
Niche lookup	O(1) fixed grid	O(log N) HNSW
Novelty (kNN)	O(N) brute force	O(log N) HNSW
Niche count	Pre-specified K	Emergent (graph levels)
Descriptor dim	Low (2-3)	High (20+)

ChronosVector replaces the archive’s flat data structure with a temporal HNSW index:

HNSW regions at different graph levels form natural niches (no cell count needed)
kNN search in O(log N) enables novelty scoring at scale
Temporal tracking lets us observe how solutions evolve across generations

The key insight: an entity’s trajectory through descriptor space across generations reveals the shape of evolutionary exploration. Path signatures, Hurst exponents, and distributional drift quantify this shape rigorously.

This notebook uses fully synthetic data — no external dependencies beyond chronos_vector, numpy, and plotly.

1. Setup and Synthetic Data Generation

We define a simple fitness landscape with 5 ground-truth niche centers in a 20-dimensional descriptor space. Fitness is the (negated) distance to the nearest niche center — closer to any center means higher fitness.

import chronos_vector as cvx
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import time

np.random.seed(42)

# Parameters
D = 20              # descriptor dimensionality (tractable for path signatures)
N_GENERATIONS = 200 # evolutionary generations
POP_SIZE = 50       # offspring per generation
N_NICHES = 5        # ground-truth clusters in fitness landscape

# Color scheme
C_HIGHLIGHT = '#e74c3c'
C_SECONDARY = '#3498db'
C_ACCENT    = '#2ecc71'
C_WARN      = '#f39c12'
C_MUTED     = '#95a5a6'

# Generate ground-truth niche centers
niche_centers = np.random.randn(N_NICHES, D) * 3.0

def fitness(descriptor):
    """Fitness = negative distance to nearest niche center (higher is better)."""
    dists = np.linalg.norm(niche_centers - descriptor, axis=1)
    return -dists.min()

print(f'Descriptor space: D={D}')
print(f'Niche centers: {N_NICHES} clusters in R^{D}')
print(f'Evolution: {N_GENERATIONS} generations x {POP_SIZE} offspring = {N_GENERATIONS * POP_SIZE:,} total solutions')
print(f'Niche center norms: {[f"{np.linalg.norm(c):.2f}" for c in niche_centers]}')

Descriptor space: D=20
Niche centers: 5 clusters in R^20
Evolution: 200 generations x 50 offspring = 10,000 total solutions
Niche center norms: ['12.76', '13.15', '10.74', '14.55', '9.04']

2. Build the CVX Archive

Each solution gets a unique entity_id, its generation as timestamp, and its descriptor as the vector. Early generations explore randomly; later generations perturb existing solutions (mutation).

We use bulk_insert per generation for efficiency.

index = cvx.TemporalIndex(m=16, ef_construction=100, ef_search=50)

solution_counter = 0
archive_fitness = {}       # solution_id -> fitness
solution_generation = {}   # solution_id -> generation
generation_best = []       # best fitness per generation
generation_mean = []       # mean fitness per generation

t0 = time.perf_counter()

for gen in range(N_GENERATIONS):
    descriptors = []
    ids = []
    gen_fits = []

    for _ in range(POP_SIZE):
        if gen < 5 or np.random.rand() < 0.1:
            # Random exploration
            desc = np.random.randn(D) * 2.0
        else:
            # Mutate an existing solution
            parent_id = np.random.randint(0, solution_counter)
            parent_traj = index.trajectory(entity_id=parent_id)
            if parent_traj:
                parent_vec = np.array(parent_traj[-1][1])
                desc = parent_vec + np.random.randn(D) * 0.3
            else:
                desc = np.random.randn(D) * 2.0

        fit = fitness(desc)
        descriptors.append(desc.astype(np.float32))
        ids.append(solution_counter)
        archive_fitness[solution_counter] = fit
        solution_generation[solution_counter] = gen
        gen_fits.append(fit)
        solution_counter += 1

    # Bulk insert this generation
    entity_ids = np.array(ids, dtype=np.uint64)
    timestamps = np.full(len(ids), gen, dtype=np.int64)
    vectors = np.array(descriptors, dtype=np.float32)
    index.bulk_insert(entity_ids, timestamps, vectors)

    generation_best.append(max(gen_fits))
    generation_mean.append(np.mean(gen_fits))

elapsed = time.perf_counter() - t0
print(f'Archive: {len(index):,} solutions over {N_GENERATIONS} generations')
print(f'Ingestion time: {elapsed:.2f}s ({len(index)/elapsed:,.0f} solutions/sec)')
print(f'Best fitness: {max(archive_fitness.values()):.4f}')
print(f'Mean fitness (final gen): {generation_mean[-1]:.4f}')

Archive: 10,000 solutions over 200 generations
Ingestion time: 0.86s (11,592 solutions/sec)
Best fitness: -7.1097
Mean fitness (final gen): -12.3309

# Fitness progress over generations
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=list(range(N_GENERATIONS)), y=generation_best,
    mode='lines', name='Best fitness',
    line=dict(color=C_HIGHLIGHT, width=2)
))
fig.add_trace(go.Scatter(
    x=list(range(N_GENERATIONS)), y=generation_mean,
    mode='lines', name='Mean fitness',
    line=dict(color=C_SECONDARY, width=1.5, dash='dot')
))
fig.update_layout(
    template='plotly_dark',
    title='Fitness Progress Across Generations',
    xaxis_title='Generation',
    yaxis_title='Fitness (higher = better)',
    height=400,
    legend=dict(x=0.7, y=0.02)
)
fig.show()

3. Niche Discovery via HNSW Regions

The HNSW graph has a natural hierarchy: higher levels contain fewer hub nodes, each representing a region of descriptor space. These regions are emergent niches — no need to pre-specify grid resolution or cluster count.

# Discover regions at multiple HNSW levels
for level in [1, 2, 3]:
    regions = index.regions(level=level)
    print(f'Level {level}: {len(regions)} niches')

# Use level 2 as our working niche granularity
regions_l2 = index.regions(level=2)
print(f'\nWorking with Level 2: {len(regions_l2)} niches')

# Extract centroids and sizes
region_ids = [rid for rid, _, _ in regions_l2]
centroids  = [c   for _, c, _ in regions_l2]
n_members  = [n   for _, _, n in regions_l2]

print(f'Niche sizes: min={min(n_members)}, max={max(n_members)}, '
      f'median={sorted(n_members)[len(n_members)//2]}')

Level 1: 644 niches
Level 2: 42 niches
Level 3: 3 niches

Working with Level 2: 42 niches
Niche sizes: min=10, max=888, median=166

# PCA projection of region centroids for visualization
centroid_matrix = np.array(centroids)
centroid_mean = centroid_matrix.mean(axis=0)
centroid_centered = centroid_matrix - centroid_mean

# SVD-based PCA (no sklearn needed)
U, S, Vt = np.linalg.svd(centroid_centered, full_matrices=False)
pca_2d = centroid_centered @ Vt[:2].T
variance_explained = (S[:2]**2) / (S**2).sum()

# Also project ground-truth niche centers
niche_pca = (niche_centers - centroid_mean) @ Vt[:2].T

fig = go.Figure()

# HNSW regions (sized by member count)
fig.add_trace(go.Scatter(
    x=pca_2d[:, 0], y=pca_2d[:, 1],
    mode='markers',
    marker=dict(
        size=np.sqrt(np.array(n_members)) * 2,
        color=np.array(n_members),
        colorscale='Viridis',
        colorbar=dict(title='Members'),
        opacity=0.7,
        line=dict(width=0.5, color='white')
    ),
    name='HNSW regions',
    text=[f'Region {rid}: {n} members' for rid, n in zip(region_ids, n_members)],
    hovertemplate='%{text}<br>PC1=%{x:.2f}<br>PC2=%{y:.2f}<extra></extra>'
))

# Ground-truth niche centers
fig.add_trace(go.Scatter(
    x=niche_pca[:, 0], y=niche_pca[:, 1],
    mode='markers+text',
    marker=dict(size=18, color=C_HIGHLIGHT, symbol='star', line=dict(width=2, color='white')),
    text=[f'Niche {i}' for i in range(N_NICHES)],
    textposition='top center',
    textfont=dict(color=C_HIGHLIGHT, size=11),
    name='Ground-truth niches'
))

fig.update_layout(
    template='plotly_dark',
    title=f'HNSW Regions (Level 2) vs Ground-Truth Niches<br>'
          f'<sub>PCA: {variance_explained[0]:.1%} + {variance_explained[1]:.1%} variance explained</sub>',
    xaxis_title=f'PC1 ({variance_explained[0]:.1%})',
    yaxis_title=f'PC2 ({variance_explained[1]:.1%})',
    height=550,
    showlegend=True
)
fig.show()

4. Novelty Scoring: O(log N) via HNSW

Novelty in MAP-Elites is typically the mean distance to the k nearest neighbors in the archive. With a flat list, this is O(N) per query. CVX provides O(log N) kNN via HNSW — critical when the archive grows to millions of solutions.

# Benchmark: HNSW kNN for novelty scoring
query = np.random.randn(D).astype(np.float32).tolist()

n_queries = 1000
t0 = time.perf_counter()
for _ in range(n_queries):
    results = index.search(vector=query, k=15)
hnsw_time = (time.perf_counter() - t0) / n_queries

# Novelty = mean distance to k nearest neighbors
novelty = np.mean([score for _, _, score in results])

print(f'HNSW kNN (k=15): {hnsw_time*1000:.3f} ms per query')
print(f'Novelty score (sample query): {novelty:.4f}')
print(f'Archive size: {len(index):,}')
print(f'Complexity: O(log {len(index):,}) = O({np.log2(len(index)):.0f}) vs O({len(index):,}) brute force')
print(f'Speedup estimate: ~{len(index) / np.log2(len(index)):.0f}x')

HNSW kNN (k=15): 0.067 ms per query
Novelty score (sample query): 20.7780
Archive size: 10,000
Complexity: O(log 10,000) = O(13) vs O(10,000) brute force
Speedup estimate: ~753x

# Novelty distribution across the archive (sample 500 solutions)
sample_ids = np.random.choice(solution_counter, size=min(500, solution_counter), replace=False)
novelty_scores = []
fitness_vals = []

for sid in sample_ids:
    traj = index.trajectory(entity_id=int(sid))
    if traj:
        vec = traj[-1][1]  # latest descriptor
        results = index.search(vector=vec, k=15)
        nov = np.mean([s for _, _, s in results])
        novelty_scores.append(nov)
        fitness_vals.append(archive_fitness[int(sid)])

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=fitness_vals, y=novelty_scores,
    mode='markers',
    marker=dict(size=4, color=C_SECONDARY, opacity=0.5),
    name='Solutions'
))
fig.update_layout(
    template='plotly_dark',
    title='Fitness vs Novelty: The Quality-Diversity Trade-off',
    xaxis_title='Fitness (higher = better)',
    yaxis_title='Novelty (mean kNN distance)',
    height=450
)
fig.show()

corr = np.corrcoef(fitness_vals, novelty_scores)[0, 1]
print(f'Fitness-Novelty correlation: r={corr:.3f}')
print(f'  Negative correlation expected: high-fitness solutions cluster near niche centers')

Fitness-Novelty correlation: r=-0.052
  Negative correlation expected: high-fitness solutions cluster near niche centers

5. Exploration Dynamics: The Shape of Search

CVX tracks each solution as a temporal entity. Even though each solution has a single descriptor, lineages (parent-child chains) can be traced through the mutation operator. Here we analyze exploration dynamics using CVX’s temporal analytics.

Velocity: How fast is exploration moving?

# For trajectory analysis, we need entities with multiple observations.
# In MAP-Elites, each solution is unique. To study dynamics, we create
# "lineage trajectories" by inserting parent->child chains as the same entity.
# Here we analyze the entities that happen to have trajectory data.

# Find entities with enough trajectory points
entities_with_traj = []
for eid in range(min(200, solution_counter)):
    traj = index.trajectory(entity_id=eid)
    if len(traj) >= 3:
        entities_with_traj.append((eid, len(traj)))

print(f'Entities with >= 3 trajectory points: {len(entities_with_traj)}')
if entities_with_traj:
    lens = [l for _, l in entities_with_traj]
    print(f'Trajectory lengths: min={min(lens)}, max={max(lens)}, mean={np.mean(lens):.1f}')

# Compute velocity for entities with sufficient trajectory
velocities = []
for eid, tlen in entities_with_traj[:50]:
    traj = index.trajectory(entity_id=eid)
    mid_ts = traj[len(traj)//2][0]
    try:
        vel = cvx.velocity(traj, timestamp=mid_ts)
        velocities.append((eid, np.linalg.norm(vel)))
    except Exception:
        pass

if velocities:
    vel_norms = [v for _, v in velocities]
    print(f'\nExploration velocity (n={len(velocities)}):')
    print(f'  Mean: {np.mean(vel_norms):.4f}')
    print(f'  Std:  {np.std(vel_norms):.4f}')
    print(f'  Max:  {max(vel_norms):.4f}')
else:
    print('\nNo entities with sufficient trajectory for velocity computation.')
    print('Each MAP-Elites solution is a unique entity -- this is expected.')
    print('The key CVX analytics apply to the ARCHIVE-LEVEL dynamics below.')

Entities with >= 3 trajectory points: 0

No entities with sufficient trajectory for velocity computation.
Each MAP-Elites solution is a unique entity -- this is expected.
The key CVX analytics apply to the ARCHIVE-LEVEL dynamics below.

Hurst Exponent: Persistent or Anti-Persistent Exploration?

The Hurst exponent H characterizes trajectory memory:

H > 0.5: Persistent — exploration trends into new areas
H = 0.5: Random walk — no memory
H < 0.5: Anti-persistent — oscillates, revisits old areas

# Hurst exponent requires trajectories with >= 10 points
hurst_results = []
for eid in range(min(500, solution_counter)):
    traj = index.trajectory(entity_id=eid)
    if len(traj) >= 10:
        try:
            h = cvx.hurst_exponent(traj)
            hurst_results.append((eid, h))
        except Exception:
            pass

if hurst_results:
    h_vals = [h for _, h in hurst_results]
    print(f'Hurst exponent (n={len(hurst_results)} entities with >= 10 points):')
    print(f'  Mean: {np.mean(h_vals):.3f}')
    print(f'  Std:  {np.std(h_vals):.3f}')
    print(f'  H > 0.5 = persistent exploration')
    print(f'  H < 0.5 = anti-persistent (oscillating)')
    print(f'  H = 0.5 = random walk')

    fig = go.Figure()
    fig.add_trace(go.Histogram(
        x=h_vals, nbinsx=30,
        marker_color=C_SECONDARY, opacity=0.8,
        name='Hurst exponent'
    ))
    fig.add_vline(x=0.5, line_dash='dash', line_color=C_HIGHLIGHT,
                  annotation_text='Random walk (H=0.5)')
    fig.update_layout(
        template='plotly_dark',
        title='Hurst Exponent Distribution: Memory of Exploration',
        xaxis_title='Hurst exponent',
        yaxis_title='Count',
        height=400
    )
    fig.show()
else:
    print('No entities with >= 10 trajectory points for Hurst analysis.')
    print('In MAP-Elites each solution is a unique entity (1 point each).')
    print('Hurst analysis applies to lineage-tracked or re-evaluated entities.')

No entities with >= 10 trajectory points for Hurst analysis.
In MAP-Elites each solution is a unique entity (1 point each).
Hurst analysis applies to lineage-tracked or re-evaluated entities.

6. Distributional Evolution: How Does the Archive Shift?

Beyond individual solutions, we can track how the distribution of solutions across niches evolves over generations. CVX’s region_trajectory computes EMA-smoothed region distributions at each timestamp.

# Region trajectory for a high-fitness entity
best_id = max(archive_fitness, key=archive_fitness.get)
print(f'Highest-fitness solution: entity {best_id} (fitness={archive_fitness[best_id]:.4f})')

reg_traj = index.region_trajectory(
    entity_id=best_id, level=2, window_days=10, alpha=0.3
)
print(f'Region trajectory length: {len(reg_traj)} snapshots')

if len(reg_traj) >= 3:
    # Wasserstein drift between early and late distributions
    p_early = list(reg_traj[0][1])
    p_late  = list(reg_traj[-1][1])

    wd = cvx.wasserstein_drift(p_early, p_late, centroids, 50)
    fr = cvx.fisher_rao_distance(
        [float(x) for x in p_early],
        [float(x) for x in p_late]
    )

    print(f'\nDistributional drift (early -> late):')
    print(f'  Wasserstein distance: {wd:.4f}')
    print(f'  Fisher-Rao distance:  {fr:.4f}')
    print(f'  Fisher-Rao range: [0, pi={np.pi:.3f}], higher = more different')
else:
    print('Insufficient region trajectory data for this entity.')
    print('Trying a different entity with more activity...')
    # Find an entity with more trajectory data
    for eid in range(min(1000, solution_counter)):
        rt = index.region_trajectory(entity_id=eid, level=2, window_days=10, alpha=0.3)
        if len(rt) >= 3:
            reg_traj = rt
            p_early = list(rt[0][1])
            p_late = list(rt[-1][1])
            wd = cvx.wasserstein_drift(p_early, p_late, centroids, 50)
            fr = cvx.fisher_rao_distance(
                [float(x) for x in p_early],
                [float(x) for x in p_late]
            )
            print(f'  Entity {eid}: {len(rt)} snapshots')
            print(f'  Wasserstein: {wd:.4f}, Fisher-Rao: {fr:.4f}')
            break

Highest-fitness solution: entity 7401 (fitness=-7.1097)
Region trajectory length: 1 snapshots
Insufficient region trajectory data for this entity.
Trying a different entity with more activity...

# Heatmap: region distribution over generations
# Build a generation-by-region matrix from the archive
n_regions = len(regions_l2)
gen_region_matrix = np.zeros((N_GENERATIONS, n_regions))

# For each region, count how many solutions from each generation belong to it
for r_idx, (rid, _, _) in enumerate(regions_l2):
    members = index.region_members(region_id=rid, level=2)
    for eid, ts in members:
        gen = int(ts)
        if 0 <= gen < N_GENERATIONS:
            gen_region_matrix[gen, r_idx] += 1

# Normalize per generation (row-wise)
row_sums = gen_region_matrix.sum(axis=1, keepdims=True)
row_sums[row_sums == 0] = 1
gen_region_norm = gen_region_matrix / row_sums

# Sort regions by total activity for cleaner visualization
region_activity = gen_region_matrix.sum(axis=0)
top_k = min(30, n_regions)  # show top 30 regions
top_idx = np.argsort(region_activity)[-top_k:]

fig = go.Figure(data=go.Heatmap(
    z=gen_region_norm[:, top_idx].T,
    x=list(range(N_GENERATIONS)),
    y=[f'R{region_ids[i]}' for i in top_idx],
    colorscale='Inferno',
    colorbar=dict(title='Proportion')
))
fig.update_layout(
    template='plotly_dark',
    title=f'Archive Distribution: Top {top_k} Regions Over Generations',
    xaxis_title='Generation',
    yaxis_title='Region',
    height=600
)
fig.show()

print(f'Regions visualized: {top_k} of {n_regions} (sorted by total activity)')

Regions visualized: 30 of 42 (sorted by total activity)

7. Archive Topology: Connected or Fragmented?

Persistent homology tracks how the number of connected components (Betti-0) changes as we grow a ball around each region centroid. This reveals whether the archive is a single connected cluster, several islands, or a continuum.

topo = cvx.topological_features(centroids, n_radii=20, persistence_threshold=0.05)

print(f'Archive Topology (from {len(centroids)} region centroids):')
print(f'  Significant components: {topo["n_components"]}')
print(f'  Max persistence:        {topo["max_persistence"]:.3f}')
print(f'  Mean persistence:       {topo["mean_persistence"]:.3f}')
print(f'  Persistence entropy:    {topo["persistence_entropy"]:.3f}')
print(f'  Total persistence:      {topo["total_persistence"]:.3f}')
print(f'\nInterpretation:')
print(f'  n_components ~ {N_NICHES} ground-truth niches? '
      f'Got {topo["n_components"]} (overlap and mutation blur boundaries)')
print(f'  High persistence entropy = diverse cluster lifetimes (heterogeneous niches)')

Archive Topology (from 42 region centroids):
  Significant components: 42
  Max persistence:        11.708
  Mean persistence:       9.014
  Persistence entropy:    3.682
  Total persistence:      369.565

Interpretation:
  n_components ~ 5 ground-truth niches? Got 42 (overlap and mutation blur boundaries)
  High persistence entropy = diverse cluster lifetimes (heterogeneous niches)

# Betti curve: connected components as a function of filtration radius
radii = topo['radii']
betti = topo['betti_curve']

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=radii, y=betti,
    mode='lines+markers',
    marker=dict(size=6, color=C_ACCENT),
    line=dict(color=C_ACCENT, width=2.5),
    name='Betti-0 (components)',
    fill='tozeroy',
    fillcolor='rgba(46, 204, 113, 0.15)'
))

# Mark where components = N_NICHES (ground truth)
for i, b in enumerate(betti):
    if b <= N_NICHES and i > 0 and betti[i-1] > N_NICHES:
        fig.add_vline(x=radii[i], line_dash='dash', line_color=C_HIGHLIGHT,
                      annotation_text=f'{N_NICHES} niches',
                      annotation_position='top right')
        break

fig.add_hline(y=N_NICHES, line_dash='dot', line_color=C_MUTED,
              annotation_text=f'Ground truth: {N_NICHES} niches')
fig.add_hline(y=1, line_dash='dot', line_color=C_WARN,
              annotation_text='Fully connected')

fig.update_layout(
    template='plotly_dark',
    title='Betti Curve: Archive Connectivity vs Filtration Radius',
    xaxis_title='Radius',
    yaxis_title='Connected Components (Betti-0)',
    height=450
)
fig.show()

8. Path Signatures: The Shape of Evolutionary Trajectories

Path signatures from rough path theory provide a universal, order-aware feature of sequential data. Applied to region-distribution trajectories, the signature captures the shape of how exploration unfolds — not just the start and end, but the path taken.

With D=20 descriptors at depth 2 with time augmentation, the signature has K + K^2 dimensions where K = n_regions + 1.

# Compute path signatures on region-distribution trajectories
# We need entities with enough region trajectory snapshots

sig_data = []  # (entity_id, fitness, signature)
n_checked = 0
n_target = 50  # collect up to 50 signatures

for eid in range(solution_counter):
    if len(sig_data) >= n_target:
        break
    rt = index.region_trajectory(entity_id=eid, level=2, window_days=10, alpha=0.3)
    if len(rt) >= 5:
        # Prepare trajectory for signature computation
        sig_traj = [(int(t), [float(x) for x in d]) for t, d in rt]
        try:
            sig = cvx.path_signature(sig_traj, depth=2, time_augmentation=True)
            sig_data.append((eid, archive_fitness.get(eid, 0.0), sig))
        except Exception as e:
            pass
    n_checked += 1

print(f'Checked {n_checked} entities, computed {len(sig_data)} path signatures')
if sig_data:
    sig_dim = len(sig_data[0][2])
    print(f'Signature dimension: {sig_dim}')
    print(f'  (K + K^2 where K = n_region_dims + 1 for time augmentation)')

Checked 10000 entities, computed 0 path signatures

# Compare signatures: do high-fitness entities explore similarly?
if len(sig_data) >= 10:
    # Sort by fitness
    sig_data_sorted = sorted(sig_data, key=lambda x: x[1], reverse=True)

    # Top 5 vs Bottom 5 signature distances
    top_sigs = [s for _, _, s in sig_data_sorted[:5]]
    bot_sigs = [s for _, _, s in sig_data_sorted[-5:]]

    # Within-group distances
    top_dists = []
    for i in range(len(top_sigs)):
        for j in range(i+1, len(top_sigs)):
            top_dists.append(cvx.signature_distance(top_sigs[i], top_sigs[j]))

    bot_dists = []
    for i in range(len(bot_sigs)):
        for j in range(i+1, len(bot_sigs)):
            bot_dists.append(cvx.signature_distance(bot_sigs[i], bot_sigs[j]))

    # Between-group distances
    cross_dists = []
    for ts in top_sigs:
        for bs in bot_sigs:
            cross_dists.append(cvx.signature_distance(ts, bs))

    print('Signature Distance Analysis (exploration shape similarity):')
    print(f'  Top-5 within-group:    mean={np.mean(top_dists):.4f}, std={np.std(top_dists):.4f}')
    print(f'  Bottom-5 within-group: mean={np.mean(bot_dists):.4f}, std={np.std(bot_dists):.4f}')
    print(f'  Cross-group:           mean={np.mean(cross_dists):.4f}, std={np.std(cross_dists):.4f}')
    print(f'\nIf cross > within, high-fitness entities share exploration patterns')

    # PCA of signatures for visualization
    all_sigs = np.array([s for _, _, s in sig_data])
    all_fits = np.array([f for _, f, _ in sig_data])

    sig_centered = all_sigs - all_sigs.mean(axis=0)
    U_s, S_s, Vt_s = np.linalg.svd(sig_centered, full_matrices=False)
    sig_pca = sig_centered @ Vt_s[:2].T
    sig_var = (S_s[:2]**2) / (S_s**2).sum()

    fig = go.Figure()
    fig.add_trace(go.Scatter(
        x=sig_pca[:, 0], y=sig_pca[:, 1],
        mode='markers',
        marker=dict(
            size=8,
            color=all_fits,
            colorscale='RdYlGn',
            colorbar=dict(title='Fitness'),
            line=dict(width=0.5, color='white')
        ),
        text=[f'Entity {eid}<br>Fitness: {f:.3f}' for eid, f, _ in sig_data],
        hovertemplate='%{text}<extra></extra>',
        name='Signatures'
    ))
    fig.update_layout(
        template='plotly_dark',
        title=f'Path Signatures of Exploration Trajectories (PCA)<br>'
              f'<sub>{sig_var[0]:.1%} + {sig_var[1]:.1%} variance | color = fitness</sub>',
        xaxis_title=f'PC1 ({sig_var[0]:.1%})',
        yaxis_title=f'PC2 ({sig_var[1]:.1%})',
        height=500
    )
    fig.show()
else:
    print(f'Only {len(sig_data)} signatures computed -- need >= 10 for comparison.')
    print('This is expected if entities have sparse trajectories.')

Only 0 signatures computed -- need >= 10 for comparison.
This is expected if entities have sparse trajectories.

9. Putting It Together: Archive Health Dashboard

A combined view of the evolutionary process through CVX’s temporal lens.

# Multi-panel dashboard
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=(
        'Fitness Progress',
        'Archive Topology (Betti Curve)',
        'Region Population Over Time',
        'Novelty vs Fitness'
    ),
    vertical_spacing=0.12,
    horizontal_spacing=0.1
)

# Panel 1: Fitness progress
fig.add_trace(go.Scatter(
    x=list(range(N_GENERATIONS)), y=generation_best,
    mode='lines', name='Best', line=dict(color=C_HIGHLIGHT, width=2)
), row=1, col=1)
fig.add_trace(go.Scatter(
    x=list(range(N_GENERATIONS)), y=generation_mean,
    mode='lines', name='Mean', line=dict(color=C_SECONDARY, width=1.5, dash='dot')
), row=1, col=1)

# Panel 2: Betti curve
fig.add_trace(go.Scatter(
    x=radii, y=betti,
    mode='lines+markers', name='Betti-0',
    marker=dict(size=5, color=C_ACCENT),
    line=dict(color=C_ACCENT, width=2)
), row=1, col=2)

# Panel 3: Top 5 regions population over generations
top5_idx = np.argsort(region_activity)[-5:]
colors_5 = [C_HIGHLIGHT, C_SECONDARY, C_ACCENT, C_WARN, '#9b59b6']
for i, ridx in enumerate(top5_idx):
    fig.add_trace(go.Scatter(
        x=list(range(N_GENERATIONS)),
        y=gen_region_matrix[:, ridx],
        mode='lines', name=f'R{region_ids[ridx]}',
        line=dict(color=colors_5[i], width=1.5)
    ), row=2, col=1)

# Panel 4: Novelty vs Fitness scatter
if fitness_vals and novelty_scores:
    fig.add_trace(go.Scatter(
        x=fitness_vals, y=novelty_scores,
        mode='markers', name='Solutions',
        marker=dict(size=3, color=C_SECONDARY, opacity=0.4)
    ), row=2, col=2)

fig.update_layout(
    template='plotly_dark',
    title='MAP-Elites Archive Health Dashboard',
    height=700,
    showlegend=False
)

fig.update_xaxes(title_text='Generation', row=1, col=1)
fig.update_yaxes(title_text='Fitness', row=1, col=1)
fig.update_xaxes(title_text='Radius', row=1, col=2)
fig.update_yaxes(title_text='Components', row=1, col=2)
fig.update_xaxes(title_text='Generation', row=2, col=1)
fig.update_yaxes(title_text='Solutions in Region', row=2, col=1)
fig.update_xaxes(title_text='Fitness', row=2, col=2)
fig.update_yaxes(title_text='Novelty', row=2, col=2)

fig.show()

10. Summary

CVX Functions Used and What They Reveal

CVX Function	Purpose in MAP-Elites	What It Reveals
`TemporalIndex.bulk_insert()`	Build the archive	O(N log N) construction vs O(N) flat list
`TemporalIndex.search(k=15)`	Novelty scoring	O(log N) kNN — the core MAP-Elites speedup
`TemporalIndex.trajectory()`	Track solution lineages	Evolution path through descriptor space
`TemporalIndex.regions(level)`	Discover niches	Emergent behavioral niches (no pre-specified grid)
`TemporalIndex.region_members()`	Niche membership	Which solutions belong to which niche, when
`TemporalIndex.region_trajectory()`	Distribution dynamics	How niche populations shift over generations
`cvx.velocity()`	Exploration speed	How fast solutions move through descriptor space
`cvx.hurst_exponent()`	Exploration memory	Persistent (trending) vs anti-persistent (oscillating)
`cvx.wasserstein_drift()`	Distributional shift	Geometry-aware distance between niche distributions
`cvx.fisher_rao_distance()`	Statistical divergence	Information-geometric distance on distribution manifold
`cvx.topological_features()`	Archive structure	Connected components, persistence, fragmentation
`cvx.path_signature()`	Exploration shape	Universal order-aware features of trajectories
`cvx.signature_distance()`	Shape comparison	Do high-fitness entities explore similarly?

Key Takeaways

HNSW regions are natural niches. No need to pre-discretize descriptor space. The graph hierarchy provides multi-resolution niche structure for free.
O(log N) novelty scoring. The kNN search that dominates MAP-Elites runtime becomes logarithmic with HNSW, enabling archives of millions of solutions.
The dimension of change. Tracking how solutions explore (velocity, Hurst, signatures) is as informative as where they end up. CVX makes this temporal analysis native.
Distributional evolution. Wasserstein and Fisher-Rao distances on region distributions reveal global archive dynamics invisible to per-solution metrics.
Topological health. The Betti curve reveals whether the archive is fragmenting into disconnected islands or maintaining healthy coverage of descriptor space.