Quality-Diversity (MAP-Elites)
MAP-Elites is a quality-diversity (QD) algorithm that maintains an archive of diverse, high-performing solutions indexed by behavioral descriptor vectors. The central bottleneck in large-scale QD is archive operations: finding which niche a new solution belongs to and computing novelty scores. Both reduce to nearest-neighbor search over the archive.
ChronosVector replaces the linear scan with HNSW search, yielding ~1000x speedup at million-solution scale. Its graph hierarchy provides natural niche discovery without the need to pre-specify cell counts, and its temporal analytics track how exploration evolves across generations.
Problem
Section titled “Problem”Three limitations constrain MAP-Elites at scale:
-
Archive operations dominate computation. Every offspring requires a nearest-neighbor lookup against the full archive to determine its niche and compute novelty. With a linear scan, this is per evaluation — for a 1M-entry archive, this overwhelms the actual fitness evaluation.
-
CVT-MAP-Elites requires pre-specified cell counts. Centroidal Voronoi Tessellation discretizes the descriptor space into a fixed number of cells before the run begins. Choosing too few cells loses diversity; too many leaves most cells empty. The user must commit to a tessellation before seeing the data.
-
High-dimensional descriptors break Voronoi tessellation. Beyond ~20 dimensions, Voronoi cells become pathological — nearly all volume concentrates on cell boundaries, and the tessellation provides little meaningful structure.
HNSW resolves all three: search eliminates the archive bottleneck, the graph hierarchy defines niches adaptively from the data, and navigable small-world properties remain effective in high dimensions.
Data Model
Section titled “Data Model”| MAP-Elites Concept | CVX Equivalent | Notes |
|---|---|---|
| Archive | TemporalIndex | HNSW index stores all solutions |
| Behavioral descriptor | Vector | The embedding for each solution |
| Generation | Timestamp | Microsecond-precision i64; use generation number directly |
| Find niche | index.search(vector=desc, k=1) | Nearest archive member = niche assignment |
| Novelty score | index.search(vector=desc, k=15) then mean distance | Standard novelty metric (Lehman & Stanley, 2011) |
| Archive regions | index.regions(level) | HNSW hierarchy = adaptive niches |
| Niche occupant | index.region_members(region_id, level) | All solutions in a niche |
| Solution lineage | index.trajectory(entity_id) | How a lineage evolves across generations |
Archive Setup
Section titled “Archive Setup”Create a temporal index configured for the descriptor dimensionality and expected archive size:
import numpy as npimport chronos_vector as cvx
# HNSW parameters:# m=16 — 16 bidirectional links per node (good default)# ef_construction=100 — build-time beam width (higher = better recall, slower build)# ef_search=64 — query-time beam width (tunable per query)index = cvx.TemporalIndex(m=16, ef_construction=100, ef_search=64)Core Operations
Section titled “Core Operations”Batch Insert Per Generation
Section titled “Batch Insert Per Generation”After evaluating a batch of offspring, insert them all at once. The generation number serves as the timestamp:
def insert_generation(index, offspring_ids, descriptors, generation): """Insert a generation of solutions into the archive.
Args: offspring_ids: np.ndarray[uint64] of shape (N,) descriptors: np.ndarray[float32] of shape (N, D) generation: int, current generation number """ timestamps = np.full(len(offspring_ids), generation, dtype=np.int64) n_inserted = index.bulk_insert(offspring_ids, timestamps, descriptors) return n_insertedNiche Assignment
Section titled “Niche Assignment”Find the closest existing archive member — the niche this solution belongs to:
def find_niche(index, descriptor): """Return (niche_entity_id, niche_timestamp, distance) for a new solution.""" results = index.search(vector=descriptor.tolist(), k=1) if results: entity_id, timestamp, distance = results[0] return entity_id, timestamp, distance return NoneNovelty Scoring
Section titled “Novelty Scoring”Novelty is the mean distance to the nearest archive members. With HNSW this is instead of :
def novelty_score(index, descriptor, k=15): """Compute novelty as mean distance to k nearest archive members.""" results = index.search(vector=descriptor.tolist(), k=k) if not results: return float('inf') # empty archive = maximally novel distances = [distance for _, _, distance in results] return np.mean(distances)Natural Niche Discovery
Section titled “Natural Niche Discovery”HNSW’s hierarchical graph defines niches at multiple granularities without pre-specifying cell counts:
# Coarse niches (level 3): ~N/4096 regionscoarse_niches = index.regions(level=3)for region_id, centroid, member_count in coarse_niches: print(f"Niche {region_id}: {member_count} solutions, centroid dim={len(centroid)}")
# Fine niches (level 2): ~N/256 regionsfine_niches = index.regions(level=2)
# Inspect a specific nichemembers = index.region_members(region_id=coarse_niches[0][0], level=3)for entity_id, timestamp, vector in members: print(f" Solution {entity_id} from generation {timestamp}")Complete Generation Loop
Section titled “Complete Generation Loop”The following replaces the traditional MAP-Elites archive with CVX. The key change: niche assignment and novelty scoring become operations.
import numpy as npimport chronos_vector as cvx
# --- Configuration ---DIM = 50 # behavioral descriptor dimensionalityPOP_SIZE = 256 # offspring per generationMAX_GEN = 5000NOVELTY_K = 15NOVELTY_THRESHOLD = 0.1 # minimum novelty to keep
# --- Initialize ---index = cvx.TemporalIndex(m=16, ef_construction=100, ef_search=64)next_id = 0fitness_log = {} # entity_id -> best fitness
def generate_offspring(parent_ids, parent_descriptors): """Domain-specific: mutate parents to produce offspring.""" # ... mutation/crossover logic ... return offspring_ids, offspring_descriptors, offspring_fitness
# Seed populationseed_descriptors = np.random.randn(POP_SIZE, DIM).astype(np.float32)seed_ids = np.arange(POP_SIZE, dtype=np.uint64)seed_fitness = evaluate_batch(seed_descriptors) # domain-specificindex.bulk_insert(seed_ids, np.zeros(POP_SIZE, dtype=np.int64), seed_descriptors)for sid, fit in zip(seed_ids, seed_fitness): fitness_log[int(sid)] = fitnext_id = POP_SIZE
# --- Main loop ---for generation in range(1, MAX_GEN + 1): # 1. Generate offspring parent_ids = np.random.choice(list(fitness_log.keys()), size=POP_SIZE) offspring_desc = mutate(parent_ids) # -> np.ndarray[float32] (POP_SIZE, DIM) offspring_fit = evaluate_batch(offspring_desc)
# 2. Assign IDs offspring_ids = np.arange(next_id, next_id + POP_SIZE, dtype=np.uint64) next_id += POP_SIZE
# 3. Batch insert with generation as timestamp timestamps = np.full(POP_SIZE, generation, dtype=np.int64) index.bulk_insert(offspring_ids, timestamps, offspring_desc)
# 4. Novelty scoring in O(log N) for i in range(POP_SIZE): desc = offspring_desc[i].tolist() neighbors = index.search(vector=desc, k=NOVELTY_K) novelty = np.mean([dist for _, _, dist in neighbors])
if novelty > NOVELTY_THRESHOLD: fitness_log[int(offspring_ids[i])] = offspring_fit[i]
# 5. Periodic analytics (every 100 generations) if generation % 100 == 0: regions = index.regions(level=2) print(f"Gen {generation}: archive={len(index)}, niches={len(regions)}")Archive Analytics
Section titled “Archive Analytics”CVX’s temporal analytics provide insight into the evolutionary process itself, not just the solutions.
Solution Lineage Analysis
Section titled “Solution Lineage Analysis”Track how a lineage evolves through descriptor space across generations using path signatures:
# Retrieve a solution's full trajectory across generationstraj = index.trajectory(entity_id=42)# traj = [(gen0, vec0), (gen1, vec1), ...]
# Path signature captures the shape of the evolutionary pathsig = cvx.path_signature(traj, depth=2)
# Compare two lineagessig_a = cvx.path_signature(index.trajectory(entity_id=42), depth=2)sig_b = cvx.path_signature(index.trajectory(entity_id=99), depth=2)similarity = cvx.signature_distance(sig_a, sig_b)# Low distance = similar evolutionary dynamicsExploration Dynamics
Section titled “Exploration Dynamics”Track how the search distributes across niches over time:
# Region trajectory: how an entity's niche membership shifts over generationsregion_traj = index.region_trajectory(entity_id=42, level=3, window_days=100, alpha=0.3)# Returns distribution over regions at each time window
# Wasserstein drift between early and late explorationearly_dist = region_traj[0] # niche distribution at startlate_dist = region_traj[-1] # niche distribution now
centroids = [centroid for _, centroid, _ in index.regions(level=3)]w_drift = cvx.wasserstein_drift(early_dist, late_dist, centroids)# High drift = exploration shifted to new regions of descriptor spaceArchive Topology
Section titled “Archive Topology”Monitor whether the archive is well-connected or fragmenting into isolated clusters:
# Extract region centroidscentroids = [centroid for _, centroid, _ in index.regions(level=2)]
# Topological features via persistent homologytopo = cvx.topological_features(centroids)print(f"Connected components: {topo['n_components']}")print(f"Max persistence: {topo['max_persistence']:.3f}")print(f"Persistence entropy: {topo['persistence_entropy']:.3f}")
# Interpretation:# n_components = 1 → archive is well-connected# n_components >> 1 → fragmented into isolated niches# High persistence → strong cluster separation# High entropy → uniform cluster structureConvergence and Stagnation Detection
Section titled “Convergence and Stagnation Detection”Detect when the evolutionary search is losing steam:
# Hurst exponent on a solution lineagetraj = index.trajectory(entity_id=42)h = cvx.hurst_exponent(traj)# H > 0.5 → persistent (trending exploration in one direction)# H ≈ 0.5 → random walk (search becoming undirected — convergence)# H < 0.5 → anti-persistent (oscillating, possibly stuck)
# Velocity: how fast is exploration moving?vel = cvx.velocity(traj, timestamp=generation)speed = np.linalg.norm(vel)# Declining speed across generations → stagnation# Sudden spike → discovered a new promising regionPerformance
Section titled “Performance”The asymptotic improvement from HNSW is dramatic at scale:
| Archive Size | Linear Scan (per query) | HNSW (per query) | Speedup |
|---|---|---|---|
| 1,000 | 1,000 ops | ~10 ops | 100x |
| 100,000 | 100,000 ops | ~17 ops | ~6,000x |
| 1,000,000 | 1,000,000 ops | ~20 ops | ~50,000x |
50D descriptors, k=15, ef_search=64. HNSW query time scales as .
For a typical QD run with 1M archive entries and 256 offspring per generation, the per-generation archive operations drop from ~256M distance computations (linear) to ~5,120 (HNSW), making novelty scoring negligible compared to fitness evaluation.
Why Not CVT-MAP-Elites?
Section titled “Why Not CVT-MAP-Elites?”CVT-MAP-Elites (Vassiliades et al., 2018) pre-computes a Centroidal Voronoi Tessellation to discretize the descriptor space. This works well in low dimensions but has structural limitations:
| Aspect | CVT-MAP-Elites | CVX (HNSW) |
|---|---|---|
| Cell count | Fixed before run | Adaptive (grows with archive) |
| High dimensions | Voronoi degenerates >20D | HNSW effective to 1000D+ |
| Niche granularity | Single scale | Multi-scale via graph levels |
| Temporal tracking | Not built in | Native (generation = timestamp) |
| Analytics | External tooling | Integrated (signatures, topology, drift) |
CVX’s HNSW hierarchy provides the niche structure that CVT pre-computes, but discovers it adaptively from the actual solutions rather than from a prior tessellation of the space.
References
Section titled “References”- Mouret, J.-B. & Clune, J. (2015). Illuminating search spaces by mapping elites. arXiv:1504.04909.
- Vassiliades, V., Chatzilygeroudis, K., & Mouret, J.-B. (2018). Using centroidal Voronoi tessellations to scale up the multidimensional archive of phenotypic elites algorithm. IEEE Transactions on Evolutionary Computation, 22(4), 623-630.
- Fontaine, M.C., Togelius, J., Nikolaidis, S., & Hoover, A.K. (2020). Covariance matrix adaptation for the rapid illumination of behavior space. GECCO.
- Lehman, J. & Stanley, K.O. (2011). Abandoning objectives: Evolution through the search for novelty alone. Evolutionary Computation, 19(2), 189-223.
- Malkov, Y.A. & Yashunin, D.A. (2018). Efficient and robust approximate nearest neighbor using hierarchical navigable small world graphs. IEEE TPAMI, 42(4), 824-836.