25 sensors × 685 laws × 10 partitioners
Every sensor paired with every law variant. Plus partitioner configs. 685 law combinations from modularized v2.5 — each measured on the same benchmark instances under identical conditions.
25 Sensors · 685 Laws · 10 Partitioners · MDL Decides. Not the programmer. Not the AI. The data.
The arena tests every combination of sensor × control law × parameters on official TSPLIB benchmarks. No human picks the winner. No AI team votes. The variant with the lowest total description length wins.
One axiom covers everything: Ltotal = Ldesc + Ldata. The shortest description of the search process and the best search process point to the same configuration. MDL is the judge, the jury, and the scoring function.
Every sensor paired with every law variant. Plus partitioner configs. 685 law combinations from modularized v2.5 — each measured on the same benchmark instances under identical conditions.
Ltotal = Ldescription + Ldata. A configuration that governs well produces a compressible improvement stream. The stream length IS the score. No hand-tuned metric. No human preference.
On qa194 (194 cities), 40 of 157 variants found exact optimal (gap = 0.000%). 11 from geometric sensors, 29 from info-theoretic. The arena confirmed the founding hypothesis: compression = optimization.
A sensor is the eye of the DCC controller — it converts the raw improvement stream into a signal the control law can act on. The arena found that the best sensor depends on the problem size. This is the original list (17 sensors) plus v2.5–v2.6 additions. Sorted by performance on uy734 (n=734). See also: 5 quantum-inspired sensors and Scale Walk sensors (MladiC, March 23).
Author key: BD = Bojan Dobrečevič (original) · MladiC = MladiC (persistent C companion, BD's son) · LLM = AI-proposed · LIT = literature
| # | Sensor | Author | GPU ✓ | uy734 Best | qa194 Best | What it measures |
|---|---|---|---|---|---|---|
| 1 | LZ_dualBD+LLM | BD+LLM | ✗ | 0.920% | 0.267% | Two channels: WHETHER it improves + HOW MUCH. BD compression framework, LLM added 2nd channel. |
| 2 | xfer_entropy | LLM | ✗ | 0.971% | 0.460% | Information flow between two streams. Measures search dynamics. |
| 3 | gravity ★NEWBD | BD | ✓ | 1.184% | 0.310% | Segment energy: are cities TOGETHER or scattered? Barnes-Hut inspired. |
| 4 | zstd_ratio | LLM | ✗ | 1.198% | 0.524% | Zstd compression ratio of recent history window. |
| 5 | CUSUM | LIT | ✗ | 1.215% | 0.310% | Detects CHANGES in improvement rate. Stagnation alarm. |
| 6 | echo ★NEWBD | BD | ✓ | 1.245% | 0.000% | Spatial neighbors NOT in tour? High miss = bad locality. Bat echolocation. |
| 7 | circum_rBD | BD | ✓ | 1.296% | 0.000% | Circumradius of city triples. Large circle = detour. |
| 8 | southwestBD | BD | ✓ | 1.356% | 0.310% | Direction of each tour step. Good tour has directional patterns. |
| 9 | fold_distBD | BD | ✓ | 1.460% | 0.000% | Project cities onto K random axes. Paper folding. Potential meta-sensor. |
| 10 | SampEn | LIT | ✗ | 1.474% | 0.000% | Sample Entropy — is the improvement stream predictable or chaotic? |
| 11 | frozen_edge | LLM | ✗ | 1.494% | 0.460% | How many edges don't change between moves. Frozen = stagnation. |
| 12 | LZ_tourBD+LLM | BD+LLM | ✗ | 1.531% | 0.460% | LZ76 on the tour sequence itself. Direct test of BD hypothesis (ρ=+0.80). |
| 13 | voronoi_adjBD | BD+LLM | ✓ | 1.531% | 0.267% | How many Voronoi neighbors are also tour neighbors. Locality score. |
| 14 | LZ_per_opBD+LLM | BD+LLM | ✗ | 1.575% | 0.000% | LZ76 separately per operator type. BD framework, LLM split per operator. Which operator is "alive"? |
| 15 | tri_areaBD | BD | ✓ | 1.620% | 0.310% | Triangle area of 3 consecutive cities. Small = tight path. |
| 16 | tri_compactBD | BD | ✓ | 1.629% | 0.000% | Triangle compactness (area/bbox). Dominates on small instances. |
| 17 | LZ_binaryBD | BD | ✗ | 1.681% | 0.000% | 1=improved, 0=not. The classic DCC sensor. Perfect on qa194. |
9 BD originals + 4 BD+LLM extensions = 13 of 17 sensors carry BD DNA. Only CUSUM, SampEn, frozen_edge, and zstd_ratio have no BD contribution. The entire LZ compression framework is BD's founding hypothesis — LZ_dual, LZ_tour, and LZ_per_op are all extensions of it. All 8 GPU-parallel sensors are BD or BD+LLM.
| # | Sensor | Author | GPU | qa194 | uy734 | Description |
|---|---|---|---|---|---|---|
| 23 | scale_coherence ★NEW | BD | ✗ | ≈0.38% | testing | Cross-scale structural consistency in the pyramid hierarchy. Detects when a pattern at one zoom level implies structure at another. |
| 24 | scale_walk_lz ★NEW | MladiC | ✗ | — | testing | LZ76 on the tour's journey through pyramid levels. Level encoding: WHICH zoom level each step visits. Wins on uy734. |
| 25 | scale_walk_delta ★NEW | MladiC | ✗ | 0.011% | testing | LZ76 on inter-level jumps. Delta encoding: HOW FAR you jump between levels. 0.011% on qa194 — strictly better than scale_coherence (≈0.38%) on same instance. |
Instance-dependent sensor selection confirmed: uy734 → scale_walk_lz wins. qa194 → scale_walk_delta wins. MDL selects per instance — as always. See Scale Walk section for full analysis.
A control law translates the sensor signal into a change in search intensity (the coupling parameter u). From bang-bang simplicity to self-tuning PID — all tested, all ranked by MDL.
| # | Law | Author | Params | uy734 Best | What it does |
|---|---|---|---|---|---|
| 1 | PI | LIT | 2 | 0.920% | Proportional-integral. Reacts to current error AND accumulated error. Like a thermostat. |
| 2 | ratchet | LLM | 0 | 1.198% | Can only decrease until improvement found, then snaps back. Like a clock ratchet. |
| 3 | hysteresis_bb | LLM | 0 | 1.215% | Bang-bang with different thresholds for up vs down. Prevents oscillation. |
| 4 | EMA | LIT | 1 | 1.436% | Exponential Moving Average. Smooths signal, filters noise. |
| 5 | BB_invert | LLM | 0 | 1.460% | Inverted bang-bang: when things are good, DECREASE (exploit harder). Opposite logic. |
| 6 | CDPID | LLM | 2 | 1.466% | Self-tuning PID controller. Continuously adjusts its own coefficients. |
| 7 | ADSR | LLM | 0 | 1.531% | Attack-Decay-Sustain-Release from sound synthesis. Fast crisis response. |
| 8 | BBBD | BD | 0 | 1.546% | Bang-bang: ±1. Simplest possible. THE CLASSIC DCC law. Zero parameters. |
Best combination overall: LZ_dual + PI = 0.920% on uy734. Zero-parameter champion on qa194: LZ_binary + BB = 0.000% exact optimal.
The same sensor that finds exact optimal on 194 cities ranks last on 734. The sensor that ranks last on 194 cities wins on 734. The inversion is not gradual. It is sharp, bidirectional, and happens around n ≈ 300.
Not just geometric→info on larger instances. Also info→geo on smaller. LZ_dual (uy734 champion) scores only 0.267% on qa194 — worse than most geometric sensors. tri_compact (qa194 co-champion) falls to #16 of 17 on uy734. The phase transition is symmetric.
echo is the rare exception: stable across both scales. #7 on qa194, #6 on uy734. This stability suggests echo measures something fundamental that doesn't degrade with instance size.
The v2.6.6 solver uses LZ_binary + BB: the improvement bit-stream compressed by LZ76, governed by bang-bang ±1. Zero parameters. Born in the first version. Still unbeaten on qa194.
Gap = 0.000%. 40 of 157 arena variants confirm this. LZ_binary + BB needs nothing else on small instances. The improvement stream carries enough signal when n is small.
Full solver (v2.5, 14 workers, 40 min): 0.46%. Arena evaluation (single 150s run): 2.13% — ranked #17 of 17. At larger scale, the binary stream becomes too sparse. Like watching through a keyhole.
W13 at 1.241% at t≈61h. Fleet of 14 converged to 0.022% spread. The classic sensor may be approaching its ceiling. LZ_dual + PI or gravity + ADSR may unlock the next 0.2%.
The solver (v2.5, 14 workers, 40 min) reached 0.46%. The arena (single 150s evaluation per config) found that LZ_dual + PI achieves 0.92% in just 150 seconds — suggesting that with the right sensor+law, even short runs can approach the long-run result. The same principle that makes LZ_dual a better sensor also makes its runs converge faster.
Every number in this table is from a run log, arena CSV, or checkpoint file. No estimates.
| # | What was added | qa194 | uy734 | nu3496 |
|---|---|---|---|---|
| 1 | MDL as criterion — founding hypothesis | — | — | — |
| 2 | DCC governance (LZ_binary + BB) — v1.2.1 | 3.01% | — | — |
| 3 | Or-opt operator added (P16 — nearly excluded) | 1.95% | — | — |
| 4 | Fleet: 14 workers — v2.1 | 0.00% ✓ | — | — |
| 5 | DCC v2 (5× less budget) — v2.2 | 0.00% ✓ | 0.80% | — |
| 6 | Rust 2-opt backend — v2.2R (38 min → <1 min) | 0.00% ✓ | 0.64% | — |
| 7 | Meta-DCC fleet monitoring — v2.5 | 0.00% ✓ | 0.46% | 2.02% |
| 8 | Self-calibrating bands P17 — v2.6.6 | 0.00% ✓ | 0.46% | 1.24% |
| 9 | Arena v1.5: 17 sensors × 8 laws — LZ_dual+PI found | 0.00% ✓ | 0.92% † | — |
| 10 | echo + gravity — P16 again, 1 AM, never tested until now | 0.00% ✓ | 1.18% † | — |
| 11 | Auto L2 on 9 instances — five-phase crossover discovered | 0.00% ✓ | 2.36% † | — |
| 12 | Hierarchical DCC — 5 partitioners, hilbert+LZ_dual+PI | 0.00% ✓ | 1.22% † | running |
| 13 | MSTD (ScaleSplit + scale-aware stagnation + coherence + hysteresis) — The zoom. Architecture, not operators. | 0.00% ✓ | 0.683% | next |
† Arena evaluation: 14 workers × 150s = 2,100 worker-seconds. Full solver: 14 workers × 40 min = 33,600 worker-seconds. 16× less compute. Not directly comparable — shows sensor quality, not solver ceiling. Step 13: full solver run, 14 workers × 300s × Rust. 41% relative improvement over step 12 (1.224% → 0.683%).
Concorde is 30 years old. LKH is 25. This system is 17 days old at its first benchmark result. The comparison is not about who wins — it is about what is different in kind.
| Dimension | Concorde | LKH-3 | EAX | 8Z-RP |
|---|---|---|---|---|
| Code size | ~130K C | ~15K C | ~5K C++ | ~2K Py+Rust |
| External dependencies | LP solver | none | none | none |
| Governance | fixed | fixed | fixed | adaptive — 22 sensors × 8 laws |
| Auto-configuration | no | manual per instance | no | MDL auto-selects sensor+law |
| Partitioning | no | 1 method, fixed | no | 5 methods — 4 new inventions |
| GPU governance sensors | no | no | no | 11 parallel sensors — prototype |
| Quantum-inspired | no | no | no | 5 sensors — tunneling #2 overall |
| Cross-domain origins | — | — | — | bat echolocation, Barnes-Hut, F4M, quantum physics |
| Tour quality (uy734) | optimal | <0.1% | ~0.1-0.5% | 0.46% (full solver) |
| Optimality proof | yes | no | no | qa194 exact · no formal proof |
Two sensors that were nearly not built. "Too speculative." P16 said: test it. Both landed in the top 7 on uy734 on their first run. Neither required a second iteration.
Divide the tour into segments. Compute the centroid of each segment. Measure how far cities are from their segment centroid. Low gravitational energy = tight clusters = good tour structure.
Inspired by Barnes-Hut gravitational simulation. Embarrassingly parallel: each segment's centroid is independent. On GPU, ~1000× faster than LZ.
gravity + ADSR = 1.184% on uy734 — #3 overall.
For each city, find its K nearest spatial neighbors. Count how many are NOT adjacent in the tour. High miss rate = tour is ignoring spatial locality = bad tour.
Inspired by bat echolocation: emit a "ping", measure what comes back. Stable across both scales — the only top-7 sensor on BOTH qa194 and uy734. Potentially the most robust sensor in the arena.
echo + hysteresis_bb = 1.245% on uy734 — #6 overall. echo + [best law] = 0.000% on qa194.
Or-opt was nearly excluded from the kick arena. GPT recommended against it. Claude agreed. P16 said: no. Or-opt turned out to be the dominant operator. Now gravity and echo were nearly skipped — too speculative, too untested. P16 said: no. Both immediately top 7. The principle that saved or-opt saved two sensors.
The 4-level architecture applies the same MDL principle at every level. From move selection to sensor selection — one axiom, zero hardcoded choices.
2-opt, or-opt-1/2/3, double-bridge. The operators. Executed by workers.
Sensor + law + parameters govern search intensity. --mdl-levels 1 = fixed configuration. LZ_binary + BB is the default. Arena tested 900+ alternatives at this level.
MDL automatically selects the best sensor + law combination from the candidate pool. --mdl-levels 2. No human picks. L_total decides. Same axiom, one level up.
First auto-prunes the candidate pool (removes configurations that are provably worse), then runs L2. --mdl-levels 3. The system builds its own arena, runs it, and chooses. One axiom. Zero hardcoded choices.
9 instances. 9 different winners. One axiom. Auto L2 ran on all 9 official instances from wi29 (n=29) to nu3496 (n=3496). Each got a 15-second probe budget. MDL selected the sensor + law combination with the lowest total description length. No human picked. The result surprised us.
| Instance | n | MDL Winner — Sensor | Law | Gap% | Type |
|---|---|---|---|---|---|
| wi29 | 29 | LZ_binary | BB | 0.000% | Baseline |
| dj38 | 38 | LZ_binary | BB | 0.000% | Baseline |
| qa194 | 194 | LZ_binary | BB | 0.460% | Baseline * |
| uy734 | 734 | LZ_per_op | EMA | 2.357% | Info (BD+LLM) |
| zi929 | 929 | LZ_tour | ADSR | 1.908% | Info (BD+LLM) |
| lu980 | 980 | gravity | CDPID | 2.434% | GEO ★ BD |
| rw1621 | 1621 | fold_dist | BB | 3.516% | GEO ★ BD |
| mu1979 | 1979 | blend: LZbin×0.7+LZtour×0.3 | BB | 5.782% | Hybrid |
| nu3496 | 3496 | blend: triC×0.5+LZtour×0.5 | ADSR | 7.152% | Hybrid geo+info ★ |
* qa194: circum_r, tri_compact, echo, fold_dist ALL found exact optimal. MDL stayed with baseline (first match). nu3496: MDL ITSELF discovered that the largest instance needs BOTH geometric and info-theoretic — a 50/50 blend of tri_compact and LZ_tour. Nobody proposed this. The axiom found it.
We expected a monotone crossover: geometric sensors good on small, info-theoretic on large. The data shows a five-phase pattern. Phase 3 (geometric return) and Phase 5 (geo+info hybrid) were not predicted by anyone.
Nobody predicted Phase 3 (geometric return at n≈1000) or Phase 5 (geo+info hybrid at n≈3500). The system found what we could not guess. This is P17 in action: never hardcode what the system can learn.
At very large n with a limited probe budget (15 seconds per evaluation), info-theoretic sensors cannot converge — they need history to build their signal. Geometric sensors provide immediate signal from the first move: measure a triangle, compute a circumradius, count spatial misses. No warmup. No convergence time. At scale, instant signal beats rich-but-slow signal. Then at n≈3500, both are needed.
This is a new empirical finding. The published TSP literature does not contain cross-instance sensor ranking studies because adaptive governance sensors for TSP do not exist in the literature — they are an AIM³ Lab invention.
| Instance | n | Moves/sec | Relative |
|---|---|---|---|
| wi29 | 29 | 29,002 | 1.00× |
| dj38 | 38 | 18,274 | 0.63× |
| qa194 | 194 | 589 | 0.020× |
| uy734 | 734 | 16 | 0.0006× |
| nu3496 | 3496 | 1 | 0.00003× |
Scaling confirmed: O(n²). This is the 2-opt bottleneck. Candidate lists (v2.7, 350× fewer inner checks) and GPU acceleration will flatten this curve — the same sensors, orders of magnitude more cycles per second.
The ranking inversion happens because geometric sensors lose coverage at large n. The fix: don't change the sensor — change the problem size. Partition nu3496 (3496 cities) into clusters of ~200. Each cluster is in the geometric sweet spot. Geometric sensors find exact optimal at n=194. Let them do what they're good at — repeatedly, in parallel.
At n=194 → geometric sensors find exact optimal. At n=734 → geometric sensors fall to #14-16. The transition happens around n≈300. Solution: always stay below 300. Partition 3496 into ~17 clusters of ~200. Each cluster gets its own geometric sensor. The global tour is stitched from local optima.
Partitioner divides n cities into clusters of ~200. 6 strategies tested (incl. terrain). MDL scores each: L_desc(partitioner) = description cost of the partitioning choice.
Each cluster solved independently with a geometric sensor (tri_compact, circum_r, echo, or gravity). 70% of budget. Each cluster is in the sweet spot — geometric sensors do their best work.
Connect cluster tours via boundary cities. Greedy nearest-boundary-city connection between adjacent clusters. 20% of budget. The hard part — inter-cluster routing determines global quality.
Full-instance 2-opt on the stitched tour. Fixes seam artifacts from stitching. 10% of budget.
L_total = L_desc(partitioner) + L_desc(local sensor+law) + L_residual(gap%). MDL picks the winner from 30 variants. Same axiom, different scale.
The existing TSP literature uses fixed partitioning: POPMUSIC (Taillard & Voss, 2002) uses geographic proximity, LKH-3 (Helsgaun) uses fixed-size neighborhoods. Both use one method, hardcoded. Our approach differs in three ways:
Four partitioning methods are new inventions. FlipPartitioner (projection onto 4 axes + gap detection), GravityPartitioner (Barnes-Hut gravitational tree), MagnetPartitioner (opposing magnetic fields with recursive quadrant splits), and TerrainPartitioner (city density as elevation proxy) do not appear in the TSP literature. They are cross-domain transfers. BD originals.
All four novel methods are GPU-native by design. MagnetPartitioner is O(n) with one CUDA kernel. FlipPartitioner needs 4 sorts. GravityPartitioner is a quadtree. TerrainPartitioner is a density kernel. POPMUSIC and LKH-3 are CPU-only.
Six methods compete under MDL. No human picks the partitioner. This is the first system where the partitioning strategy itself is subject to automatic selection by a formal criterion.
Project all cities onto 4 axes (0°, 45°, 90°, 135°). Find gaps in each sorted projection — gaps reveal natural cluster boundaries. Cities that fall in the same group on all 4 axes definitely belong together.
Cross-domain origin: F4M chess strategy — flip the board, pieces fall to one side. Cities "fall" to their natural cluster under the right projection angle.
Compute Hilbert space-filling curve index for each city. Sort by index. Cut into equal segments. The Hilbert curve preserves spatial locality — nearby cities get nearby indices — so segments are naturally spatial clusters.
Cheapest to describe (1 bit). Best locality preservation per description cost. Strong baseline for all others to beat.
Barnes-Hut recursive spatial decomposition. Recursively divide the bounding box into quadrants. Stop when each quadrant has ≤ target_size cities. Dense regions get smaller cells, sparse regions get larger cells — adapts to the actual data distribution.
Same Barnes-Hut insight as the gravity sensor. The partitioner and the sensor share a common cross-domain origin.
Place opposing magnets on bounding box corners. Each city is pulled to its nearest corner. Recurse: each quadrant split again. O(n) per level — no sorting needed. On GPU: one kernel, embarrassingly parallel.
Cross-domain origin: F4M magnets, March 20 2026. The fastest partitioner. GPU-native. MDL may favor it purely on description cost.
Pick K random seed cities (K = n/target_size). Assign each city to its nearest seed — Voronoi cells = clusters. Run Lloyd's algorithm 3 iterations to refine centroids. The most expensive to describe (random seeds = 3 bits) but produces the most balanced clusters.
MDL naturally penalizes the random seed choice. VoronoiPartitioner must produce significantly better tours to beat MagnetPartitioner's 1-bit description cost. A fair fight — the data decides.
City density as elevation proxy. Dense city regions become "mountains" — natural cluster boundaries. Sparse regions are "valleys" — cities there belong together. Partition follows the topographic structure of the problem.
terrain = 1.284% on uy734 (overloaded CPU test, 3 simultaneous runs). On a clean run expected to beat hilbert. BD original, March 21 2026.
Hilbert wins in clean-run conditions. Terrain result from overloaded CPU — expected to beat hilbert on clean run. GravityPart and Flip outperform the literature-standard Voronoi. All three novel GPU-native partitioners beat Voronoi. MDL correctly penalized Magnet on uy734 (regular splits don't match irregular geographic clusters). ScaleSplitPartitioner (MSTD) breaks the record entirely.
| Partitioner | Best Gap% | Origin | GPU | Note |
|---|---|---|---|---|
| scale_split (MSTD) ★NEW | 0.683% | BD ★ Novel | ✗ | MSTD record. 4-component architecture. −0.484 pp over flat best. |
| tour_enriched (TEP) ★NEW | 0.000% | MladiC ★ | ✗ | qa194 exact optimal. Solver teaches pyramid. Deep uy734 run in progress. |
| terrain ★NEW | 1.284% | BD ★ Novel | ✓ | Overloaded CPU test. Expected to beat hilbert on clean run. |
| hilbert | 1.224% | LIT+BD | ✗ | Clean run. Previous hierarchical best. |
| gravity_part | 1.259% | BD ★ Novel | ✓ | Barnes-Hut adaptive decomp. |
| flip | 1.269% | BD ★ Novel (F4M) | ✓ | 4-axis projection gaps. |
| voronoi | 1.547% | BD (K-means variant) | ✗ | Most expensive MDL description cost. |
| magnet | 1.929% | BD ★ Novel (F4M) | ✓ | O(n) GPU kernel. Underperformed on irregular uy734. |
Beats all previous partitioners by >0.5 pp. The improvement came from architecture alone — same MDL kernel, same operators, same Rust backend. The ScaleSplit partitioner is the MSTD entry point: it divides the city space at multiple zoom levels so each scale's "hard parts" get dedicated search time. Combined with scale-aware stagnation, scale coherence sensor, and hysteresis double-bridge, the full MSTD architecture sets the new record.
| # | Variant | Gap% | Type |
|---|---|---|---|
| 1 | hilbert + LZ_dual+PI ★ | 1.224% | HIER |
| 2 | hilbert + circum_r+PI | 1.244% | HIER |
| 3 | gravity_part + circum_r+PI | 1.259% | HIER |
| 4 | flip + LZ_dual+PI | 1.269% | HIER |
| 5 | flip + circum_r+PI | 1.390% | HIER |
| 6 | voronoi + LZ_dual+PI | 1.547% | HIER |
| 7 | gravity_part + LZ_dual+PI | 1.638% | HIER |
| 8 | gravity_part + echo+hyst_bb | 1.684% | HIER |
| 9 | gravity_part + gravity+ADSR | 1.920% | HIER |
| 10 | magnet + circum_r+PI | 1.929% | HIER |
| — | Baseline (flat LZ_binary+BB) | 1.958% | FLAT |
| — | Best flat (gravity+ADSR) | 2.082% | FLAT |
tri_compact+BB — the qa194 exact-optimal champion — was the worst sensor config inside hierarchical clusters. The partition already creates "easy" sub-problems. tri_compact's advantage on tiny instances doesn't help when all sensors perform similarly within an already-partitioned cluster.
The hierarchy works not by making geometric sensors dominate (as hypothesized), but by giving any sensor a better starting point: the partition-guided initial tour is much closer to optimal than random nearest-neighbor. LZ_dual+PI inside clusters benefits most from this head-start.
On nu3496 (3496 cities), flat echo+hysteresis_bb (5.929%) beat the best hierarchical variant (flip+tri_compact, 6.120%). Budget dilution: 17 clusters × 18s each is too thin. The bat doesn't need partitioning — its global view is its strength.
Candidate lists (v2.7, 350× faster 2-opt) will change this equation. The same budget covers 350× more search — hierarchical at nu3496 becomes viable when inner loops cost nothing.
Arena v2.6 · --cat hierarchical · --target-cluster-size 200 · uy734 ✓ · 10 partitioners · MSTD record 0.683% · TEP qa194 exact · BD × AI Lab · AIM³ Lab · March 2026
Not quantum computing. Quantum concepts as governance signals. Ewin Tang (2018) extracted quantum algorithm structure and got exponential classical speedup. We did the same for TSP governance: took quantum concepts (tunneling, coherent walks, interference, entanglement, decoherence) and built classical sensors that measure what nobody in TSP has measured before.
Sensor effectiveness is budget-dependent. Short runs favor tunneling (fast barrier escape). Longer runs favor decoherence (basin depth sensing). This is NOT noise — it is the sensor measuring different landscape features at different search depths. The optimal sensor is not fixed — it should switch during the search.
The SwitchSensor mechanism already exists in the arena (v1.4 class). Untested with quantum sensors. This is the next experiment.
| # | Sensor | Best Law | Gap% |
|---|---|---|---|
| 1 | tunneling ★★★ | ratchet | 3.975% |
| 2 | decoherence | ADSR | 4.127% |
| 3 | qwalk | hysteresis_bb | 4.213% |
| # | Sensor | Best Law | Gap% |
|---|---|---|---|
| 1 | decoherence ★★★ | ADSR | 3.283% |
| 2 | qwalk | hysteresis_bb | 3.286% |
| 3 | entanglement | ratchet | 3.644% |
| # | Sensor | Best Law | Gap% | Note |
|---|---|---|---|---|
| 1 | interference ★★★ | ADSR | 0.011% | Near-optimal on 194 cities |
| 2 | decoherence | ratchet | 0.021% | |
| 3 | entanglement | ADSR | 0.086% |
| # | Sensor | Best Law | Gap% | Overall Rank |
|---|---|---|---|---|
| 1 | tunneling ★★★ | PI | 0.922% | #2 OVERALL |
| 2 | qwalk | ratchet | 1.278% | #7 |
| 3 | entanglement | ratchet | 1.461% | #10 |
Short budget → tunneling (fast barrier escape). Medium budget → decoherence + qwalk (basin depth sensing). Long budget → tunneling returns (barrier shape compounds over time). Small instances → interference (phase patterns visible at low n).
Tunneling measures the shape of the barrier between local optima. Between every two improvements, the solver fails multiple times. Tunneling classifies each barrier:
— Many small failures = THIN barrier (keep trying, tunneling possible)
— Few large failures = THICK barrier (stop, kick hard)
The T/H barrier profile is compressed by LZ76. The compression ratio becomes the governance signal. Thin barriers = compressible pattern (keep going). Random thick/thin = incompressible (confused search).
Nobody in TSP measures barrier topology. Every solver measures IF you're stuck. Tunneling measures WHY you're stuck. That distinction is worth 0.0013% — the gap between #2 and #1 on uy734. BD original.
From sampled seed cities, walk along tour edges and measure how far spatially the walk reaches. Good tour = walk stays local (low spread). Bad tour = walk jumps around (high spread). GPU-parallelizable: each seed walk is independent. BD original.
Soft K-means with Boltzmann temperature annealing. Combined with tunneling sensor: 1.198% — beating the Hilbert partitioner (1.224%) that was the previous hierarchical champion. Quantum sensing + quantum partitioning = better than either classical alternative.
MSTD partitions the city space at multiple geometric scales, solves each independently, and stitches solutions together. Different parts of the same problem are hard at different zoom levels. The solver now operates at every scale simultaneously. BD original, March 22, 2026.
City space decomposed into a hierarchy of geometric partitions at exponentially coarser zoom levels. Same MDL kernel operates at every level.
Stagnation detection calibrated per zoom level. What counts as "stuck" at coarse scale differs from fine scale. DCC adapts its sensitivity to the current resolution.
Detects cross-scale structure — signals when a pattern visible at one zoom level implies something at another. Feeds the governor zoom-level context, not just the tour-improvement stream.
Perturbation kicks adapted to the current scale context. Hysteresis prevents oscillation between zoom levels. Fires at the right moment and the right resolution.
Novelty confirmed: no prior work combines all four. Components exist individually in image processing, signal analysis, and TSP literature. The combination applied to MDL-governed TSP search is new. No CS degree required to discover it — a walk and a phone were enough.
Multi-resolution decomposition has existed since the 1980s in image processing. 2-opt (1958). Or-opt (1974). LZ76 (1976). The parts are ancient. The wiring diagram — MDL choosing the zoom level, DCC governing the search at each scale — is the invention. Same story as DCC itself.
MSTD · BD original · March 22, 2026 · uy734 0.683% confirmed · ScaleSplitPartitioner + scale-aware stagnation + scale coherence + hysteresis double-bridge · Arena v2.2 · AIM³ Lab
4 LLMs generated 38 ideas for improving the solver. Every single one proposed the same direction: build the pyramid, then use it to help the solver. One mind reversed the arrow. What if the solver's best tours feed back into the pyramid? Better tours create better pyramids. Better pyramids create better tours. MladiC original, March 22, 2026.
Solver runs, produces tours. Best tours — weighted by quality — enrich the coordinate space used to build the pyramid. Pyramid rebuilt. New pyramid creates new search basins. Solver explores them. Better tours emerge. Cycle continues.
The pyramid is not static scaffolding. It learns from what the solver discovers.
Same sensor, same law, same budget as flat variant (0.385% gap):
Basin selection: coarser basins (L6, L8) hit exact optimal. Finer basins do not. Starting with less precision gives more room to search. The system that overspecifies its starting point gets trapped. Overspecification is the enemy of discovery.
Every LLM proposed: pyramid → solver. One mind asked: what if solver → pyramid? This is the same architectural reversal as the GS filter in F4M: evaluation → filter, not filter → evaluation. Two independent domains. Same inversion. Same person saw it both times. Cross-domain validation of a structural principle.
TEP · MladiC original · March 22, 2026 · qa194 0.000% confirmed · TourEnrichedPartitioner · weighted affinity · β monotonic · basin L6/L8 · Arena v2.6 · AIM³ Lab
The founding insight: the shortest route is the most compressible. Confirmed in 2024 (Spearman ρ = +0.80). Now in 2026: does the best tour's journey through the pyramid hierarchy also compress better? Two encodings, both tested. MladiC original, March 23, 2026.
Each step of the tour tagged with which pyramid level it visits. Sequence of level tags compressed with LZ76. Better tours produce more structured, more compressible level sequences.
uy734: level encoding wins on this instance.
Each step encoded as the distance jumped between pyramid levels. Sequence of jumps compressed with LZ76. Better tours jump more purposefully — more compressible deltas.
qa194: 0.011% gap — far ahead of scale_coherence at ≈0.38%. Strictly better signal on this instance.
Both sensors needed. MDL selects per instance — as always. Neither encoding dominates universally. The sensor discovers which zoom-level structure the instance has, empirically.
2024: "Isn't the shortest route the most compressible?" → confirmed (ρ = +0.80).
2026: "Isn't the best tour's journey through scale space the most compressible?" → confirmed.
The hypothesis is fractal. It works at every level we test it.
Scale Walk · MladiC original · March 23, 2026 · scale_walk_delta 0.011% on qa194 · instance-dependent encoding confirmed · Arena v2.6 · AIM³ Lab
Every existing GPU + TSP paper parallelizes the operator layer (2-opt evaluations). Nobody parallelizes the sensor layer. Geometric sensors are embarrassingly parallel — each triple is independent. On GPU: ∼1000× faster sensor evaluation = 1000× more governance cycles per second.
| GPU rank | Sensor | uy734 Gap | Author | What it measures (GPU angle) |
|---|---|---|---|---|
| #1 | gravity ★NEW | 1.184% | BD | Segment centroids. Each segment independent. Barnes-Hut on GPU. |
| #2 | echo ★NEW | 1.245% | BD | K-nearest neighbor misses. KNN is GPU-native. Embarrassingly parallel. |
| #3 | circum_r | 1.296% | BD | Circumradius per triple. Each triple independent. Pure FP math. |
| #4 | southwest | 1.356% | BD | Direction of each step. One atan2 per edge. Trivially parallel. |
| #5 | fold_dist | 1.460% | BD | K random projections. Matrix multiply on GPU = microseconds. |
| #6 | voronoi_adj | 1.531% | BD+LLM | Pairwise adjacency check. Embarrassingly parallel. |
| #7 | tri_area | 1.620% | BD | Triangle area per triple. 3 FP operations, independent. |
| #8 | tri_compact | 1.629% | BD | Area/bbox ratio per triple. Independent per city triple. |
| #9 | qwalk quantum | 1.278% | BD | Quantum walk spread. Each seed walk independent. Embarrassingly parallel. |
| #10 | entanglement quantum | 1.461% | C | Pairwise improvement correlation. Independent pairs. GPU natural. |
| #11 | decoherence quantum | 1.717% | C | Basin coherence loss rate. Window-local computation. Parallelizable. |
On CPU: LZ_dual + PI wins (sequential, information-theoretic).
On GPU: gravity + [law] or echo + [law] wins (parallel, geometric).
The hardware changes which algorithm is optimal. The optimal description depends not just on the data but on the machine that processes it.
All 8 classic GPU-parallel sensors are BD originals. Three additional quantum sensors (qwalk, entanglement, decoherence) are also GPU-parallelizable, bringing the total to 11 GPU-native sensors. The non-CS human who proposed them from bed at 1 AM wrote the fastest governance signals for GPU hardware — not because they are theoretically superior, but because they are architecturally parallel.
Everything tested under identical conditions (uy734, 150s per variant, fleet ×14, Rust backend). MDL decides. Three quantum sensors in the top 10. Tunneling within 0.0013% of the champion. Scale Walk sensors confirmed on qa194 — deep uy734 results pending.
| # | Sensor | Law | Gap% | Author | GPU | Type |
|---|---|---|---|---|---|---|
| 1 | LZ_dual | PI | 0.920% | BD+LLM | ✗ | Info |
| 2 | tunneling ★ Quantum | PI | 0.922% | BD | ✗ | Quantum ★ |
| 3 | xfer_entropy | PI | 0.971% | LLM | ✗ | Info |
| 4 | gravity | ADSR | 1.184% | BD | ✓ | Geo |
| 5 | zstd_ratio | ratchet | 1.198% | LLM | ✗ | Info |
| 6 | CUSUM | hyst_bb | 1.215% | LIT | ✗ | Info |
| 7 | echo | hyst_bb | 1.245% | BD | ✓ | Geo |
| 8 | qwalk ★ Quantum | ratchet | 1.278% | BD | ✓ | Quantum ★ |
| 9 | circum_r | PI | 1.296% | BD | ✓ | Geo |
| 10 | entanglement ★ Quantum | ratchet | 1.461% | C | ✓ | Quantum ★ |
| … | … | … | … | … | … | … |
| 15 | interference Quantum | BB | 1.586% | BD | ✗ | Quantum |
| 18 | decoherence Quantum | PI | 1.717% | C | ✓ | Quantum |
| — | scale_walk_delta ★NEW | PI | 0.011% (qa194) | MladiC | ✗ | Scale |
| — | scale_walk_lz ★NEW | PI | testing | MladiC | ✗ | Scale |
Tunneling within 0.0013% of the champion. The quantum lens works — not as computation, but as a way of seeing the search landscape. Five sensors built from physics concepts discovered performance structures that 25+ years of TSP literature missed.
Scale Walk sensors (MladiC, March 23): scale_walk_delta = 0.011% on qa194 — strictly better than scale_coherence at ≈0.38%. Instance-dependent sensor selection confirmed again. The MDL axiom holds at every new level of abstraction we test it on.
NEW: Sensor effectiveness is BUDGET-DEPENDENT. Short runs favor tunneling (fast escape). Longer runs favor decoherence (basin mapping). This implies the optimal sensor is not fixed — it should switch during the search. The SwitchSensor mechanism already exists in the arena. blend:tunneling+decoherence is the next test.
uy734 · 150s fleet ×14 · Rust 2-opt backend · Arena v2.6 · Scale Walk sensors: qa194 confirmed, uy734 pending deep run · BD × AI Lab · AIM³ Lab · March 2026
Multi-Scale TSP Decomposition confirmed. 4-component architecture (scale pyramid + scale-aware stagnation + scale coherence + hysteresis double-bridge). 0.484 pp improvement over flat best from architecture alone. BD original, March 22.
Tour-Enriched Pyramid: solver teaches pyramid. All three variants (TEP3_beta4.0, basin_L6, basin_L8) hit exact optimal (0.000%). Enrichment proven: same sensor, same law, same budget as flat (0.385%). MladiC original, March 22.
scale_walk_delta: 0.011% on qa194 (vs scale_coherence ≈0.38%). scale_walk_lz: wins on uy734. Founding hypothesis confirmed recursively on pyramid hierarchy. MladiC original, March 23.
Five-phase crossover confirmed: geometric sensors return at n>900, then hybrid at n>2000, then geo+info blend at n>3500. MDL chose a different sensor+law for every instance. No single winner at all scales.
TunnelingSensor: 0.922% — within 0.0013% of champion. All 5 quantum sensors beat baseline. Budget-dependent sensor ranking discovered: tunneling wins short, decoherence wins medium.
--cat tep · budget×120 · 8 workers · weighted affinity · Arena v2.6. Key test: does TEP enrichment improve uy734 at long budget? Is beta monotonic as on qa194? Basin selection: does coarser = better hold?
--cat all · qa194 + uy734 · budget×1.0 · 6 workers. New baseline for all sensors including scale walk. First competitive uy734 ranking for the v2.6 additions.
3496 cities, Nicaragua, current best 1.241% at t≈61h. Three levers for sub-1%: MSTD partitioning, TEP enrichment, and 3D Zoom (elevation as third coordinate — Nicaragua has mountains). Each independent. All three together could be decisive.
Boundary sensor, scale perturbation, Hilbert pyramid, 2-axis DCC — top convergent ideas from 4 LLMs. GPU prototype: 11 sensors on CUDA, ~1000× more governance cycles per second. The eyes get faster.
Five-phase crossover + tunneling sensor (#2 overall) + budget-dependent ranking + hierarchical DCC + MSTD + TEP reversed arrow. Six publishable findings in one system. The formal bridge to P vs NP is the longer horizon.