Active formal frontier arena

Prime Gaps Through the 8Z / DCC Lens

Prime gaps as a measurable information trace: from LZ76 compression signal to wheel/Markov controls, v0.3 baseline diagnostics, v0.4 Markov2-floor generator results, v0.5 smoke ablation, v0.6 carrier isolation, v0.7 carrier-stress results, invariant candidates, and a three-track RH + MDLxDCC roadmap.
Author: Bojan Dobrečevič (BD), with AI-assisted synthesis (GPT, Claude, DeepSeek)
Version: v0.12 RH Arena v0.7 carrier-stress update · Date: May 12, 2026
Status: Preliminary empirical — E1/E2/SEA analyzed, v0.3/v0.4/v0.5/v0.6 preserved, v0.7 20k→150k→500k→1M ladder completed
Project context: 8Z / MDL / DCC cross-domain research program
Status tags: ACTIVE · Final E2 / SEA · Inverse generator arena · v0.4 1M analyzed · Markov2 floor breached · v0.5 smoke complete · v0.6 carrier isolated · v0.7 carrier stressed
Working status: E1, E2/SEA, v0.3, v0.4, v0.5, v0.6, and the v0.7 carrier-stress ladder are now analyzed. The v0.7 ladder ran 20k smoke, 150k validation, 500k bridge, and 1M focused carrier stages with 4/4 workers. The 1M run completed 980 window tasks, 1,323 generator score tasks, and 6,615 generator replicates with 0 leakage-guard failures. The sensor layer still says: raw gap survives shuffle, wheel6, Markov1, and Cramér-like controls, while Markov2 remains the hard raw-gap LZ-control boundary. The generator layer still beats Markov2, but v0.7 changes the carrier diagnosis: clock_loglog_markov2 wins raw gap in every tested split at 150k, 500k, and 1M. The previous density signal is better read as a scale/clock transition signal, not as isolated prime-specific density and not as an RH proof.
v0.7 ladder: Clean. 20k → 150k → 500k → 1M · 4 workers · 980 1M window tasks · 6,615 1M generator replicates · 0 guard failures
Sensor boundary: Markov2. 1M raw gap vs Markov2: Δ +45.13 · z +2.65 · only 5% negative windows
1M raw main: Loglog clock. clock_loglog_markov2 beats Markov2 on holdout (Δholdout −8,784 on the main split)
Density placebo: Not isolated. True, lagged, shuffled, and reversed density labels cluster; main split is density/clock tie by margin
Residue track: VarMarkov. gap_mod6 winner remains variable_markov; gap_mod30 winner remains markov3
Boundary: No proof. Empirical signal/generator/law-sheet candidates only; formal bridge still future work

Abstract

This working paper recasts the previous speculative Riemann Hypothesis brainstorming note as a smaller, testable research program. The empirical question is: Do ordered prime gaps contain measurable compression structure beyond their marginal distribution?

Using the first 20,000 and 100,000 primes, we first compared LZ76 complexity of real prime-gap order against shuffled controls preserving the same multiset. At 100,000 primes, the raw gap encoding produced delta −167.96 and mean z-score −32.38 against shuffled controls.

The final E2/SEA batch now extends the test ladder to stronger controls and larger ranges. It includes 15 runs, 60 summary rows, and 1,960 window rows, covering Markov-preserving, wheel-aware, block-shuffle, and shuffle controls across 100k, 500k, 1M, and 2M prime ranges.

Final SEA finding: the broad E1 shuffle signal is partly explained by local/arithmetic structure, but the raw gap signal survives wheel-aware and Markov-preserving controls from 100k through 2M primes. The 500k bridge is positive, the 1M shuffle baseline is now present, and the strongest line remains the 2M Markov-preserving raw gap test: delta −142.34, z −8.76, with 50/50 windows negative (100 trials at 2M; limited p-value resolution; see Section 10.5).

The v0.4 generator-floor run now tests the next question: can any compact generator beat the Markov2 baseline on holdout? The answer is empirically yes at 1M scale. The run completed 1,620 window tasks and 72 generator tasks. At the sensor/control layer, raw gap still fails against Markov2. At the generator layer, however, density_markov2 wins raw gap/gap_div2, and all 9/9 tested encodings have a non-Markov2 winner beating the Markov2 holdout floor.

The v0.7 carrier-stress ladder now tests the sharper placebo question opened by v0.6: is the raw-gap generator gain really prime-density information, or a more generic monotone scale/clock state? The answer is empirically sharper: at 1M, clock_loglog_markov2 wins raw gap in every split, with Δholdout vs Markov2 −8,784.21 on the main split. density_wrong_scale_markov2 and density-DCC variants remain strong, but shuffled/lagged/reversed density labels cluster closely enough that exact prime-coordinate density is not isolated as the unique cause.

This does not prove RH. It shows a robust, order-sensitive prime-gap compression signal, a repeated empirical breach of the Markov2 generator floor, and now a stronger negative result against pure-density interpretation. The next phase is a corrected law-sheet build plus 2M/5M replication of the top carrier laws.

1. Purpose

The original brainstorming paper mixed three levels: empirical observations, reasoned interpretations, and metaphysical speculation. This version separates them. The empirical core is now simple: prime gaps are treated as a sequence, the real order is compared against null models, and MDL/LZ76 decides whether additional structure exists.

2. Background: Why Prime Gaps?

Prime gaps \(g_n = p_{n+1} - p_n\) are not independent. Their distribution changes with scale, and primes obey arithmetic constraints. The sharper question is: After preserving basic statistics, does the ordered sequence still compress better than appropriate controls? That is exactly the kind of question the 8Z/MDL/DCC program is built to ask.

3. DCC / Edge-of-Chaos Motivation

DCC balances between seizure (excessive order) and noise (excessive disorder). The original speculative analogy placed \(\text{Re}(s) = 1/2\) as the edge-of-chaos line. This remains a conceptual analogy, not a proof. The empirical test only asks whether prime gaps show measurable structure under compression tools.

4. Important Boundary: Not an RH Proof

This paper makes no claim to prove the Riemann Hypothesis. It only asserts that ordered prime gaps are more compressible than shuffled controls preserving the same gap distribution. Any bridge to RH remains speculative future motivation.

5. E1 Experiment: Prime-Gap LZ76 Against Shuffled Controls

We generated the first N primes, computed gaps, and applied several encodings (gap, delta, abs_delta, mod6, bucket_log2). Each real ordered window was compared against shuffled controls. The main statistic is \(\Delta_{LZ} = \mathrm{LZ76}(\text{real}) - \operatorname{mean}(\mathrm{LZ76}(\text{control}))\); a negative value indicates the real sequence is more compressible.
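As a rough illustration of the statistic above, the following sketch computes an LZ76 phrase count and compares the real gap order against shuffled controls. It is not the project's script: the function names (lz76_phrases, first_primes, delta_lz), the small problem size, and the z-score taken from the spread of the control values are assumptions made for this example.

```python
import random
import statistics


def lz76_phrases(tokens):
    """Count phrases in the LZ76 (exhaustive-history) parsing of a token sequence."""
    # Map each distinct token to one character so str.find can do the matching.
    alphabet = {t: chr(k) for k, t in enumerate(dict.fromkeys(tokens))}
    text = "".join(alphabet[t] for t in tokens)
    n, i, count = len(text), 0, 0
    while i < n:
        length = 1
        # Grow the phrase while it already occurs starting at an earlier
        # position (the earlier copy is allowed to overlap the phrase).
        while i + length <= n and text.find(text[i:i + length], 0, i + length - 1) != -1:
            length += 1
        count += 1
        i += length
    return count


def first_primes(n):
    """Return the first n primes using a sieve whose limit doubles until large enough."""
    limit = 64
    while True:
        sieve = bytearray([1]) * (limit + 1)
        sieve[0:2] = b"\x00\x00"
        for p in range(2, int(limit ** 0.5) + 1):
            if sieve[p]:
                sieve[p * p::p] = bytearray(len(range(p * p, limit + 1, p)))
        primes = [k for k in range(limit + 1) if sieve[k]]
        if len(primes) >= n:
            return primes[:n]
        limit *= 2


def delta_lz(real, trials=30, seed=0):
    """Return (Delta_LZ, z) for one window against full-shuffle controls."""
    rng = random.Random(seed)
    real_c = lz76_phrases(real)
    controls = []
    for _ in range(trials):
        shuffled = list(real)
        rng.shuffle(shuffled)  # same multiset, order destroyed
        controls.append(lz76_phrases(shuffled))
    mean_c = statistics.mean(controls)
    sd_c = statistics.stdev(controls)
    z = (real_c - mean_c) / sd_c if sd_c > 0 else float("nan")
    return real_c - mean_c, z


# Small demonstration run (far smaller than the 20,000-prime E1 run).
primes = first_primes(2000)
gaps = [q - p for p, q in zip(primes, primes[1:])]
delta, z = delta_lz(gaps, trials=10)
print(f"Delta_LZ = {delta:.2f}, z = {z:.2f}")
```

The quadratic-time matcher is chosen for readability; a run at the scales reported here would need a faster LZ76 implementation.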

6. E1 Results

6.1 First run: 20,000 primes

Configuration: 20,000 primes, 30 trials, window 5,000. Summary:

| Encoding | Real LZ76 | Shuffled LZ76 | Delta | Z-score | Interpretation |
| --- | --- | --- | --- | --- | --- |
| gap | 1486.75 | 1579.31 | −92.56 | −22.85 | strong |
| gap_div2 | 1486.75 | 1579.14 | −92.39 | −23.86 | strong |
| delta | 1753.00 | 1820.93 | −67.93 | −16.72 | strong, but needs special control |
| abs_delta | 1520.00 | 1537.62 | −17.62 | −4.44 | moderate/strong |
| mod6 | 714.75 | 882.10 | −167.35 | −88.43 | sanity check; partly expected |
| bucket_log2 | 1045.25 | 1054.22 | −8.97 | −3.92 | weak but present |

6.2 Stronger run: 100,000 primes

Configuration: 100,000 primes, 50 trials, window 10,000. Summary:

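| Encoding | Real LZ76 | Shuffled LZ76 | Delta | Z-score | Interpretation |
| --- | --- | --- | --- | --- | --- |
| gap | 2898.80 | 3066.76 | −167.96 | −32.38 | very strong |
| gap_div2 | 2898.80 | 3067.18 | −168.38 | −32.83 | very strong |
| delta | 3395.10 | 3532.89 | −137.79 | −23.90 | very strong, needs control |
| abs_delta | 2959.50 | 3002.05 | −42.55 | −8.25 | strong |
| mod6 | 1281.40 | 1603.32 | −321.92 | −127.43 | sanity check |
| bucket_log2 | 1974.80 | 1984.15 | −9.35 | −3.01 | weak but present |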

Raw prime gaps are about 5.5% more compressible in real order than in shuffled order.
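The 5.5% figure follows from the raw gap row of the 100,000-prime table, taking the shuffled mean as the reference:

\[
\frac{|\Delta_{LZ}|}{\operatorname{mean}(\mathrm{LZ76}(\text{control}))} = \frac{167.96}{3066.76} \approx 0.0548
\]

The same ratio for the 20,000-prime run is \(92.56 / 1579.31 \approx 0.0586\), so the relative advantage is of similar size at both depths.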

7. Interpretation of E1

Supported: The order of prime gaps carries structure beyond marginal distribution. Not yet shown: that it survives Markov-preserving controls, that it connects to zeta zeros, or that AC/Zero Framework is correct. mod6 is a sanity check; delta needs caution due to lag-1 artifacts.

8. E2: Stronger Null Controls

9. E2 Acceptance Criteria

Minimal success: raw gap outperforms full shuffle, block shuffle, and wheel-aware. Strong success: outperforms Markov surrogates. Very strong: survives all controls and shows scaling growth.

10. Final E2 / SEA Results: Stronger Controls and Scaling

The E2/SEA batch tested whether the initial shuffle signal survives harder null models and larger prime ranges. SEA means Scaling Experiment Arena. The final uploaded batch is complete: 15 runs, 60 summary rows, and 1,960 window rows.

The final SEA claim is no longer merely “prime gaps beat shuffle.” It is narrower and stronger: raw prime-gap order contains compression-detectable sequential structure not fully explained by marginal gap distribution, mod-6 wheel structure, or first-order Markov transitions, with positive scaling evidence through 2M primes.

10.1 Completed SEA final runs

The final SEA package includes all planned continuation runs:

This completes the missing 100k wheel replication, the 500k bridge, and the 1M shuffle baseline that were absent from the partial version.

10.2 Final raw gap summary

| Run | Control | Delta | Z-score | Window result | Median p | Reading |
| --- | --- | --- | --- | --- | --- | --- |
| 100k Markov t2000 | markov | −32.64 | −3.69 | 10/10 negative | 0.00175 | survives first-order transitions |
| 100k wheel6 t2000 | wheel6 | −19.83 | −3.78 | 10/10 negative | 0.00075 | survives mod-6 wheel |
| 100k block50 t1000 | block50 | −14.08 | −2.58 | 10/10 negative | 0.01499 | survives local block shuffle |
| 100k block100 t1000 | block100 | −8.70 | −1.63 | 10/10 negative | 0.09091 | marginal / local-scale boundary |
| 500k Markov t700 | markov | −65.13 | −5.41 | 25/25 negative | 0.00143 | survives first-order transitions |
| 500k wheel6 t700 | wheel6 | −33.45 | −4.75 | 25/25 negative | 0.00143 | survives mod-6 wheel |
| 500k block50 t700 | block50 | −21.95 | −3.04 | 25/25 negative | 0.00428 | survives local block shuffle |
| 500k block100 t700 | block100 | −11.48 | −1.64 | 25/25 negative | 0.08845 | marginal / local-scale boundary |
| 1M shuffle t300 | shuffle | −301.62 | −41.62 | 50/50 negative | 0.00332 | E1-style baseline at 1M |
| 1M wheel6 t300 | wheel6 | −31.02 | −4.39 | 50/50 negative | 0.00332 | survives mod-6 wheel |
| 1M Markov t300 | markov | −58.26 | −4.84 | 50/50 negative | 0.00332 | survives first-order transitions |
| 1M block50 t300 | block50 | −21.10 | −2.88 | 50/50 negative | 0.00664 | survives local block shuffle |
| 1M block100 t300 | block100 | −11.13 | −1.55 | 49/50 negative | 0.09801 | marginal / local-scale boundary |
| 2M Markov t100 | markov | −142.34 | −8.76 | 50/50 negative | 0.00990 | survives first-order transitions |
| 2M wheel6 t100 | wheel6 | −64.42 | −7.01 | 50/50 negative | 0.00990 | survives mod-6 wheel |

10.3 Interpretation of the final SEA result

Markov survival is now the central result. Markov-preserving controls keep first-order gap transitions, yet raw gap remains more compressible at every tested depth: 100k, 500k, 1M, and 2M.

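| Depth | Delta | Z-score | Negative windows | Median p |
| --- | --- | --- | --- | --- |
| 100k Markov t2000 | −32.64 | −3.69 | 10/10 | 0.00175 |
| 500k Markov t700 | −65.13 | −5.41 | 25/25 | 0.00143 |
| 1M Markov t300 | −58.26 | −4.84 | 50/50 | 0.00332 |
| 2M Markov t100 | −142.34 | −8.76 | 50/50 | 0.00990 |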
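The text does not spell out how the Markov-preserving surrogates are built. One common construction, assumed here purely for illustration, samples a new sequence from the empirical first-order transition table of the real window, so that transition frequencies are preserved in expectation rather than exactly. The name markov1_surrogate is hypothetical.

```python
import random
from collections import defaultdict


def markov1_surrogate(tokens, rng):
    """Sample a same-length sequence from the empirical first-order transitions."""
    successors = defaultdict(list)
    for a, b in zip(tokens, tokens[1:]):
        successors[a].append(b)  # duplicates keep the empirical frequencies
    out = [tokens[0]]
    while len(out) < len(tokens):
        options = successors.get(out[-1])
        if not options:
            # Dead end: the state only occurred as the last token.
            # Assumption: restart from a token drawn from the whole window.
            options = tokens
        out.append(rng.choice(options))
    return out


rng = random.Random(0)
example = [2, 4, 2, 4, 6, 2, 6, 4, 2]
print(markov1_surrogate(example, rng))
```

An exact alternative would permute the window while keeping every transition count fixed; which variant the reported runs use is not stated in the text.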

Wheel survival is also confirmed. The raw signal survives mod-6-aware controls at 100k, 500k, 1M, and 2M. This means the signal is not merely the trivial 6k ± 1 residue structure.

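| Depth | Delta | Z-score | Negative windows | Median p |
| --- | --- | --- | --- | --- |
| 100k wheel6 t2000 | −19.83 | −3.78 | 10/10 | 0.00075 |
| 500k wheel6 t700 | −33.45 | −4.75 | 25/25 | 0.00143 |
| 1M wheel6 t300 | −31.02 | −4.39 | 50/50 | 0.00332 |
| 2M wheel6 t100 | −64.42 | −7.01 | 50/50 | 0.00990 |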
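A mod-6-aware control can be sketched as a shuffle that only exchanges gaps sharing the same residue mod 6, so the gap-mod-6 sequence (and with it the prime-residue sequence mod 6) is left untouched. This is an assumed construction for illustration; the exact wheel6 surrogate used in the runs is not described in the text, and wheel6_shuffle is a hypothetical name.

```python
import random
from collections import defaultdict


def wheel6_shuffle(gaps, rng):
    """Permute gaps within each (gap mod 6) class, keeping the mod-6 pattern fixed."""
    pools = defaultdict(list)
    for g in gaps:
        pools[g % 6].append(g)
    for pool in pools.values():
        rng.shuffle(pool)
    # Refill each position from the pool that matches its original residue class.
    return [pools[g % 6].pop() for g in gaps]


rng = random.Random(0)
example = [2, 4, 2, 4, 6, 2, 6, 4, 2, 10, 8, 12]
print(wheel6_shuffle(example, rng))
```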

Block controls narrow the scale claim. Block50 remains stable from 100k through 1M. Block100 is consistently weaker and mostly marginal. This suggests that a meaningful part of the signal lives at local-to-medium sequential scales, while the Markov/wheel survival shows the signal is not exhausted by the simplest local arithmetic explanations.

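| Run | Delta | Z-score | Negative windows | Median p |
| --- | --- | --- | --- | --- |
| 100k block50 t1000 | −14.08 | −2.58 | 10/10 | 0.01499 |
| 100k block100 t1000 | −8.70 | −1.63 | 10/10 | 0.09091 |
| 500k block50 t700 | −21.95 | −3.04 | 25/25 | 0.00428 |
| 500k block100 t700 | −11.48 | −1.64 | 25/25 | 0.08845 |
| 1M block50 t300 | −21.10 | −2.88 | 50/50 | 0.00664 |
| 1M block100 t300 | −11.13 | −1.55 | 49/50 | 0.09801 |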
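A block control of this kind can be sketched as follows, assuming the blocks are permuted as whole units so that order inside each block is preserved and only longer-range order is destroyed. That reading is an assumption, though it fits the pattern in the table, where the larger block size leaves a smaller residual delta. The name block_shuffle is hypothetical.

```python
import random


def block_shuffle(tokens, block, rng):
    """Cut the sequence into consecutive blocks and permute the blocks."""
    blocks = [tokens[i:i + block] for i in range(0, len(tokens), block)]
    rng.shuffle(blocks)
    return [t for b in blocks for t in b]


rng = random.Random(0)
example = list(range(12))
print(block_shuffle(example, 4, rng))
```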

Sensor ranking is now clearer. Raw gap is the primary sensor. gap_div2 is effectively the same signal. abs_delta and bucket_log2 are diagnostic only: they weaken, flip, or become near-zero under harder controls.

10.4 Final E2 / SEA thesis

Prime-gap order contains a surviving raw-gap compression signal that is not fully explained by gap distribution, simple wheel residue structure, or first-order Markov transition statistics. The effect is smaller than the initial shuffle signal but more meaningful, and the final SEA batch confirms survival through 100k, 500k, 1M, and 2M prime ranges.

10.5 Caution

The 2M runs currently use only 100 trials, so the permutation p-value resolution is limited. The z-scores and window stability are more informative at this stage. The final result is strong enough to justify the next phase, but not strong enough to claim an RH proof or a final mathematical invariant.
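The resolution limit can be made concrete. Assuming the usual one-sided permutation p-value with a +1 correction (an assumption, but one that matches the reported floors), the smallest attainable value with T trials is 1/(T + 1). The function names below are illustrative.

```python
def permutation_p(real_value, control_values):
    """One-sided p-value: share of controls at least as compressible as the real sequence."""
    hits = sum(1 for c in control_values if c <= real_value)
    return (1 + hits) / (1 + len(control_values))


def p_floor(trials):
    """Smallest p-value reachable with the given number of control trials."""
    return 1 / (trials + 1)


for trials in (100, 300, 700, 2000):
    print(trials, round(p_floor(trials), 5))
```

With 100 trials the floor is about 0.0099, which is the median p reported for both 2M runs, so those p-values cannot separate a strong effect from a very strong one.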

11. Relation to 8Z/DCC Founding Hypothesis

The first E2/SEA results support number theory as a credible ninth-domain candidate for the 8Z/DCC approach. The strongest current claim is not that DCC proves RH, but that 8Z/DCC compression tools detect a nontrivial order-sensitive signal in prime gaps that survives first-pass wheel-aware and Markov-preserving controls up to 2M primes.

11.1 Two-track strategy

This project should not choose prematurely between “prove RH” and “build an independent MDLxDCC theory of prime structure.” Both tracks are useful and should run in parallel:

These are not competing goals. RH is the gold-standard benchmark; the MDLxDCC invariant is the deeper search object.

12. Revised Position on Original RH/AC Hypothesis

The critical line as edge-of-chaos remains speculative motivation only. Verified: E1 signal. Reasoned: compression advantage suggests order-sensitive structure. Speculative: AC/Zero Framework explanations are not evidence.

13. Mathematics, Reality, and the Zero Framework

This section records the philosophical motivation behind the Zero Framework while keeping it separate from the empirical E1/E2 result. The central issue is not whether mathematics is "wrong." Formal mathematics can define many internally consistent worlds. The stricter question is:

When a formal object is used to describe reality or nature, what exactly is the mapping between the symbol and the thing being described?

In pure mathematics, definitions are allowed to be abstract. A structure can be studied because it is consistent, elegant, or fruitful. In applied mathematics and mathematical physics, however, a formal object earns physical meaning only through a disciplined bridge: what it measures, what it predicts, what it preserves, and where the mapping stops.

This distinction matters for the present paper because RH lives inside pure mathematics, while the 8Z/DCC framing asks a more reality-facing question: whether prime-gap dynamics expose an information structure that can be measured, compressed, and related to deeper arithmetic regularities. The empirical compression tests do not decide ontology. They only ask whether a measurable signal exists.

13.1 The two zeros

Standard mathematics treats zero as a single formal object: the additive identity, the result of \(x - x\), and the value attained when a function vanishes. The Zero Framework proposes that ordinary language often mixes two different meanings under the same word:

On this view, a function that equals zero at some point has not "become nothing." It has reached a structured value. A zeta zero is therefore better described as an exact cancellation or destructive-interference point, not an annihilation into non-being.

13.2 Analytic continuation as ontological smuggling

Analytic continuation is an internally valid and powerful mathematical operation. The concern here is not its consistency. The concern is what happens when a result obtained by extension is presented under the same notation as the original object — without making the substitution explicit.

The Dirichlet series \(\zeta(s) = \sum_{n=1}^{\infty} n^{-s}\) converges for \(\text{Re}(s) > 1\). For \(s = 1\) the harmonic series diverges. For \(s = -1\) the sum \(1 + 2 + 3 + \cdots\) diverges to infinity. Yet analytic continuation assigns \(\zeta(-1) = -\frac{1}{12}\). This is a correct statement about the analytically continued function. But the informal phrase \(1 + 2 + 3 + \cdots = -\frac{1}{12}\) silently substitutes one object for another under the same notation. The two agree where the original sum converges. They are not the same thing where it does not.
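The two objects can be written side by side. The partial sums of the original series grow without bound, while the continued function takes its standard value at negative integers, expressed through the Bernoulli numbers \(B_k\):

\[
\sum_{n=1}^{N} n = \frac{N(N+1)}{2} \longrightarrow \infty \quad (N \to \infty),
\qquad
\zeta(-k) = -\frac{B_{k+1}}{k+1} \quad (k \ge 1),
\qquad
\zeta(-1) = -\frac{B_2}{2} = -\frac{1}{2} \cdot \frac{1}{6} = -\frac{1}{12}.
\]

The left expression is a statement about the series; the right one is a statement about the continued function.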

The ontological import is this: when the output of an extended object is presented as if it were the output of the original process, the interpretation can carry a hidden metaphysical claim, namely that the "true value" of a divergent process exists and is reachable by algebraic extension, without stating that claim explicitly or subjecting it to scrutiny. Does this affect internal consistency? No. Does it affect what mathematics describes about reality? That is an open question. This is what we call ontological smuggling: not an error in the proof, but an unacknowledged step in the interpretation.

13.3 Axiom-level vs. interpretation-level critique

There are two versions of the Zero Framework critique, and they should not be confused:

The historical analogy is Lobachevsky. For two millennia, Euclid's parallel postulate appeared self-evident: not a choice but a necessity. Replacing it with a different axiom produced a geometry that was equally consistent, and non-Euclidean geometry later turned out to describe physical spacetime more accurately than Euclidean geometry does. The Zero Framework at the axiom level is asking whether the identification \(0 \equiv \text{"nothing"}\) is a parallel postulate of arithmetic: a convenience that has been mistaken for a necessity. The interpretation-level critique is conservative and already useful. The axiom-level critique is speculative but not without historical precedent.

13.4 Implication for RH specifically

The implication for RH is indirect. RH is a formal statement about the zeros of the analytically continued zeta function. The Zero Framework does not currently change that statement, prove it, or disprove it.

What it may do is sharpen the language around what a zero means. If a zeta zero is treated as a structured cancellation point rather than "nothingness," then the research question becomes cleaner:

What structure forces all nontrivial cancellation points of the zeta function onto the critical line?

That question is still mathematical. The 8Z/DCC contribution, if any, would be to search for measurable information-structure invariants in prime-gap dynamics or prime-counting error that might later connect to the zeta-zero structure.

Current status: interpretation-level critique useful; axiom-level critique unformalized; RH implication speculative. This remains L8 in the interpretation ladder unless a formal invariant is discovered.

14. Corrected Language About Zeta Zeros

Zeta zeros are points of exact cancellation or destructive interference in the analytic structure of the zeta function, not annihilation into ontological nothingness. This correction is an interpretation-level clarification: a function attaining the value zero is not the same as a quantity ceasing to exist.

15. Current Working Thesis

Ordered prime gaps contain measurable compression structure beyond their marginal distribution. After the final E2/SEA batch, the surviving raw-gap signal also exceeds wheel-aware and first-order Markov-preserving controls from 100k through 2M primes, while block tests show that much of the broader signal remains local-to-medium scale.

The strongest current thesis is narrower than the original speculation but stronger than the initial shuffle-only result. v0.6 showed that the Markov2 floor can be beaten on holdout and that the carrier is scale-conditioned transition structure rather than a clean one-term density explanation. v0.7 sharpens this further: a generic loglog position-clock conditioned Markov2 law beats density variants on raw gap across the tested split ladder. The longer-term target is a formal information-theoretic invariant of prime distribution, not merely another route to the existing RH statement.

Operationally, the project now runs on three tracks: Track A seeks an RH-facing formal bridge; Track B seeks a native MDLxDCC invariant describing prime-gap structure directly; Track C isolates the generator carrier that can beat Markov2 without leakage or split artifacts.

16. Next Experiments

The v0.7 ladder is now complete through 1M. The next work should not simply widen the arena. The current bottleneck is formal clarity: distinguish generic scale-clock laws from true prime-density information, then ask whether the surviving law has a clean mathematical description.

17. Preliminary Conclusion

E1 delivered a clear shuffle signal. E2 narrowed and strengthened the result. The final SEA batch now confirms that the broad shuffle advantage is partly local/arithmetic, but the raw gap signal survives wheel-aware and Markov-preserving controls at 100k, 500k, 1M, and 2M.

The current conclusion is therefore:

The prime-gap compression signal is real, smaller under harder controls, and still alive where it matters most: raw gap order survives beyond marginal distribution, simple wheel structure, and first-order transition statistics. The completed SEA run adds a 500k bridge, a 1M shuffle baseline, and stronger scale evidence through 2M primes.

The most stable current signal is not only the aggregate z-score, but the window-level consistency: Markov and wheel controls show negative raw-gap deltas in every tested window at 100k, 500k, 1M, and 2M. The only run in the key raw-gap family with a non-negative window is block100 at 1M, with 49/50 negative windows; block100 is also marginal by z-score at every tested depth (about −1.6).

This is not RH evidence yet, but it is a legitimate 8Z/DCC number-theory signal candidate. The deeper goal is to discover a formal invariant that explains prime-distribution structure; RH is the benchmark, not the only possible prize.

18. Three-Track Roadmap: RH, MDLxDCC Invariant, Generator Arena

The current compression experiments cannot prove the Riemann Hypothesis by themselves. They are empirical signal tests. A proof would require a formal mathematical invariant or theorem that applies to all relevant cases, not only to tested prime ranges.

18.1 Three live tracks

The roadmap now has three live tracks:

| Track | Name | Target | Status |
| --- | --- | --- | --- |
| A | RH-facing formal bridge | Use the empirical signal to search for a formal invariant connected to prime-counting error, zeta-zero structure, explicit formulas, or an RH-equivalent criterion. | Future formal work; not claimed by E1/E2/SEA/Arena. |
| B | MDLxDCC-native prime-structure invariant | Build a new information-theoretic description of prime-gap structure where RH is a benchmark, not the only definition of success. | Motivated by surviving compression signal. |
| C | Inverse generator arena | Search over compact generators, grammars, automata, dictionaries, and multi-scale programs that reproduce prime-gap structure better than null models. | v0.4 achieved the first 1M Markov2-floor breach on generator holdout; next target is scaling, ablation, and formalization. |

This distinction matters. A successful MDLxDCC invariant could be valuable even before it proves RH, and a generator that survives holdout and scaling could become the first concrete object from which a stronger invariant is extracted.

The higher target is not merely “prove RH by compression.” The higher target is to discover a formal invariant or compact generator of prime-distribution structure. RH then becomes one benchmark such an invariant may imply, explain, or sit beside.

18.2 The required bridge

A possible proof path would need the following form:

  1. Use 8Z/DCC/MDL tools to discover a stable invariant in prime-gap dynamics, prime-counting error, or another RH-adjacent arithmetic trace.
  2. Formalize that invariant as a precise mathematical statement.
  3. Prove that the invariant holds for all relevant values, not merely for tested ranges.
  4. Show that the invariant implies RH, is equivalent to RH, strengthens an RH-style distribution bound, or explains a structure that RH does not directly describe.

In short:

empirical signal → generator candidate → formal invariant → distribution theorem / RH implication / new criterion

18.3 Why not just prove RH?

RH matters because it is one of the cleanest known formulations of prime-distribution regularity. It would strongly constrain the error term in the prime number theorem and would affect many related results in analytic number theory.

But a new 8Z/DCC invariant could be valuable even before it proves RH, and potentially more valuable if it explains more than RH alone. For example, it could:

Thus RH is not the only possible victory condition. It is the gold-standard benchmark. The deeper goal is a new invariant that explains prime-distribution structure in a way that can later be connected to RH or to a stronger distribution theorem.

18.4 Prime-counting error route

One possible route is through prime-counting error. RH is deeply connected to how tightly prime-counting functions stay near their expected main terms. A compression-based approach would need to discover a bound or regularity in the prime-gap trace that implies an appropriate bound on prime-counting error.

A rough proof-seed form would be:

If every prime-gap window has bounded normalized MDL excess,
then prime-counting error remains within an RH-compatible bound.

This is not yet a theorem. It is a target shape for future formalization.
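One hedged way to write that target shape is shown below. The symbols \(E(W)\) (normalized MDL excess of a prime-gap window \(W\)) and the constant \(C\) are hypothetical placeholders introduced here, not quantities defined by the experiments. The right-hand side is the classical error bound that is known to be equivalent to RH.

\[
\bigl(\forall W:\ E(W) \le C\bigr) \;\Longrightarrow\; \pi(x) - \operatorname{Li}(x) = O\bigl(\sqrt{x}\,\log x\bigr)
\]

Neither a definition of \(E(W)\) that makes the implication plausible nor a proof of the left-hand side currently exists.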

18.5 Explicit-formula route

The most RH-adjacent route would connect compression structure in prime gaps to the oscillatory terms generated by zeta zeros in explicit formulas. In that direction, the desired contradiction would look like this:

  1. Assume an off-critical-line zero exists.
  2. Show that such a zero forces a detectable oscillatory or compressible signature in prime-counting error or prime-gap dynamics.
  3. Show that a proven MDL/DCC invariant forbids that signature.
  4. Conclude that no off-critical-line zero can exist.

Symbolically:

off-line zero → forbidden oscillation/compression signature → contradiction → RH

18.6 Equivalent-criterion route

Another route is discovery of a new RH-equivalent condition. 8Z/DCC could search for a new boundedness, monotonicity, spectral, or compression condition that appears empirically stable. The mathematical task would then be:

  1. prove the new condition is equivalent to RH or implies RH,
  2. then prove the new condition directly.

This may be more realistic than directly proving statements about zeta zeros from compression data.

18.7 Proof-roadmap phases

| Phase | Goal | Status |
| --- | --- | --- |
| 1 | Detect empirical signal in prime-gap order | completed first pass; E1 positive |
| 2 | Test signal against stronger null controls | final E2/SEA pass positive; deeper tests needed |
| 3 | Extract LZ76 dictionaries, motifs, and multi-scale MDL spectrum | started by RH Arena v0.1/v0.2b |
| 4 | Run inverse generator benchmark with holdout validation | started; Markov2 was the strongest baseline in v0.3 and was first beaten on 1M holdout in v0.4 |
| 5 | Repair controls and scoring after 1M Arena diagnosis | v0.3 completed |
| 6 | Relate surviving generator/invariant to prime-counting error or explicit-formula terms | planned |
| 7 | Extract candidate invariant | future |
| 8 | Prove invariant and distribution/RH implication where possible | future |
| 9 | Formal verification where possible | future |

18.8 Current position

The current strongest honest formulation is:

8Z/DCC may help discover a new RH-adjacent regularity by treating prime gaps as a multi-scale information trace and searching for stable compression or generator invariants that bound or constrain prime-distribution error. RH is the benchmark; the deeper prize is the invariant. The project should pursue all three tracks: RH-facing proof bridge, MDLxDCC-native structure invariant, and inverse generator arena.

This keeps the door open without pretending that E1, E2, SEA, or the current RH Arena runs can prove RH by themselves.

19. Inverse Generator Hypothesis

RH gives a constraint on prime-distribution error. MDLxDCC asks the inverse question:

Can we discover a compact generator, grammar, automaton, dictionary, or multi-scale program that reproduces prime-gap structure better than shuffle, wheel, Markov, or Cramér-like null models?

This is not necessarily a search for a one-line formula for primes. The target is the shortest useful generator of prime-gap structure: an algorithmic or MDL formula that explains more of the real trace than competing null models.

If such a generator survives holdout and scaling, it becomes an invariant candidate. If the invariant later constrains prime-counting error, explicit-formula residuals, or zero-spacing structure, it may become an RH bridge — or a separate MDLxDCC-native path to prime-distribution theory.

19.1 Arena flow

trace → encoding → sensors → controls → generators → MDL score → holdout validation → invariant candidate

The critical change is that controls are no longer only adversaries. They become baselines that generators must beat. A generator that cannot beat Markov, wheel, or Cramér baselines is not yet an invariant candidate, even if it produces visually plausible prime-gap structure.

19.2 Generator classes

| Class | Role | Interpretation |
| --- | --- | --- |
| G0 shuffle | frequency baseline | Destroys order; useful but weak. |
| G1 wheel-only | arithmetic residue baseline | Captures simple modular constraints. |
| G2/G3 Markov | transition baseline | Tests whether signal is mostly local first/second-order dynamics. |
| G4 motif grammar | dictionary/motif baseline | Tests recurring gap-token phrases. |
| G5 LZ dictionary | compressive replay baseline | Tests whether dictionary fragments can generate realistic structure. |
| G6 local-density Cramér | density trend baseline | Tests whether changing prime density explains the trace. |
| G7 hybrid DCC | candidate engine | Combines local density, motif grammar, Markov fallback, and wheel/density fallback under MDL scoring. |

19.3 Acceptance rule

A generator becomes interesting only if it satisfies all of these:

  1. trained only on early data, evaluated on later holdout data,
  2. beats simpler controls under total_mdl_score, not merely raw accuracy,
  3. matches LZ76, entropy, motif spectrum, and next-gap predictive cost within acceptable error,
  4. survives scaling across 100k → 500k → 1M → 2M and beyond,
  5. produces a compact description that can be inspected mathematically.

In this language, the Arena does not “prove RH.” It searches for compact generator/invariant candidates. Proof would be a later certificate.
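The first two conditions of the acceptance rule can be sketched for the Markov baselines: fit transition counts on early data only, then charge the later holdout in predictive bits. This is a minimal sketch under stated assumptions, not the Arena's scorer. The function name, the additive smoothing constant alpha, and the fixed alphabet_size argument are all choices made for this example, and the model-description cost that a full total_mdl_score would include is left out.

```python
import math
from collections import Counter, defaultdict


def markov_holdout_bits(train, test, order, alphabet_size, alpha=0.5):
    """Code length in bits of `test` under an order-`order` Markov model fit on `train`."""
    counts = defaultdict(Counter)
    for i in range(order, len(train)):
        counts[tuple(train[i - order:i])][train[i]] += 1

    bits = 0.0
    # The holdout starts with the last `order` training tokens as context.
    history = list(train[-order:]) if order else []
    for token in test:
        seen = counts.get(tuple(history), Counter())
        # Additive smoothing keeps unseen transitions at nonzero probability.
        p = (seen[token] + alpha) / (sum(seen.values()) + alpha * alphabet_size)
        bits -= math.log2(p)
        if order:
            history = history[1:] + [token]
    return bits


# Toy check: a strictly alternating sequence is cheap for order 1, costly for order 0.
train = [0, 1] * 50
test = [0, 1] * 10
print(markov_holdout_bits(train, test, order=0, alphabet_size=2))
print(markov_holdout_bits(train, test, order=1, alphabet_size=2))
```

Counts are never updated from the holdout, so the score reflects only what was learned from the training segment.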

20. RH Arena v0.3 Results

The completed v0.3 Arena run is the first cleaned-up version after the v0.2b diagnosis. It separates gap_mod6/gap_mod30 from prime_residue6/prime_residue30, uses wheel-aware Cramér controls with warmup, excludes the Cramér global window-0 artifact from summary aggregation, and scores generators using predictive bits over the holdout length.

20.1 Run integrity

| Field | Value |
| --- | --- |
| Script | 8z-rh-arena-v0.3 |
| Prime range | 1,000,000 primes; last prime 15,485,863; 999,999 gaps |
| Train / test | 500,000 / 1,000,000 primes |
| Window / stride | 50,000 / 50,000 |
| Trials | 400 per control task; p-value floor ≈ 0.00249 |
| Workers | 10 requested / 10 used |
| Completed work | 2,200 window tasks; 88 generator tasks; 30,491 motif rows; 16,892 invariant-candidate rows |
| Elapsed | 52,970 seconds, about 14.7 hours |

The run completed cleanly with windows.done.json, generators.done.json, and run.done.json. The result is usable as a v0.3 baseline.

20.2 Core sensor readout

The core raw-gap signal remains real under most controls, but Markov2 is still the hard boundary.

| Trace | Control | Mean ΔLZ | Mean z | Window stability | Readout |
| --- | --- | --- | --- | --- | --- |
| gap | shuffle | −785.94 | −77.49 | 20/20 negative | Strong order signal. |
| gap | block10 | −241.47 | −21.32 | 20/20 negative | Survives short-block preservation. |
| gap | markov1 | −204.54 | −11.35 | 20/20 negative | Survives first-order transition baseline. |
| gap | wheel6 | −86.68 | −8.66 | 20/20 negative | Survives simple wheel-aware control. |
| gap | local_cramer | −165.82 | −7.69 | 20/20 negative | Survives wheel-aware local-density surrogate. |
| gap | cramer_global | −124.69 | −5.80 | 19/19 negative | Survives after excluding window-0 artifact. |
| gap | block50 | −57.68 | −5.54 | 20/20 negative | Still stable. |
| gap | block100 | −29.86 | −2.92 | 20/20 negative | Weaker but present. |
| gap | wheel30 | −8.86 | −0.88 | 16/20 negative | Mostly absorbed by stronger wheel structure. |
| gap | markov2 | +44.71 | +2.59 | 1/20 negative | Signal killed/reversed; current boundary. |

gap_div2 repeats the same picture almost exactly. This confirms that the core raw-gap signal is not a token-scaling artifact.

20.3 Residue and Cramér readout

The v0.3 split clarified the old residue confusion. gap_mod6 and gap_mod30 contain strong, stable gap-residue motif structure. prime_residue6 and prime_residue30 are now explicit prime-residue traces and should be treated as wheel-state diagnostics, not as raw evidence for a new invariant.

| Trace | Control | Mean z | Readout |
| --- | --- | --- | --- |
| gap_mod6 | markov2 | −29.53 | Gap-residue structure survives second-order local baseline. |
| gap_mod30 | markov2 | −2.44 | Weaker but still negative in 20/20 windows. |
| prime_residue6 | markov2 | −0.15 | Mostly absorbed; useful as wheel-state diagnostic. |
| prime_residue30 | markov2 | +1.02 | Mostly absorbed/reversed; not a primary invariant signal. |
| gap_mod6 | local_cramer | +14.20 | Cramér surrogate over-explains or mismatches this residue view; read cautiously. |
| gap_mod30 | local_cramer | +14.93 | Same caution. |

The earlier huge residue/Cramér readout is no longer treated as a discovery by itself. It is now a diagnostic showing where surrogate design and residue-state modeling still need care.

20.4 Generator leaderboard

The generator arena changed meaningfully after predictive-bits scoring. For raw gaps, markov2 now beats markov1, and hybrid_dcc moves to second place but still does not beat the Markov2 baseline.

| Encoding | Winner | Second | Third | Readout |
| --- | --- | --- | --- | --- |
| gap | markov2 | hybrid_dcc | markov1 | Markov2 is the current raw-gap baseline to beat. |
| gap_div2 | markov2 | hybrid_dcc | markov1 | Same as raw gap. |
| abs_delta | markov2 | hybrid_dcc | markov1 | Second-order transition structure dominates. |
| gap_mod6 | markov2 | hybrid_dcc | markov1 | Strong residue-state transition structure. |
| gap_mod30 | markov2 | hybrid_dcc | markov1 | Same hierarchy at finer wheel scale. |
| prime_residue6 | markov2 | hybrid_dcc | markov1 | Mostly wheel-state transition modeling. |
| prime_residue30 | markov2 | hybrid_dcc | markov1 | Mostly wheel-state transition modeling. |
| cumulative_error_walk | markov1 | motif_grammar | lz_dictionary | Still interesting, but not yet a clean invariant. |
| normalized_gap | local_cramer | shuffle | wheel_only | Density trend dominates this encoding. |
| gap_minus_logp | markov1 | motif_grammar | lz_dictionary | Needs better density-corrected generator design. |

Current generator verdict: the Arena has not yet found a DCC/motif generator that beats Markov2 on raw gaps. That is not a failure. It is the clean next target.

20.5 Invariant candidates

The v0.3 run emitted invariant_candidates.csv. The strongest stable candidates are not yet raw-gap formulas. They are mostly gap-residue and coarse gap-class motif families.

| Encoding | Candidate motif | Controls survived | Window support | Min z | Readout |
| --- | --- | --- | --- | --- | --- |
| gap_mod6 | 2,0,4 | 9 | 20 | 35.09 | Very strong residue motif family. |
| gap_mod6 | 4,0,2 | 7 | 20 | 35.85 | Companion residue motif. |
| gap_mod6 | 2,4,2 | 6 | 20 | 37.51 | Stable alternating residue motif. |
| gap_mod30 | 4,6,14 | 6 | 14 | 9.67 | Finer wheel-residue candidate. |
| gap_mod30 | 10,2,10 | 5 | 17 | 8.61 | Stable finer gap-residue motif. |
| bucket_log2 | 4,1,4 | 7 | 20 | 5.03 | Coarse size-class candidate. |
| gap | 2,12,10 | 4 | 10 | 2.30 | Weak raw-gap candidate; not invariant-level yet. |

This is the right shape of result for an early inverse-generator arena: it does not magically produce an RH invariant, but it tells us where the stable motifs live. The strongest motifs are currently residue/gap-class structures. Raw integer gap motifs are weaker and need deeper filtering.

20.6 Conclusion and next step

The v0.3 result sharpens the project:

v0.3 does not close the RH problem. It gives a cleaner target: build a generator or invariant that beats Markov2 on holdout while preserving the stable motif and residue-family structure discovered by the Arena.

The next code step should be a focused v0.4 generator arena: Markov2+DCC hybrids, variable-order Markov, motif-conditioned Markov, density-feedback states, and per-encoding generator leaderboards. The next paper step is to update this page after v0.4 shows whether any compact generator can actually beat the Markov2 floor.

21. RH Arena v0.4 Results: Markov2 Floor Attack

The v0.4 Arena run executes the next step proposed by v0.3: stop asking only whether raw gaps survive controls, and ask whether a compact generator can beat the markov2 holdout floor. This section should be read with a strict split:

v0.4 does not remove the Markov2 boundary from the LZ control test. It crosses the Markov2 floor in the generator arena.
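To make the "holdout floor" concrete, the following is a minimal sketch of an order-k Markov predictive cost in bits, fitted on a train segment and scored on a holdout segment. The function name `markov_holdout_bits`, the add-one smoothing, and the use of the union alphabet are assumptions made for illustration; the Arena's actual scoring, including any model-cost term in the holdout score, is not reproduced here.

```python
import math
from collections import Counter, defaultdict

def markov_holdout_bits(train, holdout, order=2):
    """Predictive cost (bits) of `holdout` under an order-k Markov model fit on `train`.

    Assumption: add-one (Laplace) smoothing over the union of both alphabets.
    """
    alphabet_size = len(set(train) | set(holdout))
    counts = defaultdict(Counter)
    for i in range(order, len(train)):
        counts[tuple(train[i - order:i])][train[i]] += 1

    bits = 0.0
    for i in range(order, len(holdout)):
        seen = counts.get(tuple(holdout[i - order:i]), Counter())
        p = (seen[holdout[i]] + 1) / (sum(seen.values()) + alphabet_size)
        bits -= math.log2(p)  # cost of predicting the actual next token
    return bits

# Toy alternating sequence: order 2 predicts it almost for free, order 0 pays 1 bit/token.
train = [2, 4] * 100
holdout = [2, 4] * 10
floor_bits = markov_holdout_bits(train, holdout, order=2)
order0_bits = markov_holdout_bits(train, holdout, order=0)
```

In this framing, a challenger "beats the floor" when its total holdout cost is lower than the order-2 cost on the same holdout tokens.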

21.1 Run integrity

| Field | Value |
| --- | --- |
| Script | 8z-rh-arena-v0.4 |
| Schema | rh_arena_schema_v0.4 |
| Prime range | 1,000,000 primes; last prime 15,485,863; 999,999 gaps |
| Train / test | 500,000 / 1,000,000 primes |
| Window / stride | 50,000 / 50,000 |
| Trials | 400 per control task; p-value floor ≈ 0.00249 |
| Workers | 10 requested / 10 used |
| Encodings | gap, gap_div2, abs_delta, bucket_log2, normalized_gap, gap_minus_logp, cumulative_error_walk, gap_mod6, gap_mod30 |
| Control modes | shuffle, block50, block100, wheel6, wheel30, markov1, markov2, local_cramer, cramer_global |
| Generator modes | markov2, markov3, markov4, variable_markov, motif_markov2, density_markov2, markov2_dcc, density_dcc |
| Completed work | 1,620 window tasks; 72 generator tasks; 10,696 motif rows; 5,826 invariant-candidate rows |
| Elapsed | 31,280.007 seconds, about 8 h 41 min 20 s |

The run completed cleanly with windows.done.json, generators.done.json, and run.done.json. No completed tasks came from checkpoint; this was a fresh complete run.

21.2 Sensor readout: Markov2 still kills raw-gap LZ advantage

The raw-gap sensor picture remains almost exactly the v0.3 lesson. gap survives many controls, including Markov1 and wheel-aware Cramér, but not Markov2.

| Control | Mean ΔLZ | Mean z | Window stability | Median p | Readout |
| --- | --- | --- | --- | --- | --- |
| shuffle | -785.94 | -77.49 | 20/20 negative | 0.00249 | Strong order signal; same broad baseline as before. |
| markov1 | -204.54 | -11.35 | 20/20 negative | 0.00249 | Survives first-order transition baseline. |
| wheel6 | -86.68 | -8.66 | 20/20 negative | 0.00249 | Survives simple mod-6 wheel-aware control. |
| local_cramer | -165.82 | -7.69 | 20/20 negative | 0.00249 | Survives wheel-aware local-density surrogate. |
| cramer_global | -124.69 | -5.80 | 19/19 negative | 0.00249 | Survives after excluding the known window-0 artifact. |
| block50 | -57.68 | -5.54 | 20/20 negative | 0.00249 | Stable local/medium-scale signal. |
| block100 | -29.86 | -2.92 | 20/20 negative | 0.00249 | Weaker but still negative in all windows. |
| wheel30 | -8.86 | -0.88 | 16/20 negative | 0.15835 | Mostly absorbed by stronger wheel structure. |
| markov2 | +44.71 | +2.59 | 1/20 negative | 0.99875 | Signal killed/reversed; sensor-level boundary. |

gap_div2 repeats the same pattern: Markov2 is again positive/reversed, while shuffle, Markov1, wheel6, block, and Cramér controls remain negative. The important interpretation is that the LZ sensor no longer says “raw gaps beat everything.” It says “raw gaps beat first-order/local/wheel/Cramér controls, but second-order transition memory is enough to absorb this sensor.”

21.3 Generator floor: Markov2 is beaten on holdout

The generator result is the new signal. The v0.4 challengers were designed specifically to attack the Markov2 floor. They succeeded across all tested encodings.

| Encoding | Winner | Runner-up | Δ holdout vs Markov2 | Δ bits/token vs Markov2 | Winner margin |
| --- | --- | --- | --- | --- | --- |
| gap | density_markov2 | density_dcc | -7,953.28 | -0.015946 | 4,297.46 |
| gap_div2 | density_markov2 | density_dcc | -7,954.26 | -0.015946 | 4,298.46 |
| abs_delta | density_markov2 | density_dcc | -7,262.93 | -0.014544 | 5,486.79 |
| bucket_log2 | density_markov2 | markov2 | -1,329.38 | -0.002755 | 1,329.38 |
| normalized_gap | markov2_dcc | density_dcc | -1,568.44 | -0.000723 | 43.88 |
| gap_minus_logp | density_dcc | markov2_dcc | -372,137.19 | -0.743387 | 2,470.89 |
| cumulative_error_walk | density_dcc | markov2_dcc | -112,715.09 | -0.225453 | 14,854.76 |
| gap_mod6 | variable_markov | markov4 | -50,185.35 | -0.100330 | 5,160.61 |
| gap_mod30 | markov3 | density_markov2 | -40,125.27 | -0.080438 | 39,471.28 |

For raw gap, the first accepted breach is not a complicated motif grammar. The best generator is density_markov2: a Markov2 spine conditioned by local density state. The DCC hybrid is second, meaning DCC helps but is not yet the simplest winning explanation.

| Rank | Generator | Holdout score | Δ holdout vs Markov2 | Bits/token | Δ bits/token | Motif distance | LZ error |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | density_markov2 | 1,936,623.45 | -7,953.28 | 3.872776 | -0.015946 | 0.01028 | 0.02110 |
| 2 | density_dcc | 1,940,920.91 | -3,655.82 | 3.881239 | -0.007483 | 0.03494 | 0.03376 |
| 3 | markov2 | 1,944,576.73 | 0.00 | 3.888722 | 0.000000 | 0.01262 | 0.02565 |
| 4 | markov2_dcc | 1,947,542.12 | 2,965.39 | 3.894477 | 0.005755 | 0.05036 | 0.04351 |
| 5 | motif_markov2 | 1,965,111.90 | 20,535.17 | 3.929558 | 0.040836 | 0.07454 | 0.06210 |

v0.4 generator verdict: the Markov2 floor is no longer unbroken. The best current raw-gap explanation is Markov2 plus local density state, with DCC hybrids close behind but not yet dominant.

21.4 Invariant candidates: robust motifs still live mostly in residue space

The invariant-candidate file contains 5,826 rows. Of these, 168 survive at least 6 controls, 39 survive at least 8 controls, and 22 survive all 9 controls. Among the all-control survivors, 21/22 are gap_mod6 motif families and one is bucket_log2.

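| Encoding | Motif | Length | Controls | Window support sum | Min z | Median z | Median lift | Total count | Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| gap_mod6 | 2,0,0,4,2 | 5 | 9 | 175 | 20.15 | 93.05 | 1.979 | 148,955 | 141.27 |
| gap_mod6 | 4,2,0,0,4 | 5 | 9 | 171 | 19.43 | 89.34 | 1.956 | 143,686 | 136.21 |
| gap_mod6 | 2,4,2,0,4 | 5 | 9 | 171 | 4.48 | 34.95 | 1.208 | 196,367 | 31.40 |
| bucket_log2 | 3,3,4,1 | 4 | 9 | 91 | 3.94 | 5.53 | 1.186 | 13,199 | 27.60 |
| gap_mod6 | 4,2,0,4 | 4 | 9 | 176 | 3.17 | 69.15 | 1.400 | 357,528 | 22.23 |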

This confirms the v0.3 motif lesson at higher focus: the most stable candidate layer is still residue/gap-class structure, especially gap_mod6. Raw integer gap motifs exist, but they are weaker and do not yet look like the final invariant object.

21.5 Conclusion and next step

v0.4 changes the project state from “Markov2 is the generator floor to beat” to “Markov2 has been beaten empirically at 1M by density-conditioned generators; now prove the gain survives scaling, ablation, and formalization.”

The v0.5 smoke now changes the next code step: scale the whole ablation ladder, not only density_markov2/density_dcc. At smoke scale, density_markov1 is the raw-gap winner, variable_markov is the gap_mod6 winner, and DCC/motif terms remain candidates rather than confirmed carriers. The next paper step is to compress the stable gap_mod6 motif families and the raw-gap density-Markov gain into one smaller invariant candidate.

22. RH Arena v0.5 Smoke Results: Density Ablation and Split Stability

The first v0.5 run is a smoke-scale ablation, not a replacement for the v0.4 1M result. Its value is that it cleanly separates the raw Markov2 boundary from the generator layer and begins isolating which ingredient carries the generator gain.

v0.5 smoke verdict: the raw-gap LZ sensor boundary is still Markov2, but the generator floor is beaten again. At 120k scale the raw-gap winner is density_markov1, not a motif/DCC hybrid. This points to density-conditioned transition structure as the first ingredient to scale-test.

22.1 Run integrity

| Field | Value |
| --- | --- |
| Script | 8z-rh-arena-v0.5 |
| Schema | rh_arena_schema_v0.5 |
| Run size | 120,000 primes; last prime 1,583,539; 119,999 gaps |
| Train / test | 60,000 / 120,000 primes |
| Windows | 20,000-token window; 20,000 stride; 6 windows per encoding/control |
| Trials | 50 sampled controls; p floor 0.019608 |
| Workers | 1 requested / 1 used |
| Completed work | 120 window tasks; 272 generator tasks; 544 generator replicates |
| Generator splits | main, early, mid, late |
| Density guard | safe mode; safe density generators do not use actual holdout prime coordinates; oracle diagnostic intentionally flagged |

22.2 Sensor readout: Markov2 still absorbs raw-gap LZ advantage

The sensor layer remains consistent with the earlier v0.3/v0.4 story. Raw gap and gap_div2 are strongly more compressible than shuffle, wheel6, Markov1, and local Cramér controls, but not under Markov2. The gap_mod6 residue trace remains strongly structured even under Markov2.

| Encoding | Control | Mean ΔLZ | Mean z | Negative windows | Readout |
| --- | --- | --- | --- | --- | --- |
| gap | shuffle | −343.81 | −54.76 | 6/6 | strong order signal |
| gap | local_cramer | −151.40 | −9.81 | 6/6 | survives wheel-aware local density control |
| gap | markov1 | −83.46 | −6.58 | 6/6 | survives first-order transition control |
| gap | markov2 | +25.79 | +2.19 | 1/6 | raw-gap LZ advantage absorbed/reversed |
| gap_div2 | markov2 | +26.32 | +2.20 | 1/6 | same boundary as raw gap |
| gap_mod6 | markov2 | −73.37 | −14.58 | 6/6 | residue-state structure survives Markov2 |
| normalized_gap | markov2 | +152.83 | +7.77 | 0/6 | density-adjusted trace is not a raw sensor win here |

22.3 Generator ablation: density_markov1 wins raw gap at smoke scale

The generator layer is where v0.5 changes the working hypothesis. On raw gap/gap_div2, the best safe non-Markov2 generator is density_markov1 across all four tested splits. That is a useful simplification: at this scale, adding Markov order 2, motif injection, or full DCC does not improve the raw-gap winner.

| Split | Encoding | Winner | Δ holdout vs Markov2 | Runner-up | Readout |
| --- | --- | --- | --- | --- | --- |
| main | gap | density_markov1 | −2,423.76 | density_wrong_scale_markov2 | density + first-order transition wins raw gap |
| early | gap | density_markov1 | −2,343.44 | density_wrong_scale_markov2 | same winner in early split |
| mid | gap | density_markov1 | −3,146.49 | density_only | density itself becomes strong |
| late | gap | density_markov1 | −4,150.22 | density_only | density signal strengthens later |
| main | gap_div2 | density_markov1 | −2,424.96 | density_wrong_scale_markov2 | same as raw gap |
| main | gap_mod6 | variable_markov | −5,853.89 | markov3 | residue trace prefers variable-order memory |
| main | normalized_gap | dcc_only | −39,430.74 | density_dcc_full | read as density-adjusted diagnostic, not raw-gap invariant |

The leakage guard is also behaving as intended: all safe density generators are marked non-leaking, while density_oracle_markov2 is deliberately flagged as using actual test-prime coordinates and should not be used as a public winner.

22.4 Conclusion and next step

v0.5 changed the next test: do not only scale density_markov2. Scale the whole ablation ladder — density_only, density_markov0, density_markov1, density_markov2, density-DCC variants, and variable Markov — and ask which ingredient survives at 1M, 2M, and beyond.

v0.6 update: that scaling test has now been run through 1M. The smoke winner density_markov1 did not remain the raw-gap winner; the scaled story is density/scale vs generic clock-DCC.

23. RH Arena v0.6 Carrier Isolation: Density vs Clock vs Residual

The v0.6 run answers the question opened by v0.5: was the raw-gap generator gain really a density signal, or mostly a generic position/scale clock? The 4-worker sequence ran three stages — 150k smoke, 500k bridge, and 1M carrier isolation — with the optional 2M raw-carrier stage left off.

v0.6 verdict: the Markov2 generator floor is still beaten at 1M, but the carrier is not clean pure density. The full raw-gap split is won by density_wrong_scale_markov2; the mid and late raw-gap splits are won by clock_dcc_no_motif. The best current description is scale-conditioned transition structure, with density and clock-DCC variants both active.

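23.1 Run integrity and scale sequence

| Stage | Primes | Train | Trials | Workers | Elapsed | Window tasks | Generator replicates | Guard failures |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 150k smoke | 150,000 | 75,000 | 50 | 4/4 | 1.144 h | 120 | 800 | 0 |
| 500k bridge | 500,000 | 250,000 | 75 | 4/4 | 5.957 h | 360 | 1,800 | 0 |
| 1M carrier | 1,000,000 | 500,000 | 100 | 4/4 | 18.199 h | 840 | 3,000 | 0 |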

The 1M carrier run used 1,000,000 primes, 500,000 train primes, 50,000-token windows, 100 sampled controls, and 4/4 workers. It completed 840 window tasks and 3,000 generator replicates. The leakage guard is clean: 0 failed guard replicates and no oracle-density diagnostics in the default v0.6 run.

23.2 Sensor readout: raw-gap Markov2 boundary remains

The sensor layer remains consistent with v0.3–v0.5: raw gap is strongly more compact than shuffle, wheel6, Markov1, and Cramér-like controls, but Markov2 still absorbs or reverses the raw-gap LZ advantage.

| 1M raw gap control | Mean ΔLZ | Mean z | Negative fraction | Median p | Reading |
| --- | --- | --- | --- | --- | --- |
| shuffle | -785.97 | -76.55 | 100.0% | 0.00990 | huge order signal |
| wheel6 | -86.50 | -8.70 | 100.0% | 0.00990 | survives mod-6 wheel |
| wheel30 | -8.61 | -0.84 | 75.0% | 0.19802 | weak / near local wheel boundary |
| markov1 | -204.73 | -11.49 | 100.0% | 0.00990 | survives first-order transition |
| markov2 | 44.97 | 2.60 | 5.0% | 1.00000 | absorbed/reversed boundary |
| local_cramer | -165.53 | -7.62 | 100.0% | 0.00990 | survives local Cramér |
| cramer_global | -124.34 | -5.81 | 100.0% | 0.00990 | survives global Cramér |

The stronger residue traces remain visible: gap_mod6 vs wheel30 has ΔLZ −1,476.84 and z −362.69; gap_mod6 vs Markov2 still has ΔLZ −198.95 and z −29.99. This keeps the residue-state track separate from the raw-gap generator track.

23.3 Generator carrier: density wins main, clock-DCC wins mid/late

At 1M, non-Markov2 generators still beat the Markov2 holdout floor. The key result is the split diagnosis: density wins the full/main raw-gap split, while clock-DCC wins the mid and late raw-gap splits.

| Split | Winner | Δ vs Markov2 | Best clock | Best density | Best residual | Diagnosis |
| --- | --- | --- | --- | --- | --- | --- |
| early | density_wrong_scale_markov2 | -4,057.07 | clock_dcc_no_motif | density_wrong_scale_markov2 | density_residual_dcc_no_motif | density_beats_generic_clock |
| main | density_wrong_scale_markov2 | -8,641.49 | clock_dcc_no_motif | density_wrong_scale_markov2 | density_residual_dcc_no_motif | density_beats_generic_clock |
| mid | clock_dcc_no_motif | -1,402.59 | clock_dcc_no_motif | density_dcc_no_motif | density_residual_dcc_no_motif | generic_clock_beats_density |
| late | clock_dcc_no_motif | -1,414.21 | clock_dcc_no_motif | density_dcc_no_motif | density_residual_dcc_no_motif | generic_clock_beats_density |

For the 1M main raw-gap split, the top generator leaderboard is:

| Rank | Generator | Holdout score | Δ holdout vs Markov2 | Δ predictive bits | Δ bits/token | Guard |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | density_wrong_scale_markov2 | 1,935,933.72 | -8,641.49 | -8,642.41 | -0.017285 | pass |
| 2 | density_dcc_no_motif | 1,935,952.28 | -8,622.92 | -8,667.85 | -0.017336 | pass |
| 3 | density_lagged_markov2 | 1,936,595.22 | -7,979.99 | -7,980.31 | -0.015961 | pass |
| 4 | density_markov2 | 1,936,595.39 | -7,979.81 | -7,980.31 | -0.015961 | pass |
| 5 | density_shuffled_markov2 | 1,936,595.42 | -7,979.79 | -7,980.31 | -0.015961 | pass |
| 6 | clock_dcc_no_motif | 1,942,231.68 | -2,343.53 | -2,277.21 | -0.004554 | pass |
| 7 | density_residual_dcc_no_motif | 1,942,242.75 | -2,332.45 | -2,272.05 | -0.004544 | pass |
| 8 | clock_sqrt_markov2 | 1,942,629.68 | -1,945.52 | -1,905.45 | -0.003811 | pass |
| 9 | clock_wrong_scale_markov2 | 1,943,066.91 | -1,508.30 | -1,468.36 | -0.002937 | pass |
| 10 | clock_square_markov2 | 1,943,262.57 | -1,312.63 | -1,275.22 | -0.002550 | pass |
| 11 | clock_markov2 | 1,943,369.42 | -1,205.78 | -1,166.67 | -0.002333 | pass |
| 12 | clock_lagged_markov2 | 1,943,369.59 | -1,205.61 | -1,166.04 | -0.002332 | pass |

The close cluster of density_markov2, density_lagged_markov2, and density_shuffled_markov2 remains important: it says scale state matters, but exact prime-coordinate density is not yet isolated as the sole cause. The new seed is clock_dcc_no_motif, because it wins mid/late raw-gap splits and is the best clock-family challenger.

23.4 Encoding separation: raw scale carrier vs residue memory

| Encoding | Winner | Δ vs Markov2 | Carrier diagnosis | Reading |
| --- | --- | --- | --- | --- |
| gap | density_wrong_scale_markov2 | -8,641.49 | density_beats_generic_clock | raw carrier: density-scale main split |
| gap_div2 | density_wrong_scale_markov2 | -8,641.19 | density_beats_generic_clock | same as raw gap |
| gap_minus_logp | clock_markov1 | -1,051,637.89 | generic_clock_beats_density | generic clock dominates density-adjusted residual scale |
| normalized_gap | density_dcc_full | -3,360.80 | density_beats_generic_clock | DCC-density wins normalized scale |
| gap_mod6 | variable_markov | -50,173.70 | generic_clock_beats_density | higher / variable-order memory |
| gap_mod30 | markov3 | -40,124.87 | density_beats_generic_clock | higher-order Markov memory |

This separation is useful. Raw gap/gap_div2 are now a scale-carrier problem. gap_mod6 and gap_mod30 are not primarily density-clock stories; they prefer higher-order or variable-order memory.

23.5 Invariant candidates

The 1M v0.6 run produced 2,050 invariant candidates. High-control survivors remain concentrated in gap_mod6, with one sparse gap_div2 survivor.

| Threshold | Candidate count |
| --- | --- |
| ≥ 3 controls | 271 |
| ≥ 4 controls | 97 |
| ≥ 5 controls | 34 |
| ≥ 6 controls | 7 |
| ≥ 7 controls | 4 |

All-control candidates in this run:

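| Encoding | Motif | Length | Controls | Support sum | Min z | Median z |
| --- | --- | --- | --- | --- | --- | --- |
| gap_mod6 | 4,2,0,4 | 4 | 7 | 100 | 3.13 | 137.14 |
| gap_mod6 | 4,2,0,4,2 | 5 | 7 | 93 | 2.83 | 72.05 |
| gap_mod6 | 2,0,4,2 | 4 | 7 | 88 | 2.43 | 139.43 |
| gap_div2 | 4,3,2,7,21 | 5 | 7 | 7 | 7.25 | 19.23 |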

The highest-support all-control motifs again live mostly in the residue alphabet. They are strong empirical candidates, not theorem-ready invariants.

23.6 Conclusion and next step

Do not claim that density explains the prime-gap signal. The current result says that density, generic clock, and clock-DCC are now separated enough to design the next sharper arena.

The next code step should be a smaller v0.7 carrier-stress arena: hold density_wrong_scale_markov2, density_markov2, density_lagged_markov2, density_shuffled_markov2, clock_dcc_no_motif, and density_residual_dcc_no_motif in direct paired competition, with stricter split-consistency penalties and an optional focused 2M raw-gap carrier run.

24. RH Arena v0.7 Carrier Stress: Density Placebo vs Generic Clock

The v0.7 run is the surgical test proposed after v0.6. It asks whether the raw-gap generator gain comes from true train-extrapolated prime-density information, a generic monotone position/scale clock, wrong-scale regularization, or DCC state-conditioning. The ladder ran from 20k through 1M with 4 workers.

v0.7 verdict: the Markov2 generator floor is still beaten, but the raw-gap carrier is not isolated prime density. At 1M, clock_loglog_markov2 wins raw gap in all seven tested splits. Density variants remain strong, especially in the main split, but the density-placebo cluster shows that generic scale-clock state is the stronger current explanation.

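24.1 Run integrity and ladder

| Stage | Primes | Train | Trials | Workers | Elapsed | Window tasks | Generator tasks | Replicates | Guard failures |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 20k smoke | 20,000 | 10,000 | 10 | 4/4 | 0.22s | 80 | 216 | 216 | 0 |
| 150k validation | 150,000 | 75,000 | 40 | 4/4 | 2.210 h | 525 | 945 | 2835 | 0 |
| 500k bridge | 500,000 | 250,000 | 60 | 4/4 | 13.494 h | 980 | 1323 | 5292 | 0 |
| 1M focused | 1,000,000 | 500,000 | 80 | 4/4 | 41.416 h | 980 | 1323 | 6615 | 0 |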

The 1M focused stage used 1,000,000 primes, 500,000 train primes, 50,000-token windows, 80 sampled controls, and 4/4 workers. It completed 980 window tasks, 1,323 generator score tasks, and 6,615 generator replicates. The leakage guard is clean: 0 failed guard replicates and no oracle-density diagnostics.

24.2 Sensor readout: raw-gap Markov2 boundary still holds

The v0.7 sensor layer preserves the v0.3–v0.6 boundary. Raw gap is strongly more compact than shuffle, Markov1, wheel6, and Cramér-like controls, but Markov2 absorbs/reverses the LZ advantage.

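| 1M raw gap control | Mean ΔLZ | Mean z | Negative fraction | Median p | Dict overlap |
| --- | --- | --- | --- | --- | --- |
| shuffle | -786.00 | -76.91 | 100.0% | 0.01235 | 0.165 |
| markov1 | -204.84 | -11.55 | 100.0% | 0.01235 | 0.230 |
| wheel6 | -86.65 | -8.77 | 100.0% | 0.01235 | 0.250 |
| local_cramer | -164.90 | -7.64 | 100.0% | 0.01235 | 0.230 |
| cramer_global | -124.31 | -5.86 | 100.0% | 0.01235 | 0.231 |
| wheel30 | -8.51 | -0.84 | 75.0% | 0.19753 | 0.294 |
| markov2 | 45.13 | 2.65 | 5.0% | 1.00000 | 0.271 |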

The residue track is still separate: gap_mod6 vs Markov2 has ΔLZ −199.06 and z −30.59; gap_mod30 vs Markov2 has ΔLZ −31.17 and z −2.50.

24.3 Generator carrier: loglog clock wins raw gap across splits

| Split | Winner | Δ vs Markov2 | Best density | Density − clock Δ | Diagnosis |
| --- | --- | --- | --- | --- | --- |
| early | clock_loglog_markov2 | -4378.92 | density_wrong_scale_markov2 | 321.61 | generic_clock_beats_density |
| main | clock_loglog_markov2 | -8784.21 | density_wrong_scale_markov2 | 142.46 | density_clock_tie |
| mid | clock_loglog_markov2 | -5187.52 | density_dcc_no_motif | 4552.44 | generic_clock_beats_density |
| late | clock_loglog_markov2 | -5245.96 | density_dcc_no_motif | 4579.67 | generic_clock_beats_density |
| rolling1 | clock_loglog_markov2 | -6009.94 | density_wrong_scale_markov2 | 348.21 | generic_clock_beats_density |
| rolling2 | clock_loglog_markov2 | -6248.92 | density_dcc_no_motif | 5526.25 | generic_clock_beats_density |
| rolling3 | clock_loglog_markov2 | -6293.48 | density_dcc_no_motif | 5484.38 | generic_clock_beats_density |

For the 1M main raw-gap split, the top generator leaderboard is:

| Rank | Generator | Family | Holdout score | Δ holdout vs Markov2 | Δ bits/token | Guard |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | clock_loglog_markov2 | clock_placebo | 1,935,792.89 | -8,784.21 | -0.017473 | pass |
| 2 | density_wrong_scale_markov2 | density_ablation | 1,935,935.36 | -8,641.74 | -0.017285 | pass |
| 3 | density_dcc_no_motif | dcc_ablation | 1,935,953.87 | -8,623.23 | -0.017336 | pass |
| 4 | density_reversed_markov2 | density_placebo | 1,936,557.75 | -8,019.35 | -0.015961 | pass |
| 5 | density_lagged_markov2 | density_ablation | 1,936,596.81 | -7,980.29 | -0.015961 | pass |
| 6 | density_markov2 | density_ablation | 1,936,597.07 | -7,980.03 | -0.015961 | pass |
| 7 | density_shuffled_markov2 | density_ablation | 1,936,597.07 | -7,980.03 | -0.015961 | pass |
| 8 | clock_log_markov2 | clock_placebo | 1,937,320.22 | -7,256.88 | -0.014410 | pass |
| 9 | clock_dcc_no_motif | clock_dcc_ablation | 1,942,238.61 | -2,338.49 | -0.004554 | pass |
| 10 | density_residual_dcc_no_motif | density_residual_ablation | 1,942,249.72 | -2,327.38 | -0.004544 | pass |
| 11 | clock_sqrt_markov2 | clock_ablation | 1,942,631.62 | -1,945.48 | -0.003811 | pass |
| 12 | clock_wrong_scale_markov2 | clock_ablation | 1,943,068.84 | -1,508.26 | -0.002937 | pass |

The main split is close: clock_loglog_markov2 beats density_wrong_scale_markov2 by only 142.46 holdout units, so it is a density/clock tie under the configured carrier margin. But the split ladder is not close: early, mid, late, and rolling splits consistently prefer generic log/loglog clock states.
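The tie-versus-win reading above can be sketched as a simple threshold rule on the "Density − clock Δ" column. This is an illustration only: the function name `carrier_diagnosis` and the absolute-margin rule are assumptions, and the Arena's configured carrier margin is not stated in this report.

```python
def carrier_diagnosis(density_minus_clock, margin):
    """Classify a split from (best density score) minus (best clock score).

    A positive difference means the clock generator has the lower, better holdout score.
    `margin` is a hypothetical tie threshold, not the Arena's configured value.
    """
    if abs(density_minus_clock) <= margin:
        return "density_clock_tie"
    if density_minus_clock > 0:
        return "generic_clock_beats_density"
    return "density_beats_generic_clock"

# Differences copied from the v0.7 1M raw-gap split table.
splits = {"early": 321.61, "main": 142.46, "mid": 4552.44, "late": 4579.67,
          "rolling1": 348.21, "rolling2": 5526.25, "rolling3": 5484.38}
diagnoses = {name: carrier_diagnosis(d, margin=200.0) for name, d in splits.items()}
```

Under this assumed rule, any margin from 142.46 up to just below 321.61 reproduces the seven diagnoses listed in the split table; 200 is an arbitrary illustrative choice.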

24.4 Density/clock placebo matrix

The density placebo matrix is the main new information. On the 1M main raw-gap split, density_markov2, density_lagged_markov2, and density_shuffled_markov2 all sit at about Δ −7,980 vs Markov2; density_reversed_markov2 is also strong at Δ −8,019. This cluster means exact prime-coordinate density is not isolated as the unique cause. The signal behaves more like a monotone scale-state transition law.
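One way to picture a "monotone scale-state transition law" is a Markov2 context extended by a coarse clock state that depends only on position. The sketch below is hypothetical: `loglog_clock_state`, the bucket width, and the context layout are assumptions for illustration and are not the definition of clock_loglog_markov2 used in the Arena.

```python
import math

def loglog_clock_state(index, buckets_per_unit=8):
    """Hypothetical position-only clock: a coarse bucket of log(log(index)).

    Uses only the token index, never the prime values themselves.
    """
    n = max(index, 3)  # keeps log(log(n)) defined and positive
    return int(math.log(math.log(n)) * buckets_per_unit)

def clock_contexts(tokens, start_index=0, order=2):
    """Yield (context, next_token) pairs with context = (clock_state, previous `order` tokens)."""
    for i in range(order, len(tokens)):
        state = loglog_clock_state(start_index + i)
        yield (state, *tokens[i - order:i]), tokens[i]

pairs = list(clock_contexts([2, 4, 2, 4, 6], start_index=100))
```

Because the state is a function of position alone, a generator built this way cannot use holdout prime coordinates, which is the property that separates a generic clock from a true density label.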

By encoding, the 1M main split reads:

| Encoding | Winner | Δ vs Markov2 | Carrier diagnosis |
| --- | --- | --- | --- |
| cumulative_error_walk | clock_loglog_markov2 | -163,772.17 | generic_clock_beats_density |
| gap | clock_loglog_markov2 | -8,784.21 | density_clock_tie |
| gap_minus_logp | clock_loglog_markov2 | -456,581.80 | generic_clock_beats_density |
| gap_mod30 | markov3 | -40,124.81 | density_beats_generic_clock |
| gap_mod6 | variable_markov | -50,186.51 | generic_clock_beats_density |
| normalized_gap | density_dcc_full | -3,200.57 | density_beats_generic_clock |
| prime_residue30 | density_markov2 | -529.95 | density_clock_tie |

24.5 Candidate law sheets

The new candidate_law_sheets.csv/json output works as a formalization-facing summary. For raw gap, the top law is clock_loglog_markov2: split count 7, split-win count 7, mean Δ vs Markov2 -6,021.28, best Δ -8,784.21, and candidate law “position-clock conditioned transition law.”
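As a quick arithmetic check, the raw-gap law-sheet summary can be recomputed from the seven per-split values in the v0.7 1M split table. The dictionary below copies those values; the variable names are illustrative.

```python
# Δ vs Markov2 for clock_loglog_markov2 on raw gap, one value per split.
split_deltas = {
    "early": -4378.92,
    "main": -8784.21,
    "mid": -5187.52,
    "late": -5245.96,
    "rolling1": -6009.94,
    "rolling2": -6248.92,
    "rolling3": -6293.48,
}

mean_delta = sum(split_deltas.values()) / len(split_deltas)
best_delta = min(split_deltas.values())  # most negative = largest gain over Markov2

print(round(mean_delta, 2), best_delta)  # -6021.28 -8784.21
```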

For the residue track, the top law remains gap_mod6 variable_markov: split count 7, split-win count 7, mean Δ vs Markov2 -25,018.00, best Δ -50,186.51, and candidate law “symbolic Markov transition law.”

Implementation note: the v0.7 law-sheet failure-mode column over-penalizes all candidates with missing_splits=rolling. The generator output contains rolling1, rolling2, and rolling3; the law-sheet checker expects literal rolling. This is a reporting/hygiene bug, not a generator-result failure. It should be fixed in v0.7.1 before public release.
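A minimal sketch of the kind of fix this implies: normalize numbered split names to their family before checking for missing splits. The helper names `split_family` and `missing_splits` and the expected-split list are hypothetical; the actual law-sheet checker code is not shown in this report.

```python
import re

def split_family(split_name):
    """Map 'rolling1', 'rolling2', ... to 'rolling'; leave other split names unchanged."""
    return re.sub(r"\d+$", "", split_name)

def missing_splits(expected, observed):
    """Return expected split families that have no observed split."""
    families = {split_family(name) for name in observed}
    return sorted(set(expected) - families)

expected = ["early", "main", "mid", "late", "rolling"]  # assumed checker expectation
observed = ["early", "main", "mid", "late", "rolling1", "rolling2", "rolling3"]

literal_missing = sorted(set(expected) - set(observed))  # reproduces the reported symptom
fixed_missing = missing_splits(expected, observed)
```

With literal matching, `rolling` is reported as missing even though three rolling splits are present; with family matching, nothing is missing.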

24.6 Conclusion and next step

v0.7 moves the project from “density/clock/DCC are all alive” to a sharper statement: the raw-gap holdout gain is best explained by generic log/loglog scale-clock conditioned transition structure, while residue alphabets carry separate higher-order symbolic memory.

The next code step should be v0.7.1 hygiene plus a narrow 2M/5M replication of the top law candidates. The next formal step is to write the clock_loglog_markov2 transition law and the gap_mod6 variable_markov residue law as compact mathematical objects, then test whether either connects to prime-counting error or an RH-adjacent criterion.

Appendix A – E1 Result Tables

Tables are shown in Section 6 above.

Appendix B – Interpretation Ladder

| Level | Claim | Status |
| --- | --- | --- |
| L1 | Prime-gap order compresses better than shuffled gaps | ✅ supported by E1 |
| L2 | Signal is not only marginal distribution | ✅ supported by E1 |
| L3 | Signal survives wheel-aware controls | ✅ supported by final E2/SEA batch |
| L4 | Signal survives Markov-preserving controls | ✅ supported by final E2/SEA batch |
| L5 | Signal scales with prime depth | ✅ supported through 100k → 500k → 1M → 2M for raw gap under Markov/wheel controls |
| L6 | Signal relates to zeta-zero statistics | 🔮 speculative |
| L7 | DCC edge-of-chaos explains critical line | 🔮 speculative |
| L8 | AC/Zero Framework explains RH truth/proof difficulty | 🔮 highly speculative |
| L9 | Zero Framework at axiom level: \(0 \ne \emptyset\) as a possible foundational distinction | 🔮 philosophical hypothesis, pre-formal |
| L10 | Compact generator beats the Markov2 holdout floor | ✅ empirically achieved in v0.4 across 9/9 tested encodings; still not a formal invariant or proof |
| L11 | Generator gain survives scaling, ablation, and independent seeds | ✅ partially supported by v0.6 150k→500k→1M sequence; carrier not yet uniquely identified |
| L12 | Carrier is isolated as pure density, generic clock, residual, or DCC | 🧪 v0.6 narrows it to scale-conditioned transition structure; density and clock-DCC both active |
| L13 | Generator/invariant connects to prime-counting error, explicit formula residuals, or RH-equivalent criterion | 🔮 future formal bridge |
| L14 | Carrier stress separates prime-specific density from generic clock/scale state | 🧪 v0.7: raw-gap winner is clock_loglog_markov2; exact density is not isolated |

Appendix C – Working Language Rules

Use: “compression-detectable structure”, “order-sensitive signal”, “generator candidate”, “invariant candidate”, “holdout validation”, “preliminary empirical result”, “null controls”, “not an RH proof”. Avoid: “RH confirmed”, “proof”, “annihilation of zeta zeros”, “AC requires RH”, “generator proves RH”.

Appendix D – Methodology & Reproducibility

D1. E1 Shuffle Protocol

For each analysis window independently, the exact multiset of gap tokens inside that window is shuffled. This preserves the local gap distribution of that window and destroys only the ordering inside the window.

This matters because the E1 claim is not that prime gaps have a special distribution. The claim is narrower: the real order of those same gaps is more compressible than shuffled order.
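The per-window shuffle can be sketched as follows. The function name `shuffle_within_windows`, the seeded `random.Random`, and the non-overlapping window layout are assumptions for illustration, not the project's exact control generator.

```python
import random

def shuffle_within_windows(tokens, window, seed=0):
    """Shuffle each non-overlapping window independently.

    Each window keeps its exact multiset of tokens; only the order inside it changes.
    """
    rng = random.Random(seed)
    out = []
    for start in range(0, len(tokens), window):
        block = list(tokens[start:start + window])
        rng.shuffle(block)
        out.extend(block)
    return out

tokens = [2, 4, 2, 4, 6, 2, 6, 4, 2, 4]
control = shuffle_within_windows(tokens, window=5, seed=1)
```

Because tokens never cross a window boundary, each control window has the same gap histogram as the real window, so any LZ difference is attributable to ordering alone.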

D2. Windowing

The final SEA batch contains non-overlapping main windows: 10 windows at 100k, 25 windows at 500k, 50 windows at 1M, and 50 windows at 2M.

D3. Encodings

Each encoding maps the prime-gap sequence into a token sequence before LZ76 is computed:
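A minimal sketch of four of the encodings named in the run tables, assuming definitions inferred from their names only: gap is the raw difference of consecutive primes, gap_div2 is that difference halved, and gap_mod6 / gap_mod30 are its residues. These readings are consistent with the reported gap_mod6 motifs drawn from {0, 2, 4}, but the exact project definitions, and those of the remaining encodings, are not reproduced here.

```python
def prime_gaps(primes):
    """Differences between consecutive primes."""
    return [b - a for a, b in zip(primes, primes[1:])]

def encode(gaps, name):
    """Map a gap sequence to integer tokens (assumed definitions, inferred from names)."""
    if name == "gap":
        return list(gaps)
    if name == "gap_div2":
        return [g // 2 for g in gaps]
    if name == "gap_mod6":
        return [g % 6 for g in gaps]
    if name == "gap_mod30":
        return [g % 30 for g in gaps]
    raise ValueError(f"encoding not covered by this sketch: {name}")

gaps = prime_gaps([5, 7, 11, 13, 17, 19, 23, 29, 31])
mod6_tokens = encode(gaps, "gap_mod6")
```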

D3b. LZ76 Implementation Note

LZ76 is computed as a phrase-count complexity measure over token sequences, not as compressed byte size. Each encoding first maps the prime-gap sequence into symbolic integer tokens. The custom parser then scans the token stream left-to-right and counts the number of new phrases needed to parse it. Lower phrase count means the sequence is more compressible under this LZ76-style dictionary construction.

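def lz76_phrase_count(tokens):
    # Sketch of the parse described above; not the project's exact parser.
    phrases = set()
    phrase_count = 0
    i = 0
    while i < len(tokens):
        j = i + 1
        # extend the current phrase until it is new (or the stream ends)
        while j < len(tokens) and tuple(tokens[i:j]) in phrases:
            j += 1
        phrases.add(tuple(tokens[i:j]))  # add phrase to dictionary
        phrase_count += 1
        i = j  # next unread token
    return phrase_count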

This is sufficient for the current signal tests, but future reproducibility releases should include the exact parser code, fixed test vectors, and a comparison with byte-level compressors.

D4. Main Statistic

The primary statistic is:

Delta_LZ = LZ76(real) - mean(LZ76(control))

A negative value means the real ordered sequence is more compressible than the control sequences.
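The statistic can be sketched directly from the formula above. The z-score shown here, Delta_LZ divided by the sample standard deviation of the control phrase counts, is an assumed definition for illustration; the report does not spell out how the tabulated z values are computed.

```python
from statistics import mean, stdev

def delta_lz(real_lz, control_lz):
    """Delta_LZ = LZ76(real) - mean(LZ76(control)); negative means real is more compressible."""
    return real_lz - mean(control_lz)

def z_score(real_lz, control_lz):
    """Assumed definition: Delta_LZ in units of the control standard deviation."""
    return delta_lz(real_lz, control_lz) / stdev(control_lz)

# Toy phrase counts, not Arena data.
controls = [100, 102, 98]
d = delta_lz(90, controls)
z = z_score(90, controls)
```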

D5. Permutation p-value

The permutation p-value is computed as:

p = (1 + count(control_LZ <= real_LZ)) / (trials + 1)

The minimum possible p-value depends on trial count: with 100 trials it is about 0.0099, with 300 trials about 0.00332, with 700 trials about 0.00143, with 1000 trials about 0.000999, and with 2000 trials about 0.0005. Low p-values here should still be read as sampled-control dominance, not as a final theorem-level significance claim.
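The p-value formula and its floor can be sketched as below. The function names are illustrative; the arithmetic follows the formula in this section.

```python
def permutation_p(real_lz, control_lz):
    """p = (1 + count(control_LZ <= real_LZ)) / (trials + 1)."""
    trials = len(control_lz)
    hits = sum(1 for c in control_lz if c <= real_lz)
    return (1 + hits) / (trials + 1)

def p_floor(trials):
    """Smallest attainable p-value: no control is as compressible as the real sequence."""
    return 1 / (trials + 1)

# Floors for the trial counts used in the run tables.
floors = {trials: round(p_floor(trials), 6) for trials in (50, 80, 100, 400)}
```

These floors match the values reported in the run tables: 0.019608 for 50 trials, about 0.01235 for 80, about 0.0099 for 100, and about 0.00249 for 400. A median p sitting exactly at the floor therefore means "no sampled control beat the real sequence", not a precise tail probability.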

D6. Known Likely Sources of Signal

The E1 compression advantage may partly reflect known arithmetic constraints and scale effects, including:

This is why E2 adds block, wheel-aware, Markov-preserving, Cramér-like, and depth/scaling controls. The purpose is to separate obvious arithmetic structure from deeper sequential compression structure.

D7. Current Limitation

The current result is a completed first-pass E2/SEA signal plus v0.3–v0.7 generator-carrier evidence, not a final theorem. Raw gap defeats shuffle, wheel6, Markov1, and Cramér-like controls but still hits Markov2 as the hard LZ-control boundary. The generator layer now beats Markov2 on holdout at 1M, and v0.7 shows that the raw-gap carrier is not clean pure density: generic loglog scale-clock conditioning currently wins the split ladder. The work still needs a v0.7.1 law-sheet hygiene patch, focused 2M/5M replication, independent code review, exact parser test vectors, and a formal bridge to prime-counting error or an RH-equivalent criterion.

D8. Arena Version Notes

Appendix E – Next Update Slot

Next: Patch v0.7.1 law-sheet split naming, then run a narrow 2M/5M focused replication for clock_loglog_markov2, clock_log_markov2, density_wrong_scale_markov2, density_dcc_no_motif, density_markov2, density_shuffled_markov2, clock_dcc_no_motif, variable_markov, and markov3. Do not widen until the log/loglog-clock law and the residue variable-Markov law are written as compact candidate invariants.


Version 0.12 · May 12, 2026 · v0.7 carrier-stress integrated · raw-gap Markov2 sensor boundary preserved · Markov2 holdout floor beaten again at 1M · carrier narrowed from density-scale to generic log/loglog scale-clock transition structure · no RH proof claim · next: v0.7.1 law-sheet patch and 2M/5M focused replication.