8Z-AC v2.0 "Classical Max" — Five-Phase Roadmap

01 Where We Stand: March 2026

-14.7 KB	avFLAC beats lax_t6 (Abyssal)
+7 KB	avFLAC vs lax_t6 (Ethereal Arc)
8 / 18	Files beating best baseline
3 / 18	Files beating OptimFROG (!)

Two encoder families are now validated across 18 files (3 full tracks + 15 clips): aFLAC/avFLAC (valid .flac output, beats lax_t6 on Abyssal and 8 clips) and AC (custom .8za format, MDL frame probe). avFLAC beats OptimFROG on 3 Radiohead clips (up to 9.2% smaller) by exploiting per-segment block size optimization that OFR's fixed architecture cannot match. The .8za codec is where the deeper gains live — unconstrained by FLAC spec, free to use any predictor, any entropy coder, any block size.

Codec	Ethereal Arc (bytes)	Ratio	Abyssal (bytes)	Ratio	vs lax_t6 / note
OptimFROG	7,530,383	43.79%	16,283,000	47.05%	Target
WavPack -hh	7,934,862	46.14%	17,392,334	50.26%	-0.2 to +0.4%
lax_t6	7,951,589	46.24%	17,320,224	50.05%	baseline
8Z-avFLAC v1.2	7,958,631	46.28%	17,305,557	50.01%	+7 KB / -14.7 KB
8Z-AC v1.9	~7,993,000	46.48%	17,512,422	50.61%	MDL frame probe
FLAC -12 (max)	7,996,121	46.49%	17,445,199	50.41%	old baseline

The honest gap: OptimFROG leads by ~6% on both songs. This gap is consistent — 435 KB on Ethereal Arc, 1,028 KB on Abyssal. Research shows ~60% of this gap is prediction quality (what we model) and ~40% is entropy coding (how we encode residuals). The "Classical Max" roadmap attacks both sides.

02 The Five-Phase Roadmap

Each phase is implemented in a separate chat session using its own CONTINUE paper. Each phase is tested and validated before the next begins. MDL is the arbiter — new techniques compete in the arena; they don't replace existing winners.

C1 rANS Entropy → C2 OLS+NLMS Predictor → C3 Joint Stereo → C4+C5 Bitplane + DDS

Phase	Technique	Expected Gain	Effort	CONTINUE Paper
C1	rANS entropy coder (MDL arena)	2–4%	2–3 weeks	CONTINUE_AC_C1_rANS.md
C2	OLS+NLMS cascade predictor	2–3%	2–3 weeks	CONTINUE_AC_C2_OLS_NLMS.md
C3	Joint stereo prediction	1–1.5%	1–2 weeks	CONTINUE_AC_C3_Joint_Stereo.md
C4	Bitplane coder with SSE	0.5–1%	2 weeks	CONTINUE_AC_C4_Bitplane_SSE.md
C5	DDS meta-parameter optimization	0.5–1%	1 week	CONTINUE_AC_C5_DDS_Optimization.md

Gains compound but not linearly. Better prediction (C2) makes entropy coding (C1) less impactful because residuals are already smaller. Real cumulative gain is likely 70–80% of the sum. Conservative estimate: 70% × 10.5% (the sum of the upper-bound estimates) ≈ 7.3% total improvement, exceeding the ~6% OptimFROG gap.
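The discount arithmetic can be checked at both ends of the per-phase ranges. The snippet below is only a sketch of that calculation: the function name and the two lists are illustrative, with the values copied from the Expected Gain column of the table above.

```python
def discounted_total(phase_gains_percent, interaction_factor=0.7):
    """Sum the per-phase gains, then scale by the interaction discount."""
    return interaction_factor * sum(phase_gains_percent)

# Lower and upper ends of the Expected Gain ranges for C1..C5.
LOWER = [2.0, 2.0, 1.0, 0.5, 0.5]   # sums to 6.0
UPPER = [4.0, 3.0, 1.5, 1.0, 1.0]   # sums to 10.5

low = discounted_total(LOWER)    # 0.7 * 6.0
high = discounted_total(UPPER)   # 0.7 * 10.5
print(f"discounted cumulative gain: {low:.2f}% to {high:.2f}%")
```

Only the upper end of this range clears the ~6% gap, so the per-phase validation step matters: each phase's measured gain should replace its estimate before the next phase starts.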

03 Phase C1: rANS Entropy Coder

📄 CONTINUE_AC_C1_rANS.md (434 lines)

C1 — rANS Joins the MDL Arena Alongside Rice + LZMA

We do NOT replace Rice with rANS. We ADD rANS as a third entropy backend. Per partition, all three compete and MDL picks the cheapest. Rice wins clean tonal partitions (zero overhead). rANS wins mixed/transient partitions where Rice's integer k-parameter can't capture the true distribution shape.

Key Design Choices

Per-Partition Entropy Selection

Different partitions within the same frame may use different entropy coders. A frame might have Rice on the first partition (tonal, clean Laplace) and rANS on the third (transient, heavy tails). MDL decides per partition, not per frame.

rANS Specification

Precision: M=4096 (12-bit frequency tables). State: 32-bit with 16-bit renormalization. Probability model: Laplace shape parameter (1 byte) when distribution fits, full RLE histogram when it doesn't.

Side-info cost: ~16–32 bits per partition vs Rice's 4 bits. rANS has a 12–28 bit handicap — must save more on residual encoding to justify itself. MDL handles this automatically.
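For orientation, here is a minimal rANS round trip with the parameters listed above (M=4096, 32-bit state, 16-bit renormalization). It is a sketch, not the .8za bitstream: the function names are hypothetical, the frequency table is passed in directly instead of being derived from a Laplace shape parameter or RLE histogram, and the words are kept in a Python list.

```python
SCALE_BITS = 12            # M = 4096, as in the spec above
M = 1 << SCALE_BITS
RANS_L = 1 << 16           # lower bound of the 32-bit state interval

def cumulative(freqs):
    """Map each symbol to the start of its slot range; freqs must sum to M."""
    assert sum(freqs.values()) == M and all(f > 0 for f in freqs.values())
    cum, start = {}, 0
    for sym in sorted(freqs):
        cum[sym] = start
        start += freqs[sym]
    return cum

def rans_encode(symbols, freqs):
    cum = cumulative(freqs)
    state, words = RANS_L, []
    for sym in reversed(symbols):           # rANS encodes back to front
        f, c = freqs[sym], cum[sym]
        if state >= ((RANS_L >> SCALE_BITS) << 16) * f:
            words.append(state & 0xFFFF)    # 16-bit renormalization
            state >>= 16
        state = ((state // f) << SCALE_BITS) + (state % f) + c
    words += [state & 0xFFFF, state >> 16]  # flush the final 32-bit state
    words.reverse()                         # decoder reads front to back
    return words

def rans_decode(words, count, freqs):
    cum = cumulative(freqs)
    slot_to_sym = [None] * M                # slot -> symbol lookup table
    for sym, f in freqs.items():
        for slot in range(cum[sym], cum[sym] + f):
            slot_to_sym[slot] = sym
    state, pos, out = (words[0] << 16) | words[1], 2, []
    for _ in range(count):
        slot = state & (M - 1)
        sym = slot_to_sym[slot]
        state = freqs[sym] * (state >> SCALE_BITS) + slot - cum[sym]
        if state < RANS_L:                  # refill 16 bits from the stream
            state = (state << 16) | words[pos]
            pos += 1
        out.append(sym)
    return out

# Toy partition with a three-symbol residual alphabet.
freqs = {-1: 1024, 0: 2048, 1: 1024}
symbols = [0, 1, -1, 0, 0, 1, 0, -1] * 50
words = rans_encode(symbols, freqs)
decoded = rans_decode(words, len(symbols), freqs)
```

The payload size in bits is `16 * len(words)`. The descriptor that tells the decoder how to rebuild `freqs` is not counted here; that is the 16–32 bit side-info discussed above.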

Per partition in each subframe:
  Rice encoded size  → X bits  (+ 4 bits side-info: k parameter)
  rANS encoded size  → Y bits  (+ 16–32 bits side-info: distribution descriptor)
  LZMA encoded size  → Z bits  (existing fallback)
  MDL winner = min(X + overhead_X, Y + overhead_Y, Z + overhead_Z)
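The selection rule above can be sketched in a few lines. Everything here is an illustrative assumption, not the 8Z-AC implementation: the function names are hypothetical, the Rice cost is exact for a zigzag-mapped Rice code, the rANS cost is approximated by the partition's empirical entropy plus a fixed 32-bit descriptor, and the LZMA cost is the size of Python's `lzma.compress` output on 32-bit little-endian residuals.

```python
import lzma
import math
from collections import Counter

def zigzag(r):
    """Map signed residuals to unsigned: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return (r << 1) if r >= 0 else ((-r << 1) - 1)

def rice_cost_bits(residuals, max_k=14):
    """Exact Rice payload for the best k, plus 4 bits of side-info for k."""
    mapped = [zigzag(r) for r in residuals]
    payload = min(sum((u >> k) + 1 + k for u in mapped)
                  for k in range(max_k + 1))
    return payload + 4

def rans_cost_bits(residuals, descriptor_bits=32):
    """Estimate: empirical entropy of the partition plus descriptor side-info."""
    n = len(residuals)
    counts = Counter(residuals)
    entropy_bits = -sum(c * math.log2(c / n) for c in counts.values())
    return math.ceil(entropy_bits) + descriptor_bits

def lzma_cost_bits(residuals):
    """Size of an LZMA container holding the residuals as 32-bit integers."""
    raw = b"".join(int(r).to_bytes(4, "little", signed=True) for r in residuals)
    return 8 * len(lzma.compress(raw))

def pick_entropy_coder(residuals):
    """Return (winner, costs) where the winner has the fewest total bits."""
    costs = {
        "rice": rice_cost_bits(residuals),
        "rans": rans_cost_bits(residuals),
        "lzma": lzma_cost_bits(residuals),
    }
    return min(costs, key=costs.get), costs

winner, costs = pick_entropy_coder([0, -1, 1, 0, 2, -2, 1, 0] * 32)
```

All three costs are in bits, so the side-info handicap is part of the comparison and needs no separate tuning.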

Expected gain: 2–4% on mixed/transient content. ~0% on clean tonal (Rice wins there). On Ethereal Arc, expect ~60 KB improvement; on Abyssal, ~340–680 KB.

04 Phase C2: OLS+NLMS Cascade Predictor

📄 CONTINUE_AC_C2_OLS_NLMS.md (420 lines)

C2 — The Biggest Single Improvement

The encode.su community consensus: "Most of the compression of OptimFROG and SAC comes from low-order OLS." The predictor matters more than the entropy coder. SAC (MIT-licensed) uses OLS+NLMS and achieves compression within ~1% of OptimFROG.

Why OLS Beats Standard LPC

Standard LPC (Levinson-Durbin, what we use now): computes autocorrelation, solves Toeplitz system. Optimizes for the signal's average statistical properties, assuming stationarity.

OLS (Ordinary Least Squares): forms the actual sample matrix, solves X'Xβ = X'y via Cholesky decomposition. Optimizes for what actually happened in this specific block. No stationarity assumption.

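On non-stationary audio (which is most music), OLS wins because it minimizes the actual squared prediction error over the block's own samples, rather than fitting a windowed autocorrelation estimate that assumes the statistics hold across the block.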
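As a concrete reference for the normal equations X'Xβ = X'y, here is a dependency-free sketch that accumulates X'X and X'y from a block and solves them with a Cholesky factorization. The function names and the small `ridge` term (added to the diagonal to keep the factorization stable on near-silent blocks) are assumptions for illustration; a real encoder would more likely use a numerical library.

```python
import math

def cholesky(a):
    """Lower-triangular L with L * L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def ols_coefficients(x, order, ridge=1e-9):
    """Solve X'X beta = X'y, where each row of X holds the `order` samples
    preceding the target sample (most recent first)."""
    xtx = [[0.0] * order for _ in range(order)]
    xty = [0.0] * order
    for t in range(order, len(x)):
        past = [x[t - 1 - i] for i in range(order)]
        for i in range(order):
            xty[i] += past[i] * x[t]
            for j in range(i + 1):          # lower triangle is enough
                xtx[i][j] += past[i] * past[j]
    for i in range(order):
        xtx[i][i] += ridge
    low = cholesky(xtx)
    z = [0.0] * order                       # forward solve: L z = X'y
    for i in range(order):
        z[i] = (xty[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    beta = [0.0] * order                    # back solve: L^T beta = z
    for i in reversed(range(order)):
        tail = sum(low[k][i] * beta[k] for k in range(i + 1, order))
        beta[i] = (z[i] - tail) / low[i][i]
    return beta

# A signal that follows x[t] = 1.5*x[t-1] - 0.7*x[t-2] exactly.
signal = [1000.0, 900.0]
for _ in range(198):
    signal.append(1.5 * signal[-1] - 0.7 * signal[-2])
beta = ols_coefficients(signal, order=2)    # close to [1.5, -0.7]
```

The fit targets the block's own samples directly, which is the difference from the autocorrelation path described above.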

The NLMS Cascade

Two-Stage Prediction

Stage 1: OLS predictor → residuals_1 (better than LPC residuals)

Stage 2: NLMS adaptive filter on residuals_1 → residuals_2 (catches time-varying patterns OLS missed)

Key advantage: NLMS needs NO side-info for its weights. It is a causal filter: the decoder runs the identical NLMS update on the stage-1 residuals it has already reconstructed, so it arrives at identical adapted weights. Free compression gain.
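The encoder/decoder symmetry can be shown with a small sketch. The function name `nlms_stage` and its parameters are hypothetical, and the weights are kept in floating point for brevity; a production version would need fixed-point arithmetic to stay bit-exact across platforms.

```python
def nlms_stage(values, taps=16, mu=0.5, eps=1.0, decode=False):
    """Run the stage-2 NLMS filter.

    Encoding: `values` are stage-1 residuals, output is stage-2 residuals.
    Decoding: `values` are stage-2 residuals, output is stage-1 residuals.
    Both directions update the weights from already-known stage-1 values only.
    """
    weights = [0.0] * taps
    history = [0] * taps                    # most recent stage-1 residual first
    out = []
    for v in values:
        pred = round(sum(w * h for w, h in zip(weights, history)))
        r1 = v + pred if decode else v      # stage-1 residual at this step
        err = r1 - pred                     # this is the stage-2 residual
        out.append(r1 if decode else err)
        # Normalized LMS update; eps avoids division by zero on silence.
        gain = mu * err / (eps + sum(h * h for h in history))
        weights = [w + gain * h for w, h in zip(weights, history)]
        history = [r1] + history[:-1]
    return out

stage1 = [((t * 37) % 101) - 50 for t in range(300)]   # toy residual stream
stage2 = nlms_stage(stage1)
restored = nlms_stage(stage2, decode=True)
```

Because the update at each step depends only on values the decoder already holds, `restored` equals `stage1` without any weights being transmitted.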

Fixed-Point Determinism

OLS computed in float64, then quantized to fixed-point (same qlevel path as existing LPC). Decoder uses quantized coefficients — perfectly deterministic, platform-independent. Same principle FLAC uses, just OLS instead of autocorrelation.
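A sketch of that quantization path, assuming a FLAC-style scheme in which coefficients are scaled by a power of two, rounded to integers, and applied with an arithmetic right shift. The names and the 15-bit default precision are illustrative; the document's actual qlevel path may differ.

```python
def quantize_coefficients(coeffs, precision_bits=15):
    """Scale float coefficients to signed integers of `precision_bits` bits.
    Returns (integer coefficients, shift)."""
    limit = 1 << (precision_bits - 1)
    max_abs = max(abs(c) for c in coeffs)
    shift = precision_bits - 1
    while shift > 0 and round(max_abs * (1 << shift)) >= limit:
        shift -= 1
    return [round(c * (1 << shift)) for c in coeffs], shift

def predict(samples, t, qcoeffs, shift):
    """Integer-only prediction of samples[t] from the preceding samples."""
    acc = sum(q * samples[t - 1 - i] for i, q in enumerate(qcoeffs))
    return acc >> shift                     # floor shift, same on every platform

def to_residuals(samples, qcoeffs, shift):
    order = len(qcoeffs)
    head = list(samples[:order])            # warm-up samples stored verbatim
    return head + [samples[t] - predict(samples, t, qcoeffs, shift)
                   for t in range(order, len(samples))]

def from_residuals(residuals, qcoeffs, shift):
    order = len(qcoeffs)
    samples = list(residuals[:order])
    for t in range(order, len(residuals)):
        samples.append(residuals[t] + predict(samples, t, qcoeffs, shift))
    return samples

qcoeffs, shift = quantize_coefficients([1.5, -0.7])
pcm = [1000, 900, 650, 345, 62, -148, -265, -294, -256]
residuals = to_residuals(pcm, qcoeffs, shift)
restored = from_residuals(residuals, qcoeffs, shift)
```

The float64 solve only ever influences which integers are written to the stream; encoder and decoder both predict from the integers, so reconstruction does not depend on floating-point behavior.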

Expected gain: 2–3%. Combined with C1: 4–7% cumulative. OLS wins ~30–60% of frames; LPC wins the rest. MDL selects per-frame.

05 Phase C3: Joint Stereo Prediction

📄 CONTINUE_AC_C3_Joint_Stereo.md (275 lines)

C3 — Generalized Stereo Decorrelation (Ghido, IEEE 2003)

Current approach: transform L/R to Mid/Side, then predict each channel independently. This captures instantaneous correlation but misses delayed cross-channel correlation (room acoustics, mic spacing) and frequency-dependent correlation (mono bass, stereo treble).

Joint prediction: Each channel uses past samples from BOTH channels as prediction inputs. For left channel: 16 taps from L (same as now) + 8 taps from R (new). The OLS sample matrix simply gets more columns — no new algorithm, same Cholesky solver.

6 stereo modes compete per frame: Independent, Mid/Side, Left-Side, Right-Side, Joint Predict, Joint Mid/Side. MDL picks the cheapest. Cross-channel taps cost ~24 bytes extra per frame in coefficients — must save more than that on residuals to justify.
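The "more columns" idea looks like this in code. The helper below is a hypothetical sketch that only builds the regression rows for one channel; the rows can then go to whatever least-squares solver the predictor already uses. Tap counts default to the 16 + 8 split mentioned above.

```python
def joint_design_rows(target, other, own_taps=16, cross_taps=8):
    """Build (rows, y) for predicting `target` from its own past samples plus
    past samples of the `other` channel. Only strictly past samples are used,
    so a decoder can rebuild every row from already-decoded audio."""
    start = max(own_taps, cross_taps)
    rows, y = [], []
    for t in range(start, len(target)):
        own = [target[t - 1 - i] for i in range(own_taps)]
        cross = [other[t - 1 - i] for i in range(cross_taps)]
        rows.append(own + cross)            # own-channel columns, then cross
        y.append(target[t])
    return rows, y

left = list(range(100))
right = [2 * v for v in left]
rows, y = joint_design_rows(left, right, own_taps=3, cross_taps=2)
```

With `cross_taps=0` the rows reduce to the single-channel case, which is why joint prediction can sit in the arena as one more mode without replacing the existing ones.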

Expected gain: 1–1.5%. Ghido's original paper reported ~1.5%. Cumulative C1+C2+C3: ~5–8.5%.

06 Phase C4: Bitplane Coder with SSE

📄 CONTINUE_AC_C4_Bitplane_SSE.md (264 lines)

C4 — Context-Dependent Bit-Level Coding

Rice and rANS treat each symbol independently. A bitplane coder decomposes residuals into individual bits (MSB to LSB) and models each bit conditionally on context: upper bits already coded, magnitude of neighboring samples, channel correlation. This exploits clustering — large residuals follow large residuals — that symbol-level coders miss.

SSE (Secondary Symbol Estimation) refines the primary context model's probability estimates using recent coding history. This is what separates SAC/PAQ-class compressors from simpler approaches.
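To make the context idea concrete, here is a sketch of a cost estimator for the primary model only (no SSE stage, no actual arithmetic coder). The context choice, the count-based probability estimate, and the function name are all assumptions for illustration. The function returns the ideal code length in bits that an adaptive binary coder would approach.

```python
import math

def bitplane_cost_bits(residuals, planes=16):
    """Ideal adaptive code length for magnitude bits coded MSB to LSB.

    Context for each bit = (plane index,
                            whether a higher bit of this sample was 1,
                            bit length of the previous sample's magnitude).
    Every part of the context is known to the decoder before the bit arrives.
    """
    counts = {}                              # context -> [zeros seen, ones seen]
    total = 0.0
    prev_mag = 0
    nonzero = 0
    for r in residuals:
        mag = abs(r)
        significant = 0
        for plane in reversed(range(planes)):
            bit = (mag >> plane) & 1
            ctx = (plane, significant, min(prev_mag.bit_length(), 15))
            c = counts.setdefault(ctx, [1, 1])
            total += -math.log2(c[bit] / (c[0] + c[1]))
            c[bit] += 1                      # adapt after coding the bit
            significant |= bit
        prev_mag = mag
        nonzero += 1 if mag else 0
    return total + nonzero                   # one raw sign bit per nonzero value

quiet = bitplane_cost_bits([0] * 64)
bursty = bitplane_cost_bits([0] * 32 + [900, -850, 920, -880] * 8)
```

Because the previous sample's magnitude is part of the context, runs of large residuals train their own statistics separately from quiet stretches, which is the clustering effect described above.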

Fourth entropy backend in MDL arena: Rice, rANS, Bitplane, LZMA all compete per partition. Bitplane wins on ~20–40% of partitions — the hardest, most non-stationary ones.

Expected gain: 0.5–1% over C1's rANS. Cumulative C1–C4: ~5.5–9.5%.

07 Phase C5: DDS Meta-Parameter Optimization

📄 CONTINUE_AC_C5_DDS_Optimization.md (311 lines)

C5 — The Exhaustive Last Mile

After C1–C4, the per-frame parameter space is ~15 million combinations. Grid search samples a tiny fraction. DDS (Dynamically Dimensioned Search) is a black-box optimizer that finds near-optimal configurations in 200–500 evaluations. It automatically transitions from broad exploration to local refinement — no tuning parameters except evaluation budget.
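A compact sketch of the DDS loop, following the published algorithm's structure: perturb a shrinking random subset of dimensions, reflect at the bounds, keep the candidate only if it is no worse. The function name, the continuous parameter encoding, and the toy cost function are assumptions; codec parameters that are discrete would need rounding inside `cost`.

```python
import math
import random

def dds_minimize(cost, x0, bounds, budget=200, r=0.2, seed=0):
    """Dynamically Dimensioned Search, warm-started from x0.
    `bounds` is a list of (low, high) per dimension; returns (best_x, best_cost)."""
    assert budget > 1
    rng = random.Random(seed)                # fixed seed -> deterministic search
    best_x, best_f = list(x0), cost(list(x0))
    dims = len(best_x)
    for i in range(1, budget + 1):
        # Probability of perturbing each dimension decays from 1 to 0.
        p = 1.0 - math.log(i) / math.log(budget)
        chosen = [d for d in range(dims) if rng.random() < p]
        if not chosen:
            chosen = [rng.randrange(dims)]   # always perturb at least one
        cand = list(best_x)
        for d in chosen:
            lo, hi = bounds[d]
            v = cand[d] + rng.gauss(0.0, 1.0) * r * (hi - lo)
            if v < lo:                       # reflect at the lower bound
                v = lo + (lo - v)
                if v > hi:
                    v = lo
            elif v > hi:                     # reflect at the upper bound
                v = hi - (v - hi)
                if v < lo:
                    v = hi
            cand[d] = v
        f = cost(cand)
        if f <= best_f:                      # greedy acceptance
            best_x, best_f = cand, f
    return best_x, best_f

def toy_cost(x):
    """Stand-in for 'encoded frame size' as a function of two parameters."""
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

start = [0.0, 0.0]
best_x, best_f = dds_minimize(toy_cost, start, [(-5.0, 5.0), (-5.0, 5.0)])
```

Early iterations move most dimensions at once (broad exploration); late iterations move one or two (local refinement). Greedy acceptance means the result is never worse than the warm start.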

DCC + DDS integration: DCC assigns difficulty tier, DDS refines within the effort budget. SILENCE/EASY frames skip DDS (grid search sufficient). MEDIUM gets 100 iterations. HARD gets 300–500. Warm-started from grid search best — DDS only needs to improve an already-good starting point.

SAC uses DDS for per-frame parameter optimization. Proven technique for audio compression.

Expected gain: 0.5–1%. The diminishing-returns scraper. Cumulative C1–C5: ~6–10.5%.

08 Implementation Timeline

C1: rANS Entropy Coder (2–3 weeks)

Add rANS to MDL arena alongside Rice + LZMA. Per-partition selection. Format version bump.

Start from 8Z-AC.py v1.9 (MDL frame probe). Add estimate_rans_cost(), rans_encode_partition(), rans_decode_partition(). Test: compression must improve on both Ethereal Arc and Abyssal. No regression on any frame (MDL fallback to Rice).

C2: OLS+NLMS Cascade Predictor (2–3 weeks)

OLS via Cholesky + NLMS adaptive filter. Fixed-point coefficients. DCC-gated depth.

Add compute_ols_coefficients(), nlms_cascade(). New predictor types 3 (OLS) and 4 (OLS_NLMS). OLS tries orders {4, 8, 12, 16, 20, 24, 32}. NLMS: 16-tap, mu∈{0.3, 0.5, 1.0}. Test: OLS should win 30–60% of frames.

C3: Joint Stereo Prediction (1–2 weeks)

Cross-channel OLS taps. 6 stereo modes in MDL arena. Causal decoder (no circular dependency).

Extend OLS sample matrix with cross-channel columns. Q∈{0, 4, 8, 12}. New stereo modes: JOINT_PREDICT, JOINT_MID_SIDE. Test: joint modes should win on spatially complex content. No gain expected on mono-ish tracks.

C4: Bitplane Coder with SSE (2 weeks)

Context-dependent bit-level coding. 4096 context bins. Binary ANS. Adaptive from scratch per frame.

Fourth entropy backend in MDL arena. MSB-to-LSB processing with context model. SSE corrects probability estimates. Test: bitplane should win on 20–40% of hardest partitions.

C5: DDS Meta-Parameter Optimization (1 week)

Dynamically Dimensioned Search over full parameter space. DCC-gated budget. Deterministic (fixed seed).

200–500 evaluations per frame on MEDIUM/HARD content. Warm-start from grid search best. Test: DDS must improve over grid search on at least 20% of frames. Total encode time target: 3–5 minutes per song.

09 Expected Performance Outcomes

Configuration	Est. Ethereal Arc	Ratio	vs FLAC-12	vs OFR
AC v1.9 (current)	~7,993,000	~46.48%	≈ 0%	+6.1%
+ C1 (rANS)	~7,810,000	~45.41%	−2.3%	+3.7%
+ C1 + C2 (OLS+NLMS)	~7,570,000	~44.02%	−5.3%	+0.5%
+ C1–C3 (Joint Stereo)	~7,460,000	~43.38%	−6.7%	−0.9%
+ C1–C5 (Full Classical Max)	~7,350,000	~42.74%	−8.1%	−2.4%
OptimFROG (reference)	7,530,383	43.79%	—	0%

If estimates hold: C1+C2 alone puts us within striking distance of OptimFROG (+0.5%). C1–C3 should surpass it. C4+C5 provide margin. These are conservative estimates at 70% of theoretical maximums — the real outcome depends on how well techniques interact on actual audio content.

10 Risk Matrix

Risk	Probability	Impact	Mitigation
rANS overhead exceeds savings on most frames	Low	Low	MDL gating — Rice always available as fallback
OLS FP determinism issues across platforms	Medium	High	Fixed-point coefficients (same path as LPC qlevel)
NLMS convergence too slow for short frames	Medium	Medium	DCC gates: skip cascade on SILENCE/EASY frames
Joint stereo coefficient overhead > savings	Medium	Low	MDL ensures no regression — falls back to M/S or independent
Bitplane context model doesn't converge in one frame	Medium	Medium	Adaptive from scratch; warmup cost ~0.5% on 4096-sample frames
DDS adds significant encode time for marginal gain	Medium	Low	DCC-gated: DDS only on MEDIUM/HARD frames
Gain estimates too optimistic (don't stack linearly)	High	Medium	Applied 70% discount. Each phase independently valuable.

11 The Core Principles

Rules That Don't Change

MDL is non-negotiable. Every technique competes. MDL decides. No hardcoded preferences. If OLS doesn't beat LPC on a frame, LPC wins that frame.

Lossless is non-negotiable. Byte-for-byte reconstruction, SHA3-verified. Fixed-point coefficients ensure determinism.

One phase at a time. Test, validate, measure. Each CONTINUE paper is a self-contained session. Don't start C2 until C1 is validated.

Python first. R&D mode. Speed optimization (C extension, Cython) comes after correctness.

No backward-compatibility constraints on new output. The format version bumps with each phase, so older decoders are not expected to read newer files. Old files stay readable.

Test on both songs. Ethereal Arc AND Abyssal. Plus 13-clip benchmark. Plus 30-song curator set when available.
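The lossless rule can be enforced with a small round-trip check. This is a sketch only: the function names are hypothetical, SHA3-256 is an assumed choice of SHA3 variant, and `zlib` stands in for the real encoder and decoder so the example runs on its own.

```python
import hashlib
import zlib

def sha3_hex(data: bytes) -> str:
    """Hex digest used to compare original and reconstructed PCM."""
    return hashlib.sha3_256(data).hexdigest()

def verify_lossless(pcm: bytes, encode, decode) -> bool:
    """Encode, decode, and confirm byte-for-byte reconstruction by hash."""
    expected = sha3_hex(pcm)
    restored = decode(encode(pcm))
    return len(restored) == len(pcm) and sha3_hex(restored) == expected

# Stand-in codec: zlib instead of the real encoder/decoder pair.
pcm = bytes(range(256)) * 64
ok = verify_lossless(pcm, zlib.compress, zlib.decompress)

# A decoder that drops the last byte must fail the check.
broken = verify_lossless(pcm, zlib.compress, lambda blob: zlib.decompress(blob)[:-1])
```

Running this check on every test file after each phase makes the "no regression" requirement mechanical.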

12 Parallel Track: xFLAC (FLAC-Format Improvements)

Two tracks, one engine. The AC "Classical Max" track (this document) works on the .8za custom format — unconstrained, maximum compression. In parallel, the xFLAC track improves our FLAC-format encoders (aFLAC, vFLAC, avFLAC) within FLAC spec constraints.

Improvement	Target	CONTINUE Paper
xFLAC-1: Smarter transient detection & block splitting	0.3–0.5%	CONTINUE_xFLAC_Improvements.md
xFLAC-2: LPC search expansion (TOP_K, windows)	0.3–0.5%	CONTINUE_xFLAC_Improvements.md
xFLAC-3: Segment boundary optimization (stitching overhead)	0.2–0.5%	CONTINUE_xFLAC_Improvements.md
vFLAC --fast / --full adaptive search	Speed control	CONTINUE_vFLAC_FastFull.md ✅ Done

C1 → C2 → C3 → C4 → C5

One phase at a time. Test. Validate. Advance. Let the math decide.

               8Z-AC "Classical Max" — Phase Dependencies

  ┌─────────────────────────────────────────────────────────┐
  │                    CURRENT: v1.9                         │
  │   LPC (Levinson-Durbin) + Rice + LZMA + DCC + MDL      │
  └──────────────────────┬──────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C1: rANS Entropy    │  + rANS per-partition selection   │
  │  (2–4% gain)         │  Rice stays in arena              │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C2: OLS+NLMS        │  + OLS predictor (fixed-point)    │
  │  (2–3% gain)         │  + NLMS cascade (no side-info)    │
  │  BIGGEST SINGLE GAIN │  LPC stays in arena               │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C3: Joint Stereo    │  + cross-channel OLS taps         │
  │  (1–1.5% gain)       │  + 6 stereo modes in MDL arena    │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C4: Bitplane + SSE  │  + context-dependent bit coding   │
  │  (0.5–1% gain)       │  + 4th entropy backend in MDL     │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C5: DDS Optimizer   │  + black-box parameter search     │
  │  (0.5–1% gain)       │  + DCC-gated evaluation budget    │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
               ┌───────────────────┐
               │   TARGET: ≤ OFR   │
               │   (~43.8% ratio)  │
               └───────────────────┘