AIM³ Institute · Ljubljana · March 2026

8Z-AC "Classical Max"

Five-phase roadmap to close the OptimFROG gap — C1 through C5

Author: Bojan Dobrečevič + Claude Opus 4.6
Status: Peer Review Submitted
Date: 2026-03-09

01 Where We Stand: March 2026

-14.7 KB   avFLAC beats lax_t6 (Abyssal)
+7 KB      avFLAC vs lax_t6 (Ethereal)
8 / 18     Files Beating Best Baseline
3 / 18     Files Beating OptimFROG (!)

Two encoder families are now validated across 18 files (3 full tracks + 15 clips): aFLAC/avFLAC (valid .flac output, beats lax_t6 on Abyssal and 8 clips) and AC (custom .8za format, MDL frame probe, where MDL is minimum description length). avFLAC beats OptimFROG (OFR) on 3 Radiohead clips (up to 9.2% smaller) by exploiting per-segment block size optimization that OFR's fixed architecture cannot match. The .8za codec is where the deeper gains live: unconstrained by the FLAC spec, it is free to use any predictor, any entropy coder, any block size.

Codec            Ethereal Arc   Ratio    Abyssal      Ratio    vs lax_t6
OptimFROG        7,530,383      43.79%   16,283,000   47.05%   Target
WavPack -hh      7,934,862      46.14%   17,392,334   50.26%   -0.2 to +0.4%
lax_t6           7,951,589      46.24%   17,320,224   50.05%   baseline
8Z-avFLAC v1.2   7,958,631      46.28%   17,305,557   50.01%   +7 KB / -14.7 KB
8Z-AC v1.9       ~7,993,000     46.48%   17,512,422   50.61%   MDL frame probe
FLAC -12 (max)   7,996,121      46.49%   17,445,199   50.41%   old baseline

The honest gap: OptimFROG leads by ~6% on both songs. This gap is consistent: 435 KB on Ethereal Arc, 1,028 KB on Abyssal. Research shows ~60% of this gap is prediction quality (what we model) and ~40% is entropy coding (how we encode residuals). The "Classical Max" roadmap attacks both sides.

02 The Five-Phase Roadmap

Each phase is implemented in a separate chat session using its own CONTINUE paper. Each phase is tested and validated before the next begins. MDL is the arbiter — new techniques compete in the arena; they don't replace existing winners.

C1: rANS Entropy → C2: OLS+NLMS Predictor → C3: Joint Stereo → C4+C5: Bitplane + DDS

Phase   Technique                         Expected Gain   Effort      CONTINUE Paper
C1      rANS entropy coder (MDL arena)    2–4%            2–3 weeks   CONTINUE_AC_C1_rANS.md
C2      OLS+NLMS cascade predictor        2–3%            2–3 weeks   CONTINUE_AC_C2_OLS_NLMS.md
C3      Joint stereo prediction           1–1.5%          1–2 weeks   CONTINUE_AC_C3_Joint_Stereo.md
C4      Bitplane coder with SSE           0.5–1%          2 weeks     CONTINUE_AC_C4_Bitplane_SSE.md
C5      DDS meta-parameter optimization   0.5–1%          1 week      CONTINUE_AC_C5_DDS_Optimization.md

Gains compound, but not linearly. Better prediction (C2) makes entropy coding (C1) less impactful because residuals are already smaller. Real cumulative gain is likely 70–80% of the sum. Conservative estimate: 70% × 10.5% (the sum of the per-phase upper bounds) ≈ 7.3% total improvement, which would exceed the 6% OptimFROG gap.
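The discount arithmetic can be checked directly from the per-phase ranges in the table above. The snippet below is only a worked calculation; the flat 70% and 80% discounts are the figures quoted in the text, and the function name is illustrative.

```python
# Per-phase expected gains in percent (low, high), copied from the roadmap table.
PHASE_GAINS = {
    "C1": (2.0, 4.0),
    "C2": (2.0, 3.0),
    "C3": (1.0, 1.5),
    "C4": (0.5, 1.0),
    "C5": (0.5, 1.0),
}


def discounted_total(gains, discount):
    """Sum the low and high ends of every range, then apply one flat discount."""
    low = sum(lo for lo, _ in gains.values())
    high = sum(hi for _, hi in gains.values())
    return low * discount, high * discount


low_70, high_70 = discounted_total(PHASE_GAINS, 0.70)
low_80, high_80 = discounted_total(PHASE_GAINS, 0.80)
print(f"70% discount: {low_70:.2f}% to {high_70:.2f}%")
print(f"80% discount: {low_80:.2f}% to {high_80:.2f}%")
```

The 7.3% headline figure corresponds to the high end at a 70% discount; the same discount applied to the low ends of the ranges gives a smaller total, which is worth keeping in mind when reading the projections.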

03 Phase C1: rANS Entropy Coder

📄 CONTINUE_AC_C1_rANS.md (434 lines)
C1 — rANS Joins the MDL Arena Alongside Rice + LZMA

We do NOT replace Rice with rANS. We ADD rANS as a third entropy backend. Per partition, all three compete and MDL picks the cheapest. Rice wins clean tonal partitions (only 4 bits of side-info). rANS wins mixed/transient partitions where Rice's integer k-parameter can't capture the true distribution shape.

Key Design Choices

Per-Partition Entropy Selection

Different partitions within the same frame may use different entropy coders. A frame might have Rice on the first partition (tonal, clean Laplace) and rANS on the third (transient, heavy tails). MDL decides per partition, not per frame.

rANS Specification

Precision: M=4096 (12-bit frequency tables). State: 32-bit with 16-bit renormalization. Probability model: Laplace shape parameter (1 byte) when distribution fits, full RLE histogram when it doesn't.
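A minimal sketch of an rANS coder with these parameters (12-bit frequencies, 32-bit state, 16-bit renormalization) might look like the following. It is an illustration under assumptions, not the C1 implementation: the function names are hypothetical, the frequency table is built from the partition's own histogram, and serializing that table as side-info (Laplace parameter or RLE histogram) is not shown.

```python
from collections import Counter

M_BITS = 12
M = 1 << M_BITS      # 4096: total of all symbol frequencies
RANS_L = 1 << 16     # lower bound of the normalized state interval [2^16, 2^32)


def build_tables(symbols):
    """Scale the histogram so the frequencies sum to M (every symbol keeps freq >= 1)."""
    counts = Counter(symbols)
    n = len(symbols)
    freqs = {s: max(1, c * M // n) for s, c in counts.items()}
    top = max(freqs, key=freqs.get)
    freqs[top] += M - sum(freqs.values())   # push the rounding error onto the top symbol
    assert freqs[top] > 0, "too many distinct symbols for this naive normalization"
    cum, start = {}, 0
    for s in sorted(freqs):
        cum[s] = start
        start += freqs[s]
    slot_to_sym = [s for s in sorted(freqs) for _ in range(freqs[s])]
    return freqs, cum, slot_to_sym


def rans_encode(symbols, freqs, cum):
    """Encode in reverse so the decoder can emit symbols in forward order."""
    x = RANS_L
    words = []
    for s in reversed(symbols):
        f = freqs[s]
        x_max = ((RANS_L >> M_BITS) << 16) * f
        while x >= x_max:                   # renormalize: spill the low 16 bits
            words.append(x & 0xFFFF)
            x >>= 16
        x = (x // f) * M + (x % f) + cum[s]
    words.append(x & 0xFFFF)                # flush the final 32-bit state
    words.append(x >> 16)
    words.reverse()
    return words


def rans_decode(words, count, freqs, cum, slot_to_sym):
    it = iter(words)
    x = (next(it) << 16) | next(it)         # restore the flushed state
    out = []
    for _ in range(count):
        slot = x & (M - 1)
        s = slot_to_sym[slot]
        x = freqs[s] * (x >> M_BITS) + slot - cum[s]
        while x < RANS_L:                   # renormalize: pull in 16 more bits
            x = (x << 16) | next(it)
        out.append(s)
    return out


residuals = [0, 0, 1, -1, 0, 2, 0, 0, -3, 1, 0, 0, 5, -1, 0] * 20
freqs, cum, slot_to_sym = build_tables(residuals)
words = rans_encode(residuals, freqs, cum)
decoded = rans_decode(words, len(residuals), freqs, cum, slot_to_sym)
print(len(words) * 16, "payload bits for", len(residuals), "residuals")
```

The payload always includes the 32-bit state flush, which is one reason very short partitions tend to favor Rice.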

Side-info cost: ~16–32 bits per partition vs Rice's 4 bits. rANS has a 12–28 bit handicap — must save more on residual encoding to justify itself. MDL handles this automatically.

Per partition in each subframe:
  Rice encoded size  → X bytes  (4 bits side-info: k parameter)
  rANS encoded size  → Y bytes  (16-32 bits: distribution descriptor)
  LZMA encoded size  → Z bytes  (existing fallback)
  MDL winner = min(X + overhead_X, Y + overhead_Y, Z + overhead_Z)
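The selection rule above can be sketched as a small arena of cost estimators. This is a hedged illustration: the function names are made up for the example, the Rice cost uses the usual unary-plus-k-bits count with 4 bits of side-info, and the LZMA entry simply measures a raw LZMA2 stream from Python's standard `lzma` module rather than the codec's existing fallback path. An rANS estimator such as the planned estimate_rans_cost() would be registered the same way.

```python
import lzma
import struct


def zigzag(r):
    """Map signed residuals to non-negative integers: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (r << 1) if r >= 0 else ((-r << 1) - 1)


def rice_total_bits(residuals, max_k=14):
    """Cheapest Rice payload over k = 0..max_k, plus 4 bits of side-info for k."""
    mapped = [zigzag(r) for r in residuals]
    payload = min(sum((u >> k) + 1 + k for u in mapped) for k in range(max_k + 1))
    return payload + 4


def lzma_total_bits(residuals):
    """Size in bits of a raw LZMA2 stream over the residuals packed as int32."""
    raw = struct.pack(f"<{len(residuals)}i", *residuals)
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
    return 8 * len(lzma.compress(raw, format=lzma.FORMAT_RAW, filters=filters))


def mdl_winner(residuals, backends):
    """Evaluate every backend on one partition and return the cheapest."""
    costs = {name: estimate(residuals) for name, estimate in backends.items()}
    winner = min(costs, key=costs.get)
    return winner, costs[winner], costs


# Hypothetical arena; a third entry for rANS would be added here.
ARENA = {"rice": rice_total_bits, "lzma": lzma_total_bits}

partition = [0, 1, -1, 2, 0, -2, 1, 0, 0, -1, 3, 0, 1, -1, 0, 0]
winner, bits, costs = mdl_winner(partition, ARENA)
print(winner, bits, costs)
```

Because each estimator returns payload plus side-info in the same unit (bits), the comparison stays fair even when one backend carries a much larger descriptor than another.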

Expected gain: 2–4% on mixed/transient content. ~0% on clean tonal (Rice wins there). On Ethereal Arc, expect ~60 KB improvement; on Abyssal, ~340–680 KB.

04 Phase C2: OLS+NLMS Cascade Predictor

📄 CONTINUE_AC_C2_OLS_NLMS.md (420 lines)
C2 — The Biggest Single Improvement

The encode.su community consensus: "Most of the compression of OptimFROG and SAC comes from low-order OLS." The predictor matters more than the entropy coder. SAC (MIT-licensed) uses OLS+NLMS and achieves compression within ~1% of OptimFROG.

Why OLS Beats Standard LPC

Standard LPC (Levinson-Durbin, what we use now): computes autocorrelation, solves Toeplitz system. Optimizes for the signal's average statistical properties, assuming stationarity.

OLS (Ordinary Least Squares): forms the actual sample matrix, solves X'Xβ = X'y via Cholesky decomposition. Optimizes for what actually happened in this specific block. No stationarity assumption.

On non-stationary audio (which is most music), OLS wins because it adapts to local signal behavior rather than averaging over the block.
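As a rough illustration of the OLS step, the sketch below accumulates the normal equations X'Xβ = X'y over one block and solves them with a Cholesky factorization, in pure Python. The function name, the tiny ridge term (added only to keep the factorization stable on silent or degenerate blocks), and the test signal are assumptions made for the example.

```python
import math


def ols_coefficients(x, order, ridge=1e-9):
    """Least-squares predictor: x[t] ~ sum(beta[i] * x[t-1-i]) over this block."""
    # Accumulate A = X'X (lower triangle) and b = X'y row by row.
    A = [[ridge if i == j else 0.0 for j in range(order)] for i in range(order)]
    b = [0.0] * order
    for t in range(order, len(x)):
        past = [x[t - 1 - i] for i in range(order)]
        for i in range(order):
            b[i] += past[i] * x[t]
            for j in range(i + 1):
                A[i][j] += past[i] * past[j]
    # Cholesky factorization A = L L'.
    L = [[0.0] * order for _ in range(order)]
    for i in range(order):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward substitution L z = b, then back substitution L' beta = z.
    z = [0.0] * order
    for i in range(order):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    beta = [0.0] * order
    for i in reversed(range(order)):
        tail = sum(L[k][i] * beta[k] for k in range(i + 1, order))
        beta[i] = (z[i] - tail) / L[i][i]
    return beta


# Block generated by an exact second-order recurrence; OLS should recover it.
x = [1.0, 0.5]
for _ in range(62):
    x.append(1.5 * x[-1] - 0.7 * x[-2])
beta = ols_coefficients(x, order=2)
print(beta)
```

Unlike the autocorrelation method, nothing here assumes the block is stationary or windowed: the fit is to the samples that actually occurred.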

The NLMS Cascade

Two-Stage Prediction

Stage 1: OLS predictor → residuals_1 (better than LPC residuals)

Stage 2: NLMS adaptive filter on residuals_1 → residuals_2 (catches time-varying patterns OLS missed)

Key advantage: NLMS needs NO side-info. It's a causal filter — the decoder runs identical NLMS on the same residual stream, producing identical adapted weights. Free compression gain.
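The symmetry argument can be sketched as follows: one function runs stage 2 in either direction, and the weights adapt only from values both sides already know. This is a simplified illustration, not the planned nlms_cascade(): the function name is hypothetical, the 16 taps and mu = 0.5 are taken from the C2 test plan, and floating-point weights are used for readability (bit-exact behavior across platforms would need integer arithmetic).

```python
import math


def nlms_stage(stream, taps=16, mu=0.5, decode=False, eps=1e-6):
    """Encoder: residuals_1 -> residuals_2. Decoder: residuals_2 -> residuals_1."""
    weights = [0.0] * taps
    history = [0.0] * taps          # past stage-1 residuals, newest first
    out = []
    for value in stream:
        prediction = round(sum(w * h for w, h in zip(weights, history)))
        if decode:
            residual_2 = value
            residual_1 = value + prediction   # undo the encoder's subtraction
            out.append(residual_1)
        else:
            residual_1 = value
            residual_2 = value - prediction
            out.append(residual_2)
        # Normalized LMS update, identical on both sides.
        energy = sum(h * h for h in history) + eps
        step = mu * residual_2 / energy
        weights = [w + step * h for w, h in zip(weights, history)]
        history = [float(residual_1)] + history[:-1]
    return out


# Stand-in for stage-1 residuals: a slowly varying integer signal.
residuals_1 = [round(1000 * math.sin(0.1 * n)) for n in range(500)]
residuals_2 = nlms_stage(residuals_1)
restored = nlms_stage(residuals_2, decode=True)
print(sum(abs(v) for v in residuals_1), sum(abs(v) for v in residuals_2))
```

The prediction is rounded to an integer before subtraction, so the round trip is exact as long as encoder and decoder evaluate the same arithmetic in the same order.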

Fixed-Point Determinism

OLS computed in float64, then quantized to fixed-point (same qlevel path as existing LPC). Decoder uses quantized coefficients — perfectly deterministic, platform-independent. Same principle FLAC uses, just OLS instead of autocorrelation.
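One way to picture the quantization step is a shift-based scheme in the spirit of FLAC's coefficient quantizer. The sketch below is an assumption-laden illustration rather than the codec's qlevel path: the function names and the 15-bit precision are chosen for the example, and it assumes the resulting shift is non-negative.

```python
import math


def quantize_coefficients(coeffs, precision=15):
    """Scale float coefficients to signed integers of `precision` bits."""
    qmax = (1 << (precision - 1)) - 1
    qmin = -(1 << (precision - 1))
    cmax = max(abs(c) for c in coeffs)
    if cmax == 0.0:
        return [0] * len(coeffs), 0
    _, exponent = math.frexp(cmax)          # cmax < 2**exponent
    shift = precision - 1 - exponent
    assert shift >= 0, "coefficients too large for this sketch"
    scale = 2.0 ** shift
    quantized = [max(qmin, min(qmax, round(c * scale))) for c in coeffs]
    return quantized, shift


def predict(history, qcoeffs, shift):
    """Integer-only prediction; history[0] is the most recent sample."""
    acc = sum(c * h for c, h in zip(qcoeffs, history))
    return acc >> shift                      # arithmetic shift, same on every platform


qcoeffs, shift = quantize_coefficients([1.5, -0.7])
prediction = predict([100, 50], qcoeffs, shift)
print(qcoeffs, shift, prediction)
```

Only the integers `qcoeffs` and `shift` would be stored in the frame, so the decoder never touches floating point when reconstructing samples.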

Expected gain: 2–3%. Combined with C1: 4–7% cumulative. OLS is expected to win ~30–60% of frames, with LPC winning the rest. MDL selects per frame.

05 Phase C3: Joint Stereo Prediction

📄 CONTINUE_AC_C3_Joint_Stereo.md (275 lines)
C3 — Generalized Stereo Decorrelation (Ghido, IEEE 2003)

Current approach: transform L/R to Mid/Side, then predict each channel independently. This captures instantaneous correlation but misses delayed cross-channel correlation (room acoustics, mic spacing) and frequency-dependent correlation (mono bass, stereo treble).

Joint prediction: Each channel uses past samples from BOTH channels as prediction inputs. For left channel: 16 taps from L (same as now) + 8 taps from R (new). The OLS sample matrix simply gets more columns — no new algorithm, same Cholesky solver.
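To make "more columns" concrete, here is a small sketch that builds the rows of the extended sample matrix for one channel. The helper name is hypothetical; the 16 own-channel and 8 cross-channel taps are the figures from the paragraph above, and only strictly past samples of the other channel are used, so a decoder can rebuild the same rows.

```python
def joint_design_rows(target, other, own_taps=16, cross_taps=8):
    """Rows of [own past samples | other channel's past samples] and their targets."""
    start = max(own_taps, cross_taps)
    rows, y = [], []
    for t in range(start, len(target)):
        own = [target[t - 1 - i] for i in range(own_taps)]
        cross = [other[t - 1 - i] for i in range(cross_taps)]
        rows.append(own + cross)
        y.append(target[t])
    return rows, y


# Toy channels, just to show the shapes.
left = list(range(100))
right = [2 * v for v in left]
rows, y = joint_design_rows(left, right)
print(len(rows), "rows x", len(rows[0]), "columns")
```

The resulting rows and targets go into the same least-squares solve as before; the only change is that the coefficient vector grows from 16 to 24 entries, which is where the extra per-frame coefficient cost comes from.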

6 stereo modes compete per frame: Independent, Mid/Side, Left-Side, Right-Side, Joint Predict, Joint Mid/Side. MDL picks the cheapest. Cross-channel taps cost ~24 bytes extra per frame in coefficients — must save more than that on residuals to justify.

Expected gain: 1–1.5%. Ghido's original paper reported ~1.5%. Cumulative C1+C2+C3: ~5–8.5%.

06 Phase C4: Bitplane Coder with SSE

📄 CONTINUE_AC_C4_Bitplane_SSE.md (264 lines)
C4 — Context-Dependent Bit-Level Coding

Rice and rANS treat each symbol independently. A bitplane coder decomposes residuals into individual bits (MSB to LSB) and models each bit conditionally on context: upper bits already coded, magnitude of neighboring samples, channel correlation. This exploits clustering — large residuals follow large residuals — that symbol-level coders miss.
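The context idea can be illustrated with a cost estimator that walks each residual's magnitude from MSB to LSB and charges the ideal code length under an adaptive binary model per context. This is a deliberately small sketch with several assumptions: the context is just (plane, whether a higher bit was already 1, coarse magnitude of the previous sample), the sign is charged one raw bit, no arithmetic coder is attached, and SSE is left out.

```python
import math


class AdaptiveBit:
    """Adaptive estimate of P(bit = 1) with 12-bit precision and shift updates."""

    def __init__(self):
        self.p1 = 2048

    def cost(self, bit):
        p = self.p1 / 4096 if bit else 1.0 - self.p1 / 4096
        return -math.log2(p)

    def update(self, bit, rate=4):
        if bit:
            self.p1 += (4096 - self.p1) >> rate
        else:
            self.p1 -= self.p1 >> rate


def bitplane_cost_bits(residuals, planes=16):
    """Ideal coded size in bits for the residuals under the context model."""
    models = {}
    total = 0.0
    prev_mag = 0
    for r in residuals:
        mag = abs(r)
        assert mag < (1 << planes), "residual does not fit the chosen plane count"
        neighbour = prev_mag.bit_length()    # coarse size of the previous residual
        seen_one = 0
        for plane in range(planes - 1, -1, -1):   # MSB to LSB
            bit = (mag >> plane) & 1
            model = models.setdefault((plane, seen_one, neighbour), AdaptiveBit())
            total += model.cost(bit)
            model.update(bit)
            seen_one |= bit
        if mag:
            total += 1.0                     # sign bit, charged uncompressed
        prev_mag = mag
    return total


quiet = [0] * 1000
bursty = [0] * 200 + [37, -41, 52, -29, 44] * 10 + [0] * 200
quiet_bits = bitplane_cost_bits(quiet)
bursty_bits = bitplane_cost_bits(bursty)
print(round(quiet_bits), round(bursty_bits))
```

Because the previous sample's magnitude is part of the context, a burst of large residuals trains its own set of models instead of disturbing the ones used for quiet stretches.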

SSE (Secondary Symbol Estimation) refines the primary context model's probability estimates using recent coding history. This is what separates SAC/PAQ-class compressors from simpler approaches.

Fourth entropy backend in MDL arena: Rice, rANS, Bitplane, LZMA all compete per partition. Bitplane is expected to win on ~20–40% of partitions, namely the hardest, most non-stationary ones.

Expected gain: 0.5–1% over C1's rANS. Cumulative C1–C4: ~5.5–9.5%.

07 Phase C5: DDS Meta-Parameter Optimization

📄 CONTINUE_AC_C5_DDS_Optimization.md (311 lines)
C5 — The Exhaustive Last Mile

After C1–C4, the per-frame parameter space is ~15 million combinations. Grid search samples a tiny fraction. DDS (Dynamically Dimensioned Search) is a black-box optimizer that finds near-optimal configurations in 200–500 evaluations. It automatically transitions from broad exploration to local refinement — no tuning parameters except evaluation budget.

DCC + DDS integration: DCC assigns difficulty tier, DDS refines within the effort budget. SILENCE/EASY frames skip DDS (grid search sufficient). MEDIUM gets 100 iterations. HARD gets 300–500. Warm-started from grid search best — DDS only needs to improve an already-good starting point.
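For orientation, a compact version of the DDS loop is sketched below for continuous parameters. It follows the published algorithm's outline (shrinking perturbation probability, Gaussian steps scaled by the parameter range, reflection at the bounds, greedy acceptance) with the customary perturbation factor r = 0.2. The function name, the toy cost function, and the fixed seed are assumptions for the example; in the codec the cost would be the encoded frame size, the parameters would be rounded to valid discrete settings, and `start` would be the grid-search best.

```python
import math
import random


def dds_minimize(cost, lower, upper, start, budget=200, r=0.2, seed=0):
    """Dynamically Dimensioned Search; `budget` must be at least 2."""
    rng = random.Random(seed)                # fixed seed keeps runs reproducible
    best = list(start)
    best_cost = cost(best)
    dims = len(best)
    for i in range(1, budget + 1):
        # Probability of perturbing each dimension shrinks as the budget is spent.
        p = 1.0 - math.log(i) / math.log(budget)
        chosen = [d for d in range(dims) if rng.random() < p]
        if not chosen:
            chosen = [rng.randrange(dims)]
        cand = list(best)
        for d in chosen:
            span = upper[d] - lower[d]
            v = cand[d] + r * span * rng.gauss(0.0, 1.0)
            if v < lower[d]:                 # reflect at the lower bound
                v = lower[d] + (lower[d] - v)
                if v > upper[d]:
                    v = lower[d]
            elif v > upper[d]:               # reflect at the upper bound
                v = upper[d] - (v - upper[d])
                if v < lower[d]:
                    v = upper[d]
            cand[d] = v
        c = cost(cand)
        if c <= best_cost:                   # greedy: keep only improvements
            best, best_cost = cand, c
    return best, best_cost


def sphere(params):
    """Toy stand-in for 'encoded size as a function of the parameters'."""
    return sum(v * v for v in params)


lower, upper, start = [-10.0] * 4, [10.0] * 4, [5.0] * 4
best, best_cost = dds_minimize(sphere, lower, upper, start, budget=200, seed=1)
print(best_cost, "vs start", sphere(start))
```

Since a candidate is only accepted when it is no worse than the incumbent, a warm-started run can never end up behind its starting point, which matches the "no regression" stance taken elsewhere in the roadmap.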

SAC uses DDS for per-frame parameter optimization. Proven technique for audio compression.

Expected gain: 0.5–1%. The diminishing-returns scraper. Cumulative C1–C5: ~6–10.5%.

08 Implementation Timeline

C1: rANS Entropy Coder (2–3 weeks)
Add rANS to MDL arena alongside Rice + LZMA. Per-partition selection. Format version bump.

Start from 8Z-AC.py v1.9 (MDL frame probe). Add estimate_rans_cost(), rans_encode_partition(), rans_decode_partition(). Test: compression must improve on both Ethereal Arc and Abyssal. No regression on any frame (MDL fallback to Rice).

C2: OLS+NLMS Cascade Predictor (2–3 weeks)
OLS via Cholesky + NLMS adaptive filter. Fixed-point coefficients. DCC-gated depth.

Add compute_ols_coefficients(), nlms_cascade(). New predictor types 3 (OLS) and 4 (OLS_NLMS). OLS tries orders {4, 8, 12, 16, 20, 24, 32}. NLMS: 16-tap, mu∈{0.3, 0.5, 1.0}. Test: OLS should win 30–60% of frames.

C3: Joint Stereo Prediction (1–2 weeks)
Cross-channel OLS taps. 6 stereo modes in MDL arena. Causal decoder (no circular dependency).

Extend OLS sample matrix with cross-channel columns. Q∈{0, 4, 8, 12}. New stereo modes: JOINT_PREDICT, JOINT_MID_SIDE. Test: joint modes should win on spatially complex content. No gain expected on mono-ish tracks.

C4: Bitplane Coder with SSE (2 weeks)
Context-dependent bit-level coding. 4096 context bins. Binary ANS. Adaptive from scratch per frame.

Fourth entropy backend in MDL arena. MSB-to-LSB processing with context model. SSE corrects probability estimates. Test: bitplane should win on 20–40% of hardest partitions.

C5: DDS Meta-Parameter Optimization (1 week)
Dynamically Dimensioned Search over full parameter space. DCC-gated budget. Deterministic (fixed seed).

200–500 evaluations per frame on MEDIUM/HARD content. Warm-start from grid search best. Test: DDS must improve over grid search on at least 20% of frames. Total encode time target: 3–5 minutes per song.

09 Expected Performance Outcomes

Configuration                  Est. Ethereal Arc   Ratio     vs FLAC-12   vs OFR
AC v1.9 (current)              ~7,993,000          ~46.48%   ≈ 0%         +6.1%
+ C1 (rANS)                    ~7,810,000          ~45.41%   −2.3%        +3.7%
+ C1 + C2 (OLS+NLMS)           ~7,570,000          ~44.02%   −5.3%        +0.5%
+ C1–C3 (Joint Stereo)         ~7,460,000          ~43.38%   −6.7%        −0.9%
+ C1–C5 (Full Classical Max)   ~7,350,000          ~42.74%   −8.1%        −2.4%
OptimFROG (reference)          7,530,383           43.79%                 0%

If estimates hold: C1+C2 alone puts us within striking distance of OptimFROG (+0.5%). C1–C3 should surpass it. C4+C5 provide margin. These are conservative estimates at 70% of theoretical maximums; the real outcome depends on how well techniques interact on actual audio content.

10 Risk Matrix

Risk                                                   Prob     Impact   Mitigation
rANS overhead exceeds savings on most frames           Low      Low      MDL gating: Rice always available as fallback
OLS FP determinism issues across platforms             Medium   High     Fixed-point coefficients (same path as LPC qlevel)
NLMS convergence too slow for short frames             Medium   Medium   DCC gates: skip cascade on SILENCE/EASY frames
Joint stereo coefficient overhead > savings            Medium   Low      MDL ensures no regression: falls back to M/S or independent
Bitplane context model doesn't converge in one frame   Medium   Medium   Adaptive from scratch; warmup cost ~0.5% on 4096-sample frames
DDS adds significant encode time for marginal gain     Medium   Low      DCC-gated: DDS only on MEDIUM/HARD frames
Gain estimates too optimistic (don't stack linearly)   High     Medium   Applied 70% discount. Each phase independently valuable.

11 The Core Principles

Rules That Don't Change

MDL is non-negotiable. Every technique competes. MDL decides. No hardcoded preferences. If OLS doesn't beat LPC on a frame, LPC wins that frame.

Lossless is non-negotiable. Byte-for-byte reconstruction, SHA3-verified. Fixed-point coefficients ensure determinism.

One phase at a time. Test, validate, measure. Each CONTINUE paper is a self-contained session. Don't start C2 until C1 is validated.

Python first. R&D mode. Speed optimization (C extension, Cython) comes after correctness.

No backward-compatibility constraints. The format version is bumped per phase, so new files need a current decoder. Old files stay readable.

Test on both songs. Ethereal Arc AND Abyssal. Plus 13-clip benchmark. Plus 30-song curator set when available.

12 Parallel Track: xFLAC (FLAC-Format Improvements)

Two tracks, one engine. The AC "Classical Max" track (this document) works on the .8za custom format — unconstrained, maximum compression. In parallel, the xFLAC track improves our FLAC-format encoders (aFLAC, vFLAC, avFLAC) within FLAC spec constraints.

Improvement                                                   Target          CONTINUE Paper
xFLAC-1: Smarter transient detection & block splitting        0.3–0.5%        CONTINUE_xFLAC_Improvements.md
xFLAC-2: LPC search expansion (TOP_K, windows)                0.3–0.5%
xFLAC-3: Segment boundary optimization (stitching overhead)   0.2–0.5%
vFLAC --fast / --full adaptive search                         Speed control   CONTINUE_vFLAC_FastFull.md ✅ Done

C1 → C2 → C3 → C4 → C5
One phase at a time. Test. Validate. Advance. Let the math decide.

               8Z-AC "Classical Max": Phase Dependencies

  ┌─────────────────────────────────────────────────────────┐
  │                    CURRENT: v1.9                         │
  │   LPC (Levinson-Durbin) + Rice + LZMA + DCC + MDL      │
  └──────────────────────┬──────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C1: rANS Entropy    │  + rANS per-partition selection   │
  │  (2–4% gain)         │  Rice stays in arena              │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C2: OLS+NLMS        │  + OLS predictor (fixed-point)    │
  │  (2–3% gain)         │  + NLMS cascade (no side-info)    │
  │  BIGGEST SINGLE GAIN │  LPC stays in arena               │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C3: Joint Stereo    │  + cross-channel OLS taps         │
  │  (1–1.5% gain)       │  + 6 stereo modes in MDL arena    │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C4: Bitplane + SSE  │  + context-dependent bit coding   │
  │  (0.5–1% gain)       │  + 4th entropy backend in MDL     │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
  ┌──────────────────────────────────────────────────────────┐
  │  C5: DDS Optimizer   │  + black-box parameter search     │
  │  (0.5–1% gain)       │  + DCC-gated evaluation budget    │
  └──────────────────────┬───────────────────────────────────┘
                         │
                         ▼
               ┌───────────────────┐
               │   TARGET: ≤ OFR   │
               │   (~43.8% ratio)  │
               └───────────────────┘