8Z-Audio — Five Days Against FLAC

Chapter I · Cross-Domain Transfer

Not Built in Isolation

8Z-Audio is the third domain-specific compressor in the 8Z-LO (Lossless Optimized) framework, following 8Z-FASTA (genomic DNA sequences) and 8Z-TIF (satellite imagery). The project didn't start from an audio textbook. It started from a radical question: what if the same mathematical architecture that compresses DNA can also compress music?

8Z-FASTA · DNA

The DNA Compressor

Before any audio code was written, the team ran a sanity check: convert Pink Floyd's "Shine On You Crazy Diamond" into DNA bases (A, C, G, T) and compress it with gemZ, the FASTA compressor built for genomic sequences.

gemZ's DNA models found zero hits — 100% RAW blocks. The biological pattern-matchers understood nothing about waveforms. But the architectural skeleton — MDL model battles, per-block predictor selection, residual coding — was exactly what audio needed.
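A minimal sketch of what such a conversion could look like, assuming a plain two-bits-per-base mapping over the raw PCM bytes. The source does not say which mapping the team used, so `BASES`, `pcm_bytes_to_dna`, and `dna_to_pcm_bytes` are illustrative names only.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def pcm_bytes_to_dna(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_pcm_bytes(seq: str) -> bytes:
    """Inverse mapping, so the round trip stays lossless."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Under this mapping the output is a valid base sequence, but its statistics are those of a waveform, which is consistent with biological models finding nothing to match.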

8Z-DCC · Adaptation

The Digital Claustrum Controller

The DCC was born in the TSP solver, migrated into the FASTA encoder, and landed in audio. It's an adaptive search governor: a learned probability model that narrows the encoder's candidate search space in real time, spending compute budget where the signal is hardest and coasting where it's easy.

In v1.5, the DCC settled on a real audio file for the first time (Pink Floyd 192kHz, u=10→15), evidence that the concept transfers across domains.

The 8Z framework rests on one principle: Minimum Description Length (MDL) model selection. Every bit saved must correspond to genuine reconstruction capability, not statistical correlation or heuristic intuition. Each frame of audio is a competition: multiple predictors battle, and MDL picks the winner. FLAC relies on linear prediction (LPC, plus a handful of fixed polynomial predictors) for every frame, always. That's the gap.
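The selection rule can be sketched in a few lines, assuming each candidate reports the bits needed to describe its model and the bits needed for its coded residual. `Candidate`, `header_bits`, `residual_bits`, and `mdl_select` are hypothetical names, and the numbers in the usage note are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    header_bits: int    # cost of describing the model (order, coefficients, flags)
    residual_bits: int  # cost of the entropy-coded prediction residual

    @property
    def total_bits(self) -> int:
        # MDL charges for the model and the data together.
        return self.header_bits + self.residual_bits


def mdl_select(candidates):
    """Return the candidate with the smallest total description length."""
    return min(candidates, key=lambda c: c.total_bits)
```

For example, a 32nd-order predictor that spends 480 bits on coefficients and 10,000 on residuals (10,480 total) loses to an 8th-order predictor that spends 120 and 10,200 (10,320 total), even though the larger model has the smaller residual.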

Chapter II · The Five-Day Sprint

v0.1 → v1.5

From an empty file to a compressor that beats FLAC at maximum compression. Every version had a lesson. Every regression had a diagnosis.

Day 1 · February 18

The FASTA Experiment & v0.1 — First Proof of Life

LPC Orders 1–32 · DELTA Predictor · Mid-Side Stereo · SHA3-256 Verified · No Rice Coding

The first working encoder was built in a single chat session. LPC via Levinson-Durbin autocorrelation, DELTA prediction ported from gemZ, MDL battle per frame, residual compression via int32 → LZMA, all four stereo decorrelation modes, bit-perfect verification.

9,417,674 bytes vs. FLAC 6,055,738 — architecture proven, Rice coding gap identified immediately
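For readers unfamiliar with the LPC step, here is a minimal pure-Python sketch of autocorrelation followed by the Levinson-Durbin recursion. It is a textbook formulation, not the project's code; the function names and the coefficient ordering (`a[0]` multiplies the most recent sample) are assumptions.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum of x[i] * x[i - lag] over the frame."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values.

    Returns (a, err): the prediction of x[n] is
    sum(a[k] * x[n - 1 - k] for k in range(order)), and err is the
    prediction error energy that remains.
    """
    a = [0.0] * order
    err = float(r[0])
    for i in range(order):
        if err <= 0.0:
            break  # silent or fully predicted frame; stop early
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err  # reflection coefficient for order i + 1
        updated = a[:]
        updated[i] = k
        for j in range(i):
            updated[j] = a[j] - k * a[i - 1 - j]
        a = updated
        err *= 1.0 - k * k
    return a, err
```

On a decaying signal where each sample is half the previous one, a first-order fit recovers a coefficient close to 0.5. The recursion yields every lower order on the way to the requested one, which is what makes a per-frame search over many orders affordable.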

Day 2 · February 19

v1.0–v1.2 — Rice Coding & FLAC Territory

Golomb-Rice Entropy · Partition Search · QLP Precision Search 8–16 · 3 Window Functions

Implemented proper Golomb-Rice coding with partition search, QLP quantization precision search, and three apodization options (Hann, Tukey, none). The encoder reached FLAC -5 territory for the first time.

Approached FLAC -5 compression on Pink Floyd 192kHz. The real FLAC gap isolated.
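The cost side of Rice coding with partition search can be sketched as follows. This only counts bits and does not write a bitstream; the zigzag mapping, the parameter range, and the assumed 4-bit parameter field per partition are illustrative choices, not details confirmed by the source.

```python
def zigzag(v: int) -> int:
    """Map signed residuals to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return (v << 1) if v >= 0 else ((-v << 1) - 1)


def rice_bits(residuals, k: int) -> int:
    """Bits for Rice parameter k: unary quotient, stop bit, k remainder bits."""
    return sum((zigzag(v) >> k) + 1 + k for v in residuals)


def best_rice_param(residuals, max_k: int = 14):
    """Return (bits, k) for the cheapest parameter."""
    return min((rice_bits(residuals, k), k) for k in range(max_k + 1))


def best_partition_order(residuals, max_order: int = 4, param_bits: int = 4):
    """Try 1, 2, 4, ... equal partitions, each with its own parameter.

    Returns (total_bits, order). param_bits is the assumed per-partition
    overhead for storing the parameter.
    """
    best = None
    for order in range(max_order + 1):
        parts = 1 << order
        if len(residuals) % parts:
            break  # keep partitions equal-sized in this sketch
        size = len(residuals) // parts
        total = sum(param_bits + best_rice_param(residuals[i:i + size])[0]
                    for i in range(0, len(residuals), size))
        if best is None or total < best[0]:
            best = (total, order)
    return best
```

A frame that is quiet in its first half and loud in its second shows why partitioning pays: eight zeros followed by eight values of 100 cost 140 bits with one shared parameter and 88 bits with two partitions.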

Day 3 · February 20

v1.3.1 — Multiprocessing & Best-Ever Compression

Python Multiprocessing · Exhaustive Search · ~600 Candidates/Frame · 11-Hour Encode

Exhaustive combinatorial search across all candidate configurations — LPC orders, windows, QLP precision — evaluated per frame with MDL arbitration. Expensive. Brutally thorough. This became the permanent compression quality ceiling to beat.

5,104,935 bytes, beating FLAC -8 by 2.1%. The first time this encoder surpassed FLAC's highest standard preset.

Day 4 · February 21

v1.4 — DCC Port from HYB4

Digital Claustrum Controller · CodecLearner · 7-bit DCC Events · 1.53% Regression vs v1.3.1

The DCC search governor was ported from the HYB4 FASTA encoder. 7-bit event encoding was too fine-grained — the DCC never found a stable pattern to exploit. Encode time dropped to 102 minutes (from 11 hours), but compression regressed slightly. Lesson: DCC event granularity matters enormously.

Encode: 11 hr → 102 min. DCC concept proven but not yet settled. 7-bit events too noisy.

Day 5 · February 22 — The Benchmark

v1.5 — DCC Settles. Benchmark at Scale.

Blocksize 16384 · 4-bit DCC Events · DCC Settled First Time · 15-Clip Parallel Corpus · 8 Workers · 75 min

Switching DCC from 7-bit to 4-bit events was the breakthrough: the controller finally settled on Pink Floyd 192kHz (u=10→15). Block size optimized to 16384. Scanner v1.2, clipper v1.2, and parallel pipeline v1.2 built in the same session. 15 clips across 10 songs encoded simultaneously.

v1.5 beats FLAC-12 on 7 of 15 clips. Best win: Lady Gaga "Die With A Smile" — 10.3% smaller than FLAC's best. All files lossless-verified via SHA3-256.

Chapter III · Architecture

What Makes It Different

FLAC draws on a single family of linear predictors for every frame of audio. 8Z-Audio uses up to 600 candidates, evaluated by MDL, with an adaptive search governor that learns which candidates work for this particular signal.

8Z-Audio v1.5 Frame Pipeline

WAV Frame (16384 samples) → DCC Governor (budget allocation) → Candidates (~600 configs) → MDL Battle (true total cost) → Best Predictor (Rice-coded residuals) → .8za Frame (SHA3 verified)

📐

LPC Orders 1–32

Versus a maximum order of 12 in FLAC's standard presets (the FLAC format itself allows up to 32). Higher orders capture longer-range correlations in complex harmonic content.

🪟

Three Window Functions

Hann, Tukey, and no apodization, each evaluated per frame. FLAC's reference encoder defaults to a Tukey window.
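The three options can be written out in a few lines of standard-library Python. These are the textbook window definitions; the `alpha=0.5` default and the `WINDOWS` table are assumptions for illustration, not settings taken from the encoder.

```python
import math


def hann(n: int):
    """Symmetric Hann window: zero at both ends, one in the middle."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def tukey(n: int, alpha: float = 0.5):
    """Flat top with cosine tapers covering a fraction alpha of the frame."""
    if alpha <= 0 or n == 1:
        return [1.0] * n
    if alpha >= 1:
        return hann(n)
    edge = alpha * (n - 1) / 2  # taper length at each end, in samples
    w = []
    for i in range(n):
        d = min(i, n - 1 - i)  # distance to the nearest frame edge
        w.append(0.5 * (1 - math.cos(math.pi * d / edge)) if d < edge else 1.0)
    return w


# Candidate table for a per-frame search; "none" is the rectangular window.
WINDOWS = {"hann": hann, "tukey": tukey, "none": lambda n: [1.0] * n}


def apply_window(frame, name: str):
    """Weight the frame before autocorrelation; the samples themselves are untouched."""
    return [s * w for s, w in zip(frame, WINDOWS[name](len(frame)))]
```

The window only shapes the data used to estimate predictor coefficients. The residual is still computed on the unwindowed samples, so the choice affects compression but never losslessness.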

🎯

QLP Precision Search

Exhaustive search over 8–16 bits of quantization precision per frame. FLAC's standard presets do not enable this kind of precision search.
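What "quantization precision" means here can be illustrated with a small sketch: floating-point predictor coefficients are scaled by a power of two and rounded to signed integers of a given width, so the decoder can repeat the prediction exactly in integer arithmetic. The scaling rule below is one common approach and an assumption, as are the names `quantize_lpc` and `predict`.

```python
import math


def quantize_lpc(coeffs, precision: int):
    """Quantize coefficients to signed integers of `precision` bits.

    Returns (q, shift) such that coeff is approximately q / 2**shift.
    """
    cmax = max(abs(c) for c in coeffs)
    if cmax == 0:
        return [0] * len(coeffs), 0
    int_bits = math.frexp(cmax)[1]  # cmax = m * 2**int_bits with 0.5 <= m < 1
    shift = max(precision - 1 - int_bits, 0)
    limit = (1 << (precision - 1)) - 1
    q = [max(-limit - 1, min(limit, round(c * (1 << shift)))) for c in coeffs]
    return q, shift


def predict(q, shift: int, history):
    """Integer prediction; history[-1] is the most recent sample."""
    acc = sum(c * history[-1 - k] for k, c in enumerate(q))
    return acc >> shift
```

With coefficients `[1.5, -0.5]` at 12 bits this gives `q = [1536, -512]` and `shift = 10`, and the prediction from the samples 100 then 200 is 250. More precision bits track the ideal coefficients more closely but cost more header bits per coefficient, which is the trade-off a per-frame search is settling.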

🧠

DCC + CodecLearner

The Digital Claustrum Controller learns per-signal statistics, adapting search depth to where the signal is hardest.

⚖️

MDL Model Selection

Every bit counts. MDL sees the true total cost including predictor overhead — no bit is "free."

🔁

All Stereo Modes

Mid-Side, Left-Side, Right-Side, and Independent — selected per frame by MDL, not globally per file.
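The four modes can be sketched with a common integer formulation of mid/side that stays exactly reversible. The sum of absolute values below is a stand-in for cost; the text describes selection by MDL over the actual coded size, so `channel_cost` and `choose_stereo_mode` are illustrative only.

```python
def ms_encode(left: int, right: int):
    """Integer mid/side; the bit dropped from mid is recoverable from side."""
    return (left + right) >> 1, left - right


def ms_decode(mid: int, side: int):
    total = (mid << 1) | (side & 1)  # left + right has the same parity as side
    return (total + side) >> 1, (total - side) >> 1


def channel_cost(channel) -> int:
    """Crude size proxy: sum of magnitudes (an assumption, not MDL)."""
    return sum(abs(v) for v in channel)


def choose_stereo_mode(left, right) -> str:
    side = [l - r for l, r in zip(left, right)]
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    modes = {
        "independent": (left, right),
        "left-side": (left, side),
        "right-side": (side, right),
        "mid-side": (mid, side),
    }
    return min(modes, key=lambda m: sum(channel_cost(c) for c in modes[m]))
```

Calling this once per frame rather than once per file lets a track switch modes as the stereo image changes, for example between a centered vocal passage and a wide instrumental one.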

Chapter IV · Results

The Benchmark

15 clips across 10 songs — selected by the 8Z Audio Scanner v1.2 to cover all difficulty categories. Encoded in parallel with 8 workers. All results verified lossless (SHA3-256). Compression ratio: lower is better (fraction of original size).
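The verification and the ratio convention can be expressed as a short sketch using Python's standard `hashlib`. The helper names are hypothetical; the only figures used in the note below are the two Day 1 byte counts quoted earlier.

```python
import hashlib


def sha3_hex(data: bytes) -> str:
    return hashlib.sha3_256(data).hexdigest()


def verify_lossless(original: bytes, decoded: bytes) -> bool:
    """True when the decoded audio is bit-identical to the input."""
    return sha3_hex(original) == sha3_hex(decoded)


def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Fraction of the original size; lower is better."""
    return compressed_size / original_size


def advantage_percent(ours: int, reference: int) -> float:
    """Positive when our file is smaller than the reference encoder's."""
    return 100.0 * (reference - ours) / reference
```

Applied to the Day 1 figures, `advantage_percent(9_417_674, 6_055_738)` is about -55.5, meaning the v0.1 output was roughly 55.5% larger than the FLAC file it was compared against.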

Chart: 8Z-Audio v1.5 vs FLAC-12 · Compression Advantage per Clip

What the Data Revealed

1.9×

AI Audio Is More Predictable

Scanner data across 10 songs indicates that AI-generated audio (Producer.ai) has a mean difficulty of 0.25, versus 0.50 for human recordings at matched sample rates. AI music is structurally simpler, with more sustained tones and less transient chaos. Business implication: a specialized codec for AI audio platforms could achieve significantly better ratios than general-purpose lossless codecs.

u=10→15

DCC Settled — First Time

The Digital Claustrum Controller reached a stable learned state on Pink Floyd 192kHz (30s, 563 frames). The critical parameter: 4-bit events instead of 7-bit. The finer granularity of 7-bit encoding produced too much noise for the controller to find a pattern. Coarser events → cleaner signal → stable adaptation. The two-pass architecture planned for v1.6 is intended to remove the sequential bottleneck.
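One way to read the 7-bit versus 4-bit result is as a sample-count problem. The sketch below is an interpretation, not the DCC's actual design: it assumes an event is a per-frame statistic in the range 0 to 1 quantized to a fixed number of bits, and it simply counts how many frames each event class would see on average in a 563-frame clip.

```python
def quantize_event(value: float, event_bits: int) -> int:
    """Bucket a statistic in [0, 1] into one of 2**event_bits classes (assumed scheme)."""
    levels = 1 << event_bits
    return min(int(value * levels), levels - 1)


def mean_observations_per_event(n_frames: int, event_bits: int) -> float:
    """Average number of frames available to learn each event class."""
    return n_frames / (1 << event_bits)
```

Under these assumptions, 7-bit events spread 563 frames over 128 classes, about 4.4 frames per class, while 4-bit events give 16 classes with about 35.2 frames each. A learner with only four or five observations per class has little to settle on.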

+39%

OptimFROG's Genre Weakness

OptimFROG fails badly and consistently on industrial music (Rammstein): 39.3% worse than FLAC on the hardest 10-second clip and 26–28% worse on the others. 8Z-Audio also loses on Rammstein, but by a much smaller margin (3–8%). This is a specific architectural opportunity: if 8Z can handle industrial music better, it becomes a competitive differentiator in an otherwise mature field.

∀ domains

Cross-Domain Transfer Works

DCC: TSP solver → FASTA encoder → audio encoder. Scanner concept: DNA pipeline → audio difficulty pre-screener. MDL framework: 8Z image encoder → audio frame selection. The same architectural patterns — MDL arbitration, per-block model competition, adaptive search governors — produce gains across every domain they've been applied to.

100%

Perfect Category Win Rates

On tonal content: 3 for 3. On dynamic content: 1 for 1. On the "easiest" category: 1 for 1. On buildups: 1 for 1. 8Z-Audio's multi-model approach was purpose-built for structured content — and the benchmark confirmed it. The remaining losses are concentrated in industrial/transient content where LPC's weaknesses are known and targeted for v1.6.

≡ Frozen

The Field Is 18 Years Stale

Every major lossless audio codec — FLAC (2001), WavPack (2002), TAK (2007), OptimFROG, Monkey's Audio — uses LPC as its primary or sole prediction model. No production codec performs per-frame multi-model MDL selection, harmonic modeling, or periodic template matching. The field has been architecturally frozen since ~2007. 8Z-Audio is the first systematic attempt to break out of that constraint.

Chapter VI · What's Next

v1.6 — Two-Pass Architecture

The core innovation of v1.6 is separation of concerns: signal analysis (fast, sequential) is decoupled from encoding (slow, parallelizable). Scanner v1.2 becomes Pass 1. The encoder reads its output and processes all frames independently.

Priority 1 · Speed

Two-Pass Parallel

Pass 1 (Scanner): ~10s. Pass 2 (Encoder): independent frames, N workers.

40 min → 5 min target
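The shape of the two-pass idea can be sketched as below. Everything here is a placeholder: the difficulty measure, the budget thresholds, and the delta "encoder" are invented to keep the example runnable, and threads stand in for worker processes. For CPU-bound Python work, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface.

```python
from concurrent.futures import ThreadPoolExecutor


def scan(frames):
    """Pass 1: one cheap, sequential sweep producing a difficulty per frame.

    Difficulty here is the mean absolute first difference (an assumption).
    """
    return [sum(abs(b - a) for a, b in zip(f, f[1:])) / max(len(f) - 1, 1)
            for f in frames]


def encode_frame(job):
    """Pass 2 unit of work: depends only on its own frame and scan result."""
    frame, difficulty = job
    budget = 8 if difficulty < 10 else 64  # hypothetical search budgets
    residual = [frame[0]] + [b - a for a, b in zip(frame, frame[1:])]
    return budget, residual


def two_pass_encode(frames, workers: int = 4):
    """Frames must be non-empty; results come back in frame order."""
    difficulties = scan(frames)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_frame, zip(frames, difficulties)))
```

Because each job carries everything it needs, no frame waits on state learned from the frame before it, which is the property that allows N workers to run in parallel.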

Priority 2 · Adaptive

Adaptive Blocksize

Rate-matched block sizes: 16384 at 44.1kHz, 8192 at 96kHz, 4096 at 192kHz.

+1.6% on 192kHz content
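It can help to see what these block sizes mean in time. The table below copies the three rate and size pairs from the text; the helper name and the conversion to milliseconds are additions for illustration.

```python
# Block sizes per sample rate, as listed for the v1.6 plan.
BLOCKSIZE_BY_RATE = {44_100: 16384, 96_000: 8192, 192_000: 4096}


def block_duration_ms(sample_rate: int) -> float:
    """Duration of one block in milliseconds at the planned block size."""
    return 1000.0 * BLOCKSIZE_BY_RATE[sample_rate] / sample_rate
```

This works out to roughly 371.5 ms at 44.1 kHz, 85.3 ms at 96 kHz, and 21.3 ms at 192 kHz. For comparison, the v1.5 block of 16384 samples lasts about 85.3 ms at 192 kHz.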

Priority 3 · Hard Content

Rice Partition Opt.

Improved partition parameter search to close the gap on high-entropy frames.

+0.5–1% on transient/industrial

Priority 4 · Entropy

rANS Coding

Replace Rice on high-entropy residuals with range asymmetric numeral systems.

+0.5–1% on hard content

Priority 5 · Tonal

Periodic Predictor

Exploit sample repetition directly in sustained notes and synthesizer content.

+1–3% on tonal clips
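A periodic predictor of this kind could be sketched as follows: search for the lag at which the signal best repeats, then code each sample as its difference from the sample one period earlier. This is a hypothetical illustration of the idea, since v1.6 is not yet built; the function names and the mean-absolute-difference criterion are assumptions.

```python
def best_period(samples, min_lag: int, max_lag: int):
    """Return the lag with the smallest mean absolute repeat error."""
    best_lag, best_cost = None, None
    for lag in range(min_lag, min(max_lag, len(samples) - 1) + 1):
        diffs = [abs(samples[i] - samples[i - lag])
                 for i in range(lag, len(samples))]
        cost = sum(diffs) / len(diffs)
        if best_cost is None or cost < best_cost:
            best_lag, best_cost = lag, cost
    return best_lag


def periodic_residual(samples, lag: int):
    """First `lag` samples verbatim, then differences from one period back."""
    return list(samples[:lag]) + [samples[i] - samples[i - lag]
                                  for i in range(lag, len(samples))]


def periodic_restore(residual, lag: int):
    """Exact inverse of periodic_residual."""
    out = list(residual[:lag])
    for i in range(lag, len(residual)):
        out.append(residual[i] + out[i - lag])
    return out
```

On a perfectly repeating 8-sample waveform the search returns a lag of 8 and every residual after the first period is zero. Real sustained notes drift, so the practical residual would be small rather than zero.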

Priority 6 · Novel

Harmonic Predictor

Sinusoidal modeling for harmonic content. No lossless audio codec currently does this.

+2–5% on tonal (projected)

v1.6 Targets

From the benchmark session post-mortem — everything needed to push beyond 50% win rate.

Clips beating FLAC-12: 7 / 15 (47%) → 10 / 15

Best single-clip win: 10.3% → 15%+

Encode time target: 40 min (PF 192kHz) → 5–10 min

Parallel pipeline: 75 min (14 clips, 8×) → 15 min

The Music

10 Songs, 6 Playable

Six original compositions by Bojan Dobrečevič, created via Producer.ai, form the AI half of the 8Z-Audio benchmark corpus. Scanner data shows AI-generated audio is 1.9× more compressible than human recordings, which makes these tracks useful test subjects. Each has a 10–30 second benchmark clip (WAV) and a full song (MP3).

Fractured Harmony

easiest · tonal

8ZA vs FLAC-12: +5.1% win · +2.9% win

Where Do I Go

dcc_stress

8ZA vs FLAC-12: −2.0%

Awakening AI

tonal

8ZA vs FLAC-12: +1.5% win

Echoes of the Shore

dcc_stress

8ZA vs FLAC-12: −1.8%

Lacrimosa Requiem

diverse

8ZA vs FLAC-12: −0.2%

Between the Strings

dcc_stress

8ZA vs FLAC-12: −1.3%

Reference Recordings — Copyright protected, no player

Pink Floyd

Shine On You Crazy Diamond

progressive 192 kHz · 24-bit

8ZA: +3.0% (dynamic) · +5.2% (tonal)

Metallica

Lux Æterna

thrash metal 44.1 kHz

8ZA: +0.03% (buildup) · −0.2% (transient)

Lady Gaga

Die With A Smile

pop 44.1 kHz

8ZA: +10.3% ← best win overall

Rammstein

Du Hast

industrial 48 kHz

8ZA: −2.7% to −8.0%

OFR catastrophic: −26 to −39% on industrial
