8Z-AUDIO
AIM³ Institute · Ljubljana · February 2026

Five Days
Against FLAC

How a lossless audio codec was built from scratch in a single sprint — and why it beats the 25-year-old gold standard on nearly half of all test clips.

Days of Development: 5
Clips Beating FLAC-12: 7/15
Best Single-Clip Win: 10.3%
Versions Shipped: v0.1 → v1.5

Chapter I · Cross-Domain Transfer

Not Built in Isolation

8Z-Audio is the third domain-specific compressor in the 8Z-LO (Lossless Optimized) framework, following 8Z-FASTA (genomic DNA sequences) and 8Z-TIF (satellite imagery). The project didn't start from an audio textbook. It started from a radical question: what if the same mathematical architecture that compresses DNA could also compress music?

8Z-FASTA · DNA

The DNA Compressor

Before any audio code was written, the team ran a sanity check: convert Pink Floyd's "Shine On You Crazy Diamond" into DNA bases (A, C, G, T) and compress it with gemZ, the FASTA compressor built for genomic sequences.

gemZ's DNA models found zero hits — 100% RAW blocks. The biological pattern-matchers understood nothing about waveforms. But the architectural skeleton — MDL model battles, per-block predictor selection, residual coding — was exactly what audio needed.

8Z-DCC · Adaptation

The Digital Claustrum Controller

The DCC was born in the TSP solver, migrated into the FASTA encoder, and landed in audio. It's an adaptive search governor: a learned probability model that narrows the encoder's candidate search space in real time, spending compute budget where the signal is hardest and coasting where it's easy.

In v1.5, the DCC settled on a real audio file for the first time (Pink Floyd 192 kHz, u=10→15), evidence that the concept transfers across domains.

The 8Z framework rests on one principle: Minimum Description Length (MDL) model selection. Every bit saved must correspond to genuine reconstruction capability — not statistical correlation, not heuristic intuition. Each frame of audio is a competition. Multiple predictors battle; MDL picks the winner. FLAC runs one predictor (LPC) on every frame, always. That's the gap.
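The per-frame competition described above can be illustrated with a minimal sketch. It is not the 8Z-Audio implementation: the `Candidate` class, the three toy predictors, the `header_bits` values, and the crude `residual_bits` cost are all assumptions chosen for illustration. The point is only that the winner is the candidate with the lowest total of model overhead plus residual cost.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Candidate:
    name: str
    header_bits: int  # assumed cost of signalling the model choice and its parameters
    predict: Callable[[Sequence[int], int], int]  # (frame, index) -> predicted sample


def predict_zero(x: Sequence[int], i: int) -> int:
    return 0  # "raw" model: no prediction at all


def predict_delta(x: Sequence[int], i: int) -> int:
    return x[i - 1] if i >= 1 else 0  # previous sample


def predict_order2(x: Sequence[int], i: int) -> int:
    if i >= 2:
        return 2 * x[i - 1] - x[i - 2]  # linear extrapolation
    return x[i - 1] if i == 1 else 0


def residual_bits(residuals: Sequence[int]) -> int:
    """Crude stand-in for an entropy coder: one sign bit plus magnitude bits."""
    return sum(1 + abs(r).bit_length() for r in residuals)


def mdl_battle(frame: Sequence[int], candidates: List[Candidate]) -> Tuple[str, int]:
    """Return (winner name, total bits), counting header and residual cost."""
    best = None
    for cand in candidates:
        residuals = [frame[i] - cand.predict(frame, i) for i in range(len(frame))]
        total = cand.header_bits + residual_bits(residuals)
        if best is None or total < best[1]:
            best = (cand.name, total)
    return best


# Hypothetical header costs: richer models pay more overhead.
CANDIDATES = [
    Candidate("raw", 2, predict_zero),
    Candidate("delta", 4, predict_delta),
    Candidate("order2", 6, predict_order2),
]
```

On a steady ramp such as `[0, 3, 6, ...]`, the second-order predictor wins despite its larger header because nearly all of its residuals are zero. On noise, the cheaper models would win instead.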

Chapter II · The Five-Day Sprint

v0.1 → v1.5

From an empty file to a compressor that beats FLAC at maximum compression. Every version had a lesson. Every regression had a diagnosis.

Day 1 · February 18

The FASTA Experiment & v0.1 — First Proof of Life

LPC Orders 1–32 · DELTA Predictor · Mid-Side Stereo · SHA3-256 Verified · No Rice Coding

The first working encoder was built in a single chat session. It combined LPC via autocorrelation and Levinson-Durbin recursion, DELTA prediction ported from gemZ, a per-frame MDL battle, residual compression via int32 → LZMA, all four stereo decorrelation modes, and bit-perfect verification.
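For readers unfamiliar with the LPC step, here is a minimal pure-Python sketch of autocorrelation followed by the Levinson-Durbin recursion. The function names are hypothetical, and the sketch leaves out windowing and coefficient quantization, both of which a real encoder needs.

```python
from typing import List, Sequence, Tuple


def autocorrelation(x: Sequence[float], max_lag: int) -> List[float]:
    """r[lag] = sum of x[i] * x[i - lag] over the frame."""
    n = len(x)
    return [sum(x[i] * x[i - lag] for i in range(lag, n)) for lag in range(max_lag + 1)]


def levinson_durbin(r: Sequence[float], order: int) -> Tuple[List[float], float]:
    """Solve the LPC normal equations.

    Returns (coeffs, error), where the prediction for x[n] is
    sum(coeffs[k] * x[n - 1 - k]) and error is the remaining prediction energy.
    """
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        if err <= 0:  # silent or perfectly predicted frame: stop early
            break
        acc = r[m + 1] - sum(a[k] * r[m - k] for k in range(m))
        reflection = acc / err
        updated = a[:]
        updated[m] = reflection
        for k in range(m):
            updated[k] = a[k] - reflection * a[m - 1 - k]
        a = updated
        err *= 1.0 - reflection * reflection
    return a, err
```

As a sanity check, a decaying signal `x[n] = 0.9 * x[n-1]` should yield a first coefficient near 0.9 and a second coefficient near zero, since one previous sample already explains the signal.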

9,417,674 bytes vs. FLAC 6,055,738 — architecture proven, Rice coding gap identified immediately
Day 2 · February 19

v1.0–v1.2 — Rice Coding & FLAC Territory

Golomb-Rice Entropy · Partition Search · QLP Precision Search 8–16 · 3 Window Functions

Implemented proper Golomb-Rice coding with partition search, QLP quantization precision search, and three apodization windows (Hann, Tukey, none). The encoder crossed into FLAC-5 territory for the first time.
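The Rice step can be sketched as a cost calculation plus two small searches. This is an illustration under stated assumptions, not the project's code: the signed-to-unsigned mapping follows the usual FLAC-style zigzag, while `max_k`, `max_partition_order`, and the 4-bit per-partition parameter cost are hypothetical values.

```python
from typing import Sequence, Tuple


def zigzag(r: int) -> int:
    """Map signed residuals to non-negative integers: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return (r << 1) if r >= 0 else ((-r) << 1) - 1


def rice_bits(mapped: Sequence[int], k: int) -> int:
    """Bits for Rice parameter k: unary quotient, one stop bit, k remainder bits."""
    return sum((v >> k) + 1 + k for v in mapped)


def best_rice_parameter(residuals: Sequence[int], max_k: int = 14) -> Tuple[int, int]:
    """Return (bits, k) for the cheapest single Rice parameter."""
    mapped = [zigzag(r) for r in residuals]
    return min((rice_bits(mapped, k), k) for k in range(max_k + 1))


def partitioned_rice_cost(residuals: Sequence[int],
                          max_partition_order: int = 4,
                          param_bits: int = 4) -> Tuple[int, int]:
    """Try 1, 2, 4, ... equal partitions, each with its own parameter.

    Returns (total bits, partition order) for the cheapest split.
    """
    best = None
    for order in range(max_partition_order + 1):
        parts = 1 << order
        if len(residuals) % parts:
            break  # frame does not divide evenly at this order
        size = len(residuals) // parts
        total = 0
        for p in range(parts):
            bits, _ = best_rice_parameter(residuals[p * size:(p + 1) * size])
            total += param_bits + bits
        if best is None or total < best[0]:
            best = (total, order)
    return best
```

When a frame is quiet in its first half and loud in its second, splitting into two partitions lets each half use its own parameter, which is where the partition search earns its bits back.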

Approached FLAC -5 compression on Pink Floyd 192kHz. The real FLAC gap isolated.
Day 3 · February 20

v1.3.1 — Multiprocessing & Best-Ever Compression

Python Multiprocessing · Exhaustive Search · ~600 Candidates/Frame · 11-Hour Encode

Exhaustive combinatorial search across all candidate configurations — LPC orders, windows, QLP precision — evaluated per frame with MDL arbitration. Expensive. Brutally thorough. This became the permanent compression quality ceiling to beat.
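A sketch of what such a candidate grid might look like, assuming the three axes named above (LPC order, window, QLP precision). The text does not say how the grid is pruned to roughly 600 entries; the unpruned product below has 864, so the real encoder presumably restricts it in a way not described here. The window formulas are the standard Hann and Tukey definitions, and all names are hypothetical.

```python
import math
from itertools import product
from typing import Callable, Dict, List, Tuple


def rectangular(n: int) -> List[float]:
    return [1.0] * n  # "no apodization"


def hann(n: int) -> List[float]:
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def tukey(n: int, alpha: float = 0.5) -> List[float]:
    """Flat top with cosine tapers covering a fraction alpha of the frame."""
    edge = alpha * (n - 1) / 2
    if edge <= 0:
        return rectangular(n)
    out = []
    for i in range(n):
        if i < edge:
            out.append(0.5 * (1 - math.cos(math.pi * i / edge)))
        elif i > (n - 1) - edge:
            out.append(0.5 * (1 - math.cos(math.pi * (n - 1 - i) / edge)))
        else:
            out.append(1.0)
    return out


WINDOWS: Dict[str, Callable[[int], List[float]]] = {
    "none": rectangular,
    "hann": hann,
    "tukey": tukey,
}


def candidate_grid() -> List[Tuple[int, str, int]]:
    """Every (lpc_order, window_name, qlp_precision) combination, unpruned."""
    orders = range(1, 33)      # LPC orders 1-32
    precisions = range(8, 17)  # QLP precision 8-16 bits
    return list(product(orders, WINDOWS, precisions))
```

Each tuple would then be scored by the MDL battle for the current frame, which is why the search is so expensive: hundreds of full encode attempts per frame.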

5,104,935 bytes, 2.1% smaller than FLAC -8. The first time this handwritten encoder surpassed FLAC's highest standard preset.
Day 4 · February 21

v1.4 — DCC Port from HYB4

Digital Claustrum Controller · CodecLearner · 7-bit DCC Events · 1.53% Regression vs v1.3.1

The DCC search governor was ported from the HYB4 FASTA encoder. Its 7-bit event encoding proved too fine-grained: the DCC never found a stable pattern to exploit. Encode time dropped from 11 hours to 102 minutes, but compression regressed by 1.53% versus v1.3.1. Lesson: DCC event granularity matters enormously.

Encode: 11 hr → 102 min. DCC concept proven but not yet settled. 7-bit events too noisy.
Day 5 · February 22 — The Benchmark

v1.5 — DCC Settles. Benchmark at Scale.

Blocksize 16384 · 4-bit DCC Events · DCC Settled First Time · 15-Clip Parallel Corpus · 8 Workers · 75 min

Switching the DCC from 7-bit to 4-bit events was the breakthrough: the controller finally settled on Pink Floyd 192 kHz (u=10→15). Block size was optimized to 16384. Scanner v1.2, clipper v1.2, and parallel pipeline v1.2 were built in the same session, and 15 clips across 10 songs were encoded in parallel on 8 workers.

v1.5 beats FLAC-12 on 7 of 15 clips. Best win: Lady Gaga "Die With A Smile" — 10.3% smaller than FLAC's best. All files lossless-verified via SHA3-256.
Chapter III · Architecture

What Makes It Different

FLAC uses one prediction model for every frame of audio. 8Z-Audio uses up to 600 candidates, evaluated by MDL, with an adaptive search governor that learns which candidates work for this particular signal.

8Z-Audio v1.5 Frame Pipeline
WAV Frame (16384 samples) → DCC Governor (budget allocation) → Candidates (~600 configs) → MDL Battle (true total cost) → Best Predictor (Rice-coded residuals) → .8za Frame (SHA3 verified)
📐

LPC Orders 1–32

Vs. a maximum order of 12 in the FLAC reference encoder's presets (the FLAC format itself allows up to 32). Higher orders capture longer-range correlations in complex harmonic content.

🪟

Three Window Functions

Hann, Tukey, and no apodization, evaluated per frame. FLAC's reference encoder applies a single Tukey window by default and tries more than one window only at its highest presets.

🎯

QLP Precision Search

Exhaustive search over 8–16 bits of quantization precision per frame. FLAC uses a fixed precision per level.

🧠

DCC + CodecLearner

The Digital Claustrum Controller learns per-signal statistics, adapting search depth to where the signal is hardest.

⚖️

MDL Model Selection

Every bit counts. MDL sees the true total cost including predictor overhead — no bit is "free."

🔁

All Stereo Modes

Mid-Side, Left-Side, Right-Side, and Independent — selected per frame by MDL, not globally per file.
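The stereo card above can be made concrete with a small sketch. The mid-side transform shown is the standard lossless integer form (the bit dropped when halving the sum is recovered from the parity of the side channel). The per-frame mode picker uses a sum of absolute values as a stand-in cost, which is an assumption: the document says the real choice is made by MDL.

```python
from typing import Dict, List, Sequence, Tuple


def encode_mid_side(left: Sequence[int], right: Sequence[int]) -> Tuple[List[int], List[int]]:
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def decode_mid_side(mid: Sequence[int], side: Sequence[int]) -> Tuple[List[int], List[int]]:
    left, right = [], []
    for m, s in zip(mid, side):
        total = (m << 1) | (s & 1)  # l + r has the same parity as l - r
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


def proxy_cost(channel: Sequence[int]) -> int:
    """Hypothetical stand-in for the true coded size of a channel."""
    return sum(abs(v) for v in channel)


def pick_stereo_mode(left: Sequence[int], right: Sequence[int]) -> str:
    """Choose the cheapest of the four channel pairings for one frame."""
    mid, side = encode_mid_side(left, right)
    costs: Dict[str, int] = {
        "independent": proxy_cost(left) + proxy_cost(right),
        "left_side": proxy_cost(left) + proxy_cost(side),
        "right_side": proxy_cost(right) + proxy_cost(side),
        "mid_side": proxy_cost(mid) + proxy_cost(side),
    }
    return min(costs, key=costs.get)
```

Because the choice is made per frame, a passage with near-identical channels can use a side-based mode while a hard-panned passage in the same file falls back to independent coding.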

Chapter IV · Results

The Benchmark

15 clips across 10 songs — selected by the 8Z Audio Scanner v1.2 to cover all difficulty categories. Encoded in parallel with 8 workers. All results verified lossless (SHA3-256). Compression ratio: lower is better (fraction of original size).

8Z-Audio v1.5 vs FLAC-12 · Compression Advantage per Clip

Clip | Category | 8ZA v1.5 | FLAC-5 | OFR | vs FLAC-12
--- | --- | --- | --- | --- | ---
Lady Gaga - Die With A Smile | diverse | 0.2090 | 0.2375 | 0.1911 | ▲ 10.3%
Pink Floyd - SOYCD (10s tonal) | tonal | 0.2515 | 0.2744 | 0.2353 | ▲ 5.2%
BD - Fractured Harmony (tonal) | tonal | 0.2535 | 0.2775 | 0.2319 | ▲ 5.1%
Pink Floyd - SOYCD (30s dynamic) | dynamic | 0.2004 | 0.2208 | 0.1802 | ▲ 3.0%
BD - Fractured Harmony (easiest) | easiest | 0.3405 | 0.3670 | 0.3044 | ▲ 2.9%
BD - Awakening AI (tonal) | tonal | 0.3722 | 0.4207 | 0.3338 | ▲ 1.5%
Metallica - Lux Æterna (buildup) | buildup | 0.7935 | 0.7943 | 0.7734 | ▲ 0.03%
BD - Lacrimosa Requiem | diverse | 0.5592 | 0.6347 | 0.5189 | ▼ 0.2%
Metallica - Lux Æterna (transient) | transient | 0.6449 | 0.6448 | 0.6158 | ▼ 0.2%
BD - Between the Strings | dcc_stress | 0.4770 | 0.4879 | 0.4378 | ▼ 1.3%
BD - Echoes of the Shore | dcc_stress | 0.5378 | 0.5961 | 0.4901 | ▼ 1.8%
BD - Where Do I Go | dcc_stress | 0.5263 | 0.5236 | 0.4793 | ▼ 2.0%
Rammstein - Du Hast (diverse) | diverse | 0.4125 | 0.4122 | 0.5162 | ▼ 2.7%
Rammstein - Du Hast (hardest) | hardest | 0.5504 | 0.5246 | 0.7255 | ▼ 5.7%
Rammstein - Du Hast (60s DCC) | dcc_best | 0.5145 | 0.4818 | 0.6008 | ▼ 8.0%

Total bytes — 8ZA: 30,730,661 · FLAC-12: 30,416,285 · FLAC-5: 31,660,589 · OFR: 30,531,285
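The per-clip percentages and the corpus totals answer different questions, so it may help to work the totals through. The sketch below uses only the four byte counts quoted above; the function name and sign convention (positive means 8ZA is smaller) are assumptions for illustration.

```python
TOTAL_BYTES = {
    "8ZA v1.5": 30_730_661,
    "FLAC-12": 30_416_285,
    "FLAC-5": 31_660_589,
    "OFR": 30_531_285,
}


def percent_smaller(ours: int, reference: int) -> float:
    """Positive when `ours` is smaller than `reference`, negative when larger."""
    return 100.0 * (reference - ours) / reference


ours = TOTAL_BYTES["8ZA v1.5"]
aggregate = {
    name: round(percent_smaller(ours, size), 2)
    for name, size in TOTAL_BYTES.items()
    if name != "8ZA v1.5"
}
# aggregate == {"FLAC-12": -1.03, "FLAC-5": 2.94, "OFR": -0.65}
```

Summed over all clips, 8ZA v1.5 is about 2.9% smaller than FLAC-5 but about 1.0% larger than FLAC-12 and 0.7% larger than OptimFROG. The 7-of-15 clip wins are real, but the losses fall on clips with larger compressed sizes, which is why the aggregate tilts the other way.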

Chapter V · Key Discoveries

What the Data Revealed

1.9×

AI Audio Is More Predictable

Scanner data across 10 songs indicates that AI-generated audio (Producer.ai) has a mean difficulty of 0.25, versus 0.50 for human recordings at matched sample rates. AI music is structurally simpler, with more sustained tones and less transient chaos. Business implication: a specialized codec for AI audio platforms could achieve significantly better ratios than general-purpose lossless codecs.

u=10→15

DCC Settled — First Time

The Digital Claustrum Controller reached a stable learned state on Pink Floyd 192 kHz (30s, 563 frames). The critical parameter: 4-bit events instead of 7-bit. The finer granularity of 7-bit encoding produced too much noise for the controller to find a pattern. Coarser events → cleaner signal → stable adaptation. The planned two-pass architecture is intended to eliminate the sequential bottleneck.

+39%

OptimFROG's Genre Weakness

OptimFROG fails badly and consistently on industrial music (Rammstein): its output is 39.3% larger than FLAC-12 on the hardest 10-second clip and 26–28% larger on the other two. 8Z-Audio also loses on Rammstein, but by far less (2.7–8.0%). This is a specific architectural opportunity: if 8Z can handle industrial music better, it becomes a competitive differentiator in an otherwise mature field.

∀ domains

Cross-Domain Transfer Works

DCC: TSP solver → FASTA encoder → audio encoder. Scanner concept: DNA pipeline → audio difficulty pre-screener. MDL framework: 8Z image encoder → audio frame selection. The same architectural patterns — MDL arbitration, per-block model competition, adaptive search governors — produce gains across every domain they've been applied to.

100%

Perfect Category Win Rates

On tonal content: 3 for 3. On dynamic content: 1 for 1. On the "easiest" category: 1 for 1. On buildups: 1 for 1. 8Z-Audio's multi-model approach was purpose-built for structured content — and the benchmark confirmed it. The remaining losses are concentrated in industrial/transient content where LPC's weaknesses are known and targeted for v1.6.

≡ Frozen

The Field Is 18 Years Stale

Every major lossless audio codec, including FLAC (2001), WavPack (1998), TAK (2007), OptimFROG, and Monkey's Audio, relies on linear prediction (block-wise LPC or sample-adaptive linear filters) as its primary or sole prediction model. No production codec performs per-frame multi-model MDL selection, harmonic modeling, or periodic template matching. The field has been architecturally frozen since ~2007. 8Z-Audio is the first systematic attempt to break out of that constraint.

Chapter VI · What's Next

v1.6 — Two-Pass Architecture

The core innovation of v1.6 is separation of concerns: signal analysis (fast, sequential) is decoupled from encoding (slow, parallelizable). Scanner v1.2 becomes Pass 1. The encoder reads its output and processes all frames independently.
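A minimal sketch of the two-pass split, under loud assumptions: the difficulty measure, the budget threshold, and the toy differencing "encoder" are all hypothetical placeholders, and threads are used only to keep the example self-contained. A CPU-bound encoder in Python would use worker processes instead. The structural point is that pass 2 receives everything it needs per frame and shares no state, so frames can be encoded in any order.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Plan = Tuple[int, List[int], int]  # (frame index, samples, search budget)


def scan_frames(samples: Sequence[int], frame_size: int) -> List[Plan]:
    """Pass 1: cheap sequential analysis that assigns each frame a search budget."""
    plans = []
    for index, start in enumerate(range(0, len(samples), frame_size)):
        frame = list(samples[start:start + frame_size])
        steps = [abs(b - a) for a, b in zip(frame, frame[1:])]
        difficulty = sum(steps) / max(len(steps), 1)
        budget = 4 if difficulty > 10 else 2  # hypothetical threshold
        plans.append((index, frame, budget))
    return plans


def difference(seq: List[int]) -> List[int]:
    """First difference, keeping the first sample as warm-up."""
    return seq[:1] + [b - a for a, b in zip(seq, seq[1:])]


def encode_frame(plan: Plan) -> Tuple[int, int, int]:
    """Pass 2 worker: try differencing orders up to the budget, keep the cheapest."""
    index, frame, budget = plan
    residual = list(frame)
    best_order, best_cost = 0, sum(abs(v) for v in residual)
    for order in range(1, budget + 1):
        residual = difference(residual)
        cost = sum(abs(v) for v in residual)
        if cost < best_cost:
            best_order, best_cost = order, cost
    return index, best_order, best_cost


def encode_two_pass(samples: Sequence[int], frame_size: int = 16, workers: int = 4):
    plans = scan_frames(samples, frame_size)  # sequential, fast
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_frame, plans))  # independent frames
```

`pool.map` returns results in submission order, so the output frames can be written sequentially even though they were computed concurrently.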

Priority 1 · Speed

Two-Pass Parallel

Pass 1 (Scanner): ~10s. Pass 2 (Encoder): independent frames, N workers.

40 min → 5 min target
Priority 2 · Adaptive

Adaptive Blocksize

Rate-matched block sizes: 16384 at 44.1kHz, 8192 at 96kHz, 4096 at 192kHz.

+1.6% on 192kHz content
Priority 3 · Hard Content

Rice Partition Opt.

Improved partition parameter search to close the gap on high-entropy frames.

+0.5–1% on transient/industrial
Priority 4 · Entropy

rANS Coding

Replace Rice on high-entropy residuals with range asymmetric numeral systems.

+0.5–1% on hard content
Priority 5 · Tonal

Periodic Predictor

Exploit sample repetition directly in sustained notes and synthesizer content.

+1–3% on tonal clips
Priority 6 · Novel

Harmonic Predictor

Sinusoidal modeling for harmonic content. No lossless audio codec currently does this.

+2–5% on tonal (projected)
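The periodic predictor listed under Priority 5 is not specified further, so the following is only one plausible reading: predict each sample from the sample one period earlier, and choose the period that minimizes the residual. The function names, the absolute-sum cost, and the lag range are assumptions.

```python
from typing import List, Sequence


def periodic_residual(frame: Sequence[int], lag: int) -> List[int]:
    """Keep the first `lag` samples verbatim, then code x[i] - x[i - lag]."""
    return list(frame[:lag]) + [frame[i] - frame[i - lag] for i in range(lag, len(frame))]


def periodic_reconstruct(residual: Sequence[int], lag: int) -> List[int]:
    """Exact inverse of periodic_residual."""
    out = list(residual[:lag])
    for i in range(lag, len(residual)):
        out.append(residual[i] + out[i - lag])
    return out


def best_period(frame: Sequence[int], min_lag: int, max_lag: int) -> int:
    """Pick the lag whose residual (warm-up included) has the smallest absolute sum."""
    def cost(lag: int) -> int:
        return sum(abs(v) for v in periodic_residual(frame, lag))
    return min(range(min_lag, max_lag + 1), key=cost)
```

For a sustained synthesizer note whose waveform repeats exactly, the residual after the first period is all zeros, which an LPC predictor of modest order cannot match when the period is longer than its order.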

v1.6 Targets

From the benchmark session post-mortem — everything needed to push beyond 50% win rate.

Metric | v1.5 | v1.6 target
--- | --- | ---
Clips beating FLAC-12 | 7 / 15 (47%) | 10 / 15
Best single-clip win | 10.3% | 15%+
Encode time | 40 min (PF 192kHz) | 5–10 min
Parallel pipeline | 75 min (14 clips, 8×) | 15 min

The Music

10 Songs, 6 Playable

Six original compositions by Bojan Dobrečevič, created via Producer.ai, serve as the AI half of the 8Z-Audio benchmark corpus. Scanner data rates AI-generated audio as 1.9× more predictable than human recordings, which makes these tracks ideal test subjects. Each is available as a 10–30 second benchmark clip (WAV) and as the full song (MP3).

Fractured Harmony
easiest tonal
8ZA vs FLAC-12: +2.9% win (easiest) · +5.1% win (tonal)
Where Do I Go
dcc_stress
8ZA vs FLAC-12: −2.0%
Awakening AI
tonal
8ZA vs FLAC-12: +1.5% win
Echoes of the Shore
dcc_stress
8ZA vs FLAC-12: −1.8%
Lacrimosa Requiem
diverse
8ZA vs FLAC-12: −0.2%
Between the Strings
dcc_stress
8ZA vs FLAC-12: −1.3%
Reference Recordings — Copyright protected, no player
Pink Floyd
Shine On You Crazy Diamond
192 kHz · 24-bit · progressive
8ZA: +3.0% (dynamic) · +5.2% (tonal)
Metallica
Lux Æterna
thrash metal · 44.1 kHz
8ZA: +0.03% (buildup) · −0.2% (transient)
Lady Gaga
Die With A Smile
pop · 44.1 kHz
8ZA: +10.3% ← best win overall
Rammstein
Du Hast
industrial · 48 kHz
8ZA: −2.7% to −8.0%
OFR catastrophic: −26 to −39% on industrial