The original seed was simple: if a file contains bits that can be described as digits or bits of π, why store those bits directly? PiC turns that seed into a strict minimum description length (MDL) experiment: use π, e, φ, √2 and other deterministic streams only when a pointer, transform and residual together are cheaper than ordinary storage or compression.
Short name: PiC = Pi Compression / Constants Compression. It is the compression sibling of PiX.
PiC preserves the original intuition without turning it into a magical claim. The idea is allowed to live, but only behind a hard accounting gate.
Can data in a file be replaced by coordinates inside π?
Store constant ID, offset, length, transform and residual instead of raw bits.
Accept only if the whole description is shorter than RAW, zstd, PNG/FLAC, or other baselines.
PiC is a falsifiable experiment, not a statement that π compresses ordinary files.
constant_stream[offset:length] + transform + residual, but only if this is cheaper and reconstructs exactly, or meets a declared lossy quality target.
A file is not naturally decimal text. It is a byte stream and therefore also a bit stream. π is a decimal digit stream, but it can be converted into several binary or symbolic streams. PiC searches for useful alignments between them.
file bytes: 137 080 078 071 ...
file bits: 10001001 01010000 01001110 01000111 ...
chunk: b[i : i+n]
π digits: 314159265358979323846...
π as BCD: 0011 0001 0100 0001 0101 1001 ...
π as bytes: 24 3F 6A 88 ... (hex/binary stream)
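The views listed above can be reproduced in a few lines. This is a minimal sketch, not the project's code: the helper names `to_bits` and `to_dec3` are hypothetical, and the π bytes are hardcoded from the hex prefix quoted above rather than computed.

```python
def to_bits(data: bytes) -> str:
    """Render bytes as '0'/'1' characters, 8 bits per byte, MSB first."""
    return "".join(f"{b:08b}" for b in data)


def to_dec3(data: bytes) -> str:
    """Render each byte as a zero-padded decimal triple (000-255)."""
    return "".join(f"{b:03d}" for b in data)


# The four file bytes quoted above.
header = bytes([137, 80, 78, 71])
file_bits = to_bits(header)
file_dec3 = to_dec3(header)

# First four bytes of π's hexadecimal fraction, hardcoded (not computed here).
pi_hex_bytes = bytes.fromhex("243F6A88")
pi_bits = to_bits(pi_hex_bytes)

print(file_bits)   # 10001001010100000100111001000111
print(file_dec3)   # 137080078071
print(pi_bits)     # 00100100001111110110101010001000
```

The decimal-triple view is what PI_DEC3 would search for in π's digits; the bit view is what PI_BIN and PI_BCD compare against.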
This is the simple version you were imagining: a file bit segment equals π digits converted to bits. The encoder stores the π pointer instead of the literal segment.
file bits around segment:
11001010 0011000101000001010110010010 01101100
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         28 bits to replace
π digits: 3 1 4 1 5 9 2
BCD: 0011 0001 0100 0001 0101 1001 0010
store:
mode = PI_BCD
offset = 0
len_digits = 7
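The 28-bit example can be checked mechanically. The sketch below assumes a hardcoded π digit prefix (the one quoted earlier) and hypothetical helper names; a real scout would generate digits on demand and search many offsets.

```python
PI_DIGITS = "314159265358979323846"  # prefix quoted in the text, hardcoded


def digits_to_bcd(digits: str) -> str:
    """Encode decimal digits as 4-bit BCD nibbles ('0'/'1' characters)."""
    return "".join(f"{int(d):04b}" for d in digits)


def find_bcd_match(file_bits: str, offset: int, len_digits: int) -> int:
    """Bit position in file_bits where the π BCD segment occurs, or -1."""
    pattern = digits_to_bcd(PI_DIGITS[offset:offset + len_digits])
    return file_bits.find(pattern)


# The file bits from the example: 8 bits, the 28-bit segment, 8 bits.
file_bits = "11001010" + "0011000101000001010110010010" + "01101100"

pos = find_bcd_match(file_bits, offset=0, len_digits=7)
token = {"mode": "PI_BCD", "offset": 0, "len_digits": 7}

# Rebuild the file bits from the literal parts plus the regenerated segment.
segment = digits_to_bcd(PI_DIGITS[token["offset"]:token["offset"] + token["len_digits"]])
rebuilt = file_bits[:pos] + segment + file_bits[pos + len(segment):]

print(pos)                   # 8
print(rebuilt == file_bits)  # True
```

Note that the token still has to record where in the file the segment goes, and that this match was planted at offset 0; neither cost is counted here.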
PiC should not treat π as the only source. π is one deterministic tape in a family of tapes. The claim only becomes interesting if a constant beats random controls and other constants under the same budget.
| Mode | Meaning | Best use | Risk |
|---|---|---|---|
| PI_BCD | Decimal digits encoded as 4-bit nibbles. | Your direct “π digits to bits” idea. | Wasteful: only 10 of 16 nibble values used. |
| PI_ASCII | Digits stored as ASCII bytes: “3”, “1”, “4”… | Text-like or symbolic files. | Usually bad for binary files. |
| PI_DEC3 | File bytes written as decimal triples 000–255, then searched in π. | Human-readable bridge from bytes to decimal digits. | Triples above 255 are invalid unless remapped. |
| PI_HEX / PI_BIN | Use hexadecimal/binary expansion of constants as true bytes/bits. | Cleanest raw byte-level PiC mode. | Exact matches should be rare in normal data. |
| CONST_XOR | Encode chunk as constant stream XOR residual. | Near-matches and sparse differences. | Residual may cost as much as original. |
| CONST_CA | Seed a cellular automaton from π/e/φ/√2 and compare generated texture. | Images, masks, grids, synthetic textures. | Easy to overfit without controls. |
| CONST_LOSSY | Use constants as texture/noise/basis for controlled lossy coding. | Image/audio/sensor residual shaping. | Must beat normal lossy codecs at same quality. |
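As an illustration of the CONST_XOR row, the sketch below XORs a chunk against a constant stream and inspects the residual. The stream is the first eight bytes of π's hexadecimal fraction, hardcoded; the function name and the artificially constructed near-match are assumptions made for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (used for both encode and decode)."""
    if len(a) != len(b):
        raise ValueError("chunk and stream must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# First 8 bytes of π's hex fraction, hardcoded for the sketch.
stream = bytes.fromhex("243F6A8885A308D3")

# A planted near-match: the stream with two bit flips.
chunk = bytearray(stream)
chunk[2] ^= 0x01
chunk[5] ^= 0x80
chunk = bytes(chunk)

residual = xor_bytes(chunk, stream)
nonzero = sum(1 for b in residual if b)

print(residual.hex())                        # 0000010000800000
print(nonzero)                               # 2
print(xor_bytes(stream, residual) == chunk)  # True
```

A sparse residual is only a win if its encoded size plus the pointer is smaller than the chunk; for unrelated data the residual is as dense as the original, which is the risk named in the table.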
The strict version must reconstruct the exact original bytes. It is the right first test because it cannot hide behind visual similarity.
A PiC token is kept only if token + residual + checksum is smaller than the best ordinary candidate.
Decoder regenerates the constant stream, applies transform/residual, then verifies SHA3 or another strong hash of the reconstructed chunk.
If the pointer cost or residual cost loses, the encoder stores RAW, zstd, lzma, PNG, FLAC, or the normal 8Z winner.
PiCChunk {
  mode: PI_BIN | PI_BCD | PI_DEC3 | CONST_XOR | CONST_CA
  constant_id: PI | E | PHI | SQRT2 | RNG_CONTROL
  offset: unsigned integer
  length: unsigned integer
  stride: optional integer
  transform: optional transform ID + params
  residual: compressed residual bytes
  hash: reconstructed chunk hash
}
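One way to picture the record and the search-free decoder is the sketch below. It is an assumption-heavy illustration, not the PiC format: it handles only PI_BCD on π, counts `offset` and `length` in decimal digits, uses zlib for the residual and SHA3-256 for the hash, omits `stride` and `transform`, and generates digits with Gibbons' unbounded spigot. The `encode` helper takes the offset as given; searching for a good offset is out of scope here.

```python
import hashlib
import zlib
from dataclasses import dataclass
from itertools import islice


def pi_digits():
    """Yield decimal digits of π (3, 1, 4, ...) using Gibbons' unbounded spigot."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


def pi_bcd_bytes(offset: int, n_digits: int) -> bytes:
    """Pack n_digits of π, starting at digit `offset`, two BCD nibbles per byte."""
    if n_digits % 2:
        raise ValueError("this sketch only packs an even number of digits")
    d = list(islice(pi_digits(), offset, offset + n_digits))
    return bytes((d[i] << 4) | d[i + 1] for i in range(0, n_digits, 2))


@dataclass(frozen=True)
class PiCChunk:
    mode: str          # only "PI_BCD" is handled in this sketch
    constant_id: str   # only "PI" is handled in this sketch
    offset: int        # digit offset into the constant (assumed unit)
    length: int        # number of digits (assumed unit)
    residual: bytes    # zlib-compressed XOR residual
    hash: bytes        # SHA3-256 of the reconstructed chunk


def encode(data: bytes, offset: int) -> PiCChunk:
    """Build a chunk for a given offset; no search is performed."""
    stream = pi_bcd_bytes(offset, 2 * len(data))
    residual = bytes(a ^ b for a, b in zip(data, stream))
    return PiCChunk("PI_BCD", "PI", offset, 2 * len(data),
                    zlib.compress(residual), hashlib.sha3_256(data).digest())


def decode(chunk: PiCChunk) -> bytes:
    """Regenerate the stream, apply the residual, then verify the hash."""
    if (chunk.mode, chunk.constant_id) != ("PI_BCD", "PI"):
        raise NotImplementedError("sketch covers PI_BCD on π only")
    stream = pi_bcd_bytes(chunk.offset, chunk.length)
    residual = zlib.decompress(chunk.residual)
    out = bytes(a ^ b for a, b in zip(stream, residual))
    if hashlib.sha3_256(out).digest() != chunk.hash:
        raise ValueError("hash mismatch: reconstruction rejected")
    return out


data = bytes([0x31, 0x41, 0x59, 0x27])   # π's BCD prefix with one bit changed
print(decode(encode(data, 0)) == data)   # True
```

The decoder does a fixed amount of work per chunk: regenerate, XOR, hash. All the expensive searching stays on the encoder side.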
Yes: if lossy compression is allowed, π and other constants will be “found” much more often. That is powerful, but dangerous. Approximate matches are easy; meaningful wins are hard.
In images, constants can act as deterministic grain, masks, phase fields, tile orderings, or residual bases. Test at equal PSNR/SSIM or a fixed perceptual error budget.
In audio and sensor signals, constants can act as deterministic dither/noise/residual scaffolds. Test at equal SNR, spectral error, or perceptual score against normal codecs.
The core formula is simple. PiC is not accepted because it is beautiful. It is accepted only when the total description is cheaper.
L_total = L_token + L_transform + L_residual + L_hash
accept if: L_total < min(L_RAW, L_zstd, L_lzma, L_png, L_8Z)
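The lossless gate can be sketched with standard-library codecs. This is an assumption-laden stand-in: only RAW, zlib and lzma are used as baselines because zstd, PNG and 8Z are not available in the Python standard library, all lengths are counted in bits, and the function names are hypothetical.

```python
import lzma
import zlib


def baseline_bits(chunk: bytes) -> int:
    """Cheapest ordinary description of the chunk, in bits (RAW, zlib, lzma)."""
    candidates = [
        len(chunk),                    # RAW
        len(zlib.compress(chunk, 9)),  # stand-in for a fast general codec
        len(lzma.compress(chunk)),     # L_lzma
    ]
    return 8 * min(candidates)


def accept_lossless(l_token: int, l_transform: int, l_residual: int,
                    l_hash: int, chunk: bytes) -> bool:
    """Accept the PiC description only if L_total beats every baseline."""
    l_total = l_token + l_transform + l_residual + l_hash
    return l_total < baseline_bits(chunk)


chunk = bytes(range(256))  # hard for generic codecs, so RAW is the baseline
print(accept_lossless(64, 0, 0, 256, chunk))       # True: 320 bits vs RAW
print(accept_lossless(8 * 256, 0, 0, 256, chunk))  # False: costs more than RAW
```

Because RAW is always among the candidates, a PiC description that is not strictly shorter than the literal bytes can never pass.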
L_total = L_token + L_transform + L_residual + λD
accept if: rate is lower at equal declared distortion D
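For the lossy rule, a minimal sketch of the comparison follows. It assumes mean squared error as the distortion D (the text leaves the metric open), and the names `mse`, `psnr`, `rd_cost` and `accept_lossy` are illustrative only.

```python
import math


def mse(a, b) -> float:
    """Mean squared error between two equal-length sample sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for an exact reconstruction."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak * peak / err)


def rd_cost(rate_bits: float, distortion: float, lam: float) -> float:
    """Rate-distortion cost: rate plus λ times distortion."""
    return rate_bits + lam * distortion


def accept_lossy(pic_rate: float, pic_dist: float,
                 ref_rate: float, ref_dist: float) -> bool:
    """Accept only if PiC is no worse in distortion and strictly cheaper in rate."""
    return pic_dist <= ref_dist and pic_rate < ref_rate


print(mse([0, 0], [3, 4]))                  # 12.5
print(accept_lossy(100, 2.0, 120, 2.0))     # True
print(accept_lossy(100, 3.0, 120, 2.0))     # False: cheaper but more distorted
```

Comparing rates only at matched distortion keeps a cheap but blurrier PiC result from counting as a win.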
The central obstacle is pointer cost. Small matches lose because the pointer is bigger than the data it replaces.
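A rough back-of-envelope version of that obstacle: if the constant's symbols behave like uniform random symbols (an assumption; normality of π is not proven), a pattern of n symbols over an alphabet of size s is expected to first appear at a position on the order of s^n, so the offset alone costs about n·log2(s) bits. The sketch below uses hypothetical names and a hypothetical 16-bit overhead for mode and length fields.

```python
import math


def offset_bits(alphabet_size: int, match_symbols: int) -> float:
    """Approximate bits to address the expected first occurrence of a pattern."""
    return match_symbols * math.log2(alphabet_size)


def net_saving(replaced_bits: float, alphabet_size: int,
               match_symbols: int, overhead_bits: float) -> float:
    """Bits saved by the pointer (negative means the pointer loses)."""
    return replaced_bits - (offset_bits(alphabet_size, match_symbols) + overhead_bits)


# PI_BIN: a 32-bit match replaces 32 file bits, but the offset costs ~32 bits.
print(net_saving(32, 2, 32, overhead_bits=16))   # -16.0

# PI_BCD: 7 digits replace 28 file bits; the offset costs ~23.25 bits.
print(round(net_saving(28, 10, 7, overhead_bits=16), 2))
```

In the binary case the offset cancels the match exactly, so any overhead at all is a net loss. The BCD case looks slightly better per digit only because BCD wastes 6 of 16 nibble values, and a typical file chunk is rarely valid BCD in the first place.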
PiC becomes real only if it survives boring controls. The goal is not to protect π. The goal is to discover whether constants give any useful generator family at all.
Compare π against e, φ, √2, a cryptographic-quality deterministic random stream, shuffled π, and standard codecs.
If random wins equally often, the signal is not π. It may still be useful as generator search, but not as a special π claim.
If PiC cannot beat controls after fair tuning, keep it as a creative/scouting branch, not as a compression claim.
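The control comparison can be prototyped on digit strings. This sketch assumes a hardcoded 50-digit π prefix, a shuffled copy of it, and a seeded pseudo-random digit stream as controls; the scoring function (longest common substring length) is one simple choice, not the project's metric. The test data is planted from π, so π wins here by construction.

```python
import random

PI_50 = "31415926535897932384626433832795028841971693993751"  # hardcoded prefix


def longest_match(data: str, stream: str) -> int:
    """Length of the longest substring of `data` that occurs anywhere in `stream`."""
    best = 0
    for i in range(len(data)):
        # Try to beat the current best with a longer substring starting at i.
        while i + best < len(data) and data[i:i + best + 1] in stream:
            best += 1
    return best


def control_streams(seed: int = 0) -> dict:
    """π plus two same-length controls: shuffled π and seeded random digits."""
    rng = random.Random(seed)
    shuffled = list(PI_50)
    rng.shuffle(shuffled)
    rng_digits = "".join(rng.choice("0123456789") for _ in range(len(PI_50)))
    return {"pi": PI_50, "shuffled_pi": "".join(shuffled), "rng": rng_digits}


# Synthetic planted control: 20 digits of π wrapped in unrelated digits.
planted = "77" + PI_50[10:30] + "88"
scores = {name: longest_match(planted, s) for name, s in control_streams().items()}
print(scores)
```

On unplanted data the interesting outcome is the opposite one: if `shuffled_pi` and `rng` score as well as `pi` under the same budget, there is no π-specific signal to claim.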
Pi Hunter was an important precursor, but it was not yet this raw-file compressor. It searched symbolic image textures against π/CA streams. PiC is the sharper file-compression formulation.
| System | What it reads | What it searches | Status |
|---|---|---|---|
| Pi Hunter v2.3 | Image resized/quantized into a symbolic grid. | π digit streams, π-seeded CA rules, spatial mappings, residual scoring. | Scout / texture testbed, not a full codec. |
| PiX | Sequences, levels, rhythms, schedules, decision candidates. | π as universal MDL candidate generator across domains. | Umbrella page and sequence laboratory. |
| PiC | Raw bytes/bits/samples/pixels/chunks. | Constant pointer + transform + residual under MDL. | This page: clean spec seed for the next experiment. |
The smallest useful program is not a full 8Z integration. It is a scout that directly tests your original idea on chunks and reports honest wins/losses.
1. PI_BCD exact
2. PI_BIN exact
3. PI_DEC3 byte triples
4. CONST_XOR residual
5. CONST_CA symbolic texture
6. RNG/e/φ/√2 controls
raw BMP / TIFF / PGM
raw WAV / PCM
FASTA / CSV / plain text
synthetic planted controls
already-compressed PNG/JPEG/MP3 as negative controls
PiC should eventually become one optional generator family inside 8Z, not a replacement for normal compression. DCC chooses when to try it; MDL decides whether it survives.
Try PiC candidates only on chunks where cheap scouts suggest possible structure. Record timing, false positives, residual size, and final MDL decision.
The decoder performs no search. It only regenerates the declared constant stream, applies the declared transform and residual, and verifies the hash.