OneCharacterCode V3 Stacked Compression Test

OCC + Gzip Transport Size — the fair comparison: gzip(raw) vs gzip(OCC V3 carrier), with full receiver-side reconstruction verified.

The correction this page answers. “This test answers Bret Fencl's correction: if OneCharacterCode data is transmitted, it can still be compressed again with gzip. The fair transport comparison is gzip(raw) versus gzip(OCC carrier).”

What is measured

baseline transport:  raw → gzip(raw)
stacked  transport:  raw → OCC V3 → gzip(OCC V3)
receiver:  gzip(OCC V3) → gunzip → decode V3 → raw bytes
verification:  SHA-256(original) ≡ SHA-256(reconstructed)
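
The four steps above can be sketched as a small measurement harness. This is a minimal sketch, not the page's actual test code: `measure`, `toy_encode`, and `toy_decode` are hypothetical names, and the toy codec (a byte reversal) is only a reversible placeholder standing in for the OCC V3 encoder and decoder, which are not reproduced here.

```python
import gzip
import hashlib
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def measure(raw: bytes,
            encode: Callable[[bytes], bytes],
            decode: Callable[[bytes], bytes]) -> dict:
    """Measure both transport paths for one input and verify the roundtrip."""
    # Baseline transport: raw -> gzip(raw)
    gz_raw = gzip.compress(raw, mtime=0)

    # Stacked transport: raw -> carrier -> gzip(carrier)
    carrier = encode(raw)
    gz_carrier = gzip.compress(carrier, mtime=0)

    # Receiver: gzip(carrier) -> gunzip -> decode -> raw bytes
    reconstructed = decode(gzip.decompress(gz_carrier))

    return {
        "raw_bytes": len(raw),
        "gzip_raw_bytes": len(gz_raw),
        "carrier_bytes": len(carrier),
        "gzip_carrier_bytes": len(gz_carrier),
        # Verification: SHA-256(original) must equal SHA-256(reconstructed)
        "sha256_match": sha256_hex(raw) == sha256_hex(reconstructed),
    }


# Placeholder codec (an assumption, NOT OCC V3): a reversible byte reversal.
def toy_encode(data: bytes) -> bytes:
    return data[::-1]


def toy_decode(data: bytes) -> bytes:
    return data[::-1]


result = measure(b"hello world " * 100, toy_encode, toy_decode)
print(result)
```

Passing the codec in as a pair of functions keeps the harness independent of any particular encoder, so the same measurement could be repeated with a real encoder and decoder in place of the placeholder.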

For each file we report the actual transmitted byte counts on both paths, the winner, and the receiver-side roundtrip status. The roundtrip must PASS and the SHA-256 hashes must match exactly for any number on this page to count.

Rule. An OCC win is declared only when gzip(OCC V3) is strictly smaller than gzip(raw) and the receiver-side roundtrip passes. Otherwise the page reports “Gzip(raw) still wins for this file.”
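
The rule can be written as a single predicate. The sketch below is an illustration under assumptions: the function name `best_transport` and its return labels are hypothetical, not taken from the test code.

```python
def best_transport(gzip_raw_bytes: int,
                   gzip_occ_bytes: int,
                   roundtrip_ok: bool) -> str:
    """Apply the win rule: OCC wins only on a strict size win plus a passing roundtrip."""
    if roundtrip_ok and gzip_occ_bytes < gzip_raw_bytes:
        return "gzip(OCC V3)"
    # Ties, larger stacked output, or a failed roundtrip all fall back to gzip(raw).
    return "gzip(raw)"
```

Note that a tie goes to gzip(raw), because the rule requires gzip(OCC V3) to be strictly smaller.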

Results

Gzip(OCC) vs gzip(raw) % = (1 − gzip(OCC V3) / gzip(raw)) × 100. A positive value means gzip(OCC V3) is smaller than gzip(raw); a negative value means it is larger. Roundtrip must PASS and SHA-256 must match.
File | Raw Bytes | Gzip(raw) Bytes | OCC V3 Bytes | Gzip(OCC V3) Bytes | Best Transport Winner | Gzip(OCC) vs Gzip(raw) | Roundtrip | SHA-256 Match
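
The percentage column follows directly from the formula given above the table. A minimal sketch, assuming a hypothetical helper name `occ_vs_raw_percent`; the byte counts in the usage lines are made-up illustrative values, not measurements from this test.

```python
def occ_vs_raw_percent(gzip_occ_bytes: int, gzip_raw_bytes: int) -> float:
    """Return (1 - gzip(OCC V3) / gzip(raw)) * 100."""
    if gzip_raw_bytes <= 0:
        raise ValueError("gzip(raw) byte count must be positive")
    return (1 - gzip_occ_bytes / gzip_raw_bytes) * 100


# Illustrative values only (not results from this page):
print(occ_vs_raw_percent(750, 1000))   # 25.0  -> stacked path is 25% smaller
print(occ_vs_raw_percent(1250, 1000))  # -25.0 -> stacked path is 25% larger
```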

Flow

Raw Input → OCC V3 Encode (3-tier dictionary) → Gzip → Transmit / Store → Gunzip → OCC V3 Decode → SHA-256 Verify

Technical notes

Why this test exists. The earlier V3 page reported file-compression numbers (OCC V3 carrier vs. raw) and a separately labeled bandwidth simulation. It did not answer the natural follow-up: in real transport you don't ship the OCC carrier bare; you also gzip it on the wire. The fair comparison is therefore gzip(raw) vs. gzip(OCC carrier). That is what this page measures.

What the result means if Gzip(raw) wins. The OCC V3 dictionary substitutions consume some of the redundancy that gzip would otherwise have exploited. When the input is short and English-like, gzip alone already captures most of the redundancy with its LZ77 window and Huffman coding. Running OCC first removes structure that gzip would have used, and the second-stage gzip cannot make up the gap. This is a real cost and we report it as a real loss.

What it would take for OCC to win this test. Inputs where the dictionary entries capture redundancy that gzip's 32 KB window cannot see (very long-range repeats, larger inputs, or structured corpora) should narrow or flip the comparison. So should an entropy coder downstream of the dictionary that does not pre-collide with gzip's redundancy model. Both are listed under “next optimization” below.
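
The 32 KB window limit can be observed with gzip alone. The sketch below is an illustration under assumptions and does not involve OCC: it builds two synthetic inputs from seeded pseudo-random bytes, one repeating at a distance of 16 KB (inside the window) and one repeating at 64 KB (outside it). The helper names are hypothetical.

```python
import gzip
import random


def pseudo_random_bytes(n: int, seed: int) -> bytes:
    """Deterministic, essentially incompressible filler bytes."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


def gzip_ratio(data: bytes) -> float:
    """Compressed size divided by original size."""
    return len(gzip.compress(data, compresslevel=9, mtime=0)) / len(data)


# Same block twice, repeat distance 16 KB: within DEFLATE's 32 KB window.
near = pseudo_random_bytes(16 * 1024, seed=1) * 2
# Same block twice, repeat distance 64 KB: beyond the 32 KB window.
far = pseudo_random_bytes(64 * 1024, seed=2) * 2

near_ratio = gzip_ratio(near)
far_ratio = gzip_ratio(far)
print(f"repeat at 16 KB: ratio {near_ratio:.3f}")
print(f"repeat at 64 KB: ratio {far_ratio:.3f}")
```

The second copy in the 16 KB case can be encoded as back-references, so that input should shrink to a little over half its size, while the 64 KB case should stay close to its original size because gzip cannot reach the first copy. A dictionary stage that records such long-range repeats would be supplying information gzip cannot recover on its own.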

What this test is not. Not a comparison against zstd / xz / lzma. Not a claim about larger or differently structured inputs. Not the production OneCharacterCode engine. The encoder used here is the exact V3 prototype that produced the public V3 file-compression numbers.