OneCharacterCode Real-World Benchmark Demo

A reproducible compression and reconstruction test comparing raw files, standard compression, and OneCharacterCode symbolic encoding.

Important disclosure. This page reports the exact results of this test run. It does not claim universal compression superiority. The OCC values come from a prototype symbolic dictionary encoder, clearly labeled as such, which is included so the reconstruction round-trip can be verified locally by anyone with PowerShell. The prototype is not the final patented OneCharacterCode engine. Reproducibility files and SHA-256 hashes are included.

Results

Raw Bytes is the input file size. Gzip and Brotli are standard compressors. OCC Symbolic is the prototype encoder. OCC Reduction is the percentage by which the OCC output is smaller than the raw input, (raw - OCC) / raw × 100: positive numbers mean the file shrank, negative numbers mean it grew. Reconstruction must PASS for any compression number to be meaningful.
Test File | Raw Bytes | Gzip Bytes | Brotli Bytes | OCC Symbolic Bytes | OCC Reduction | Reconstruction | SHA-256 Match
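The reduction figure can be reproduced by hand from any two byte counts. The sketch below is in Python rather than the page's PowerShell, and it assumes the conventional definition (raw minus encoded, divided by raw). The function name and the sample byte counts are hypothetical and are not results from this test run.

```python
def reduction_percent(raw_bytes: int, encoded_bytes: int) -> float:
    """Percentage by which the encoded size is smaller than the raw size.

    Positive means the encoder shrank the input; negative means it grew.
    """
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    return (raw_bytes - encoded_bytes) / raw_bytes * 100.0


# Hypothetical byte counts, for illustration only.
print(reduction_percent(1000, 750))   # encoded output is smaller: positive
print(reduction_percent(1000, 1200))  # encoded output is larger: negative
```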

Flow

Raw Input
Symbolic Encoding
Tiny Carrier
Local Reconstruction
Hash Match

Downloads

Technical notes

Symbolic encoding. A symbolic encoder finds recurring patterns in the input and assigns each pattern a short symbol. The output is a small dictionary plus a body in which patterns are replaced by symbols. A decoder expands the symbols back into the original bytes at read time. Standard compressors build on the same idea and add more: gzip's DEFLATE format combines LZ77 sliding-window matching with Huffman coding, and Brotli adds context modeling and a built-in static dictionary. That is why they are mature and hard to beat on short, English-like inputs.
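To make the dictionary-plus-body idea concrete, here is a toy substring-dictionary codec in Python. It is not the OCC prototype encoder and does not reproduce its file format. The header layout, the word-based pattern selection, the `max_entries` parameter, and the function names are all assumptions made for illustration. Symbols are drawn from byte values that never occur in the input, which keeps decoding unambiguous.

```python
from collections import Counter


def encode(data: bytes, max_entries: int = 32) -> bytes:
    """Toy format: [entry count][symbol, length, pattern]... then the body."""
    used = set(data)
    # Only byte values absent from the input are safe to use as symbols.
    free_symbols = [b for b in range(256) if b not in used]
    # Candidate patterns: whitespace-separated tokens of 3 to 255 bytes.
    counts = Counter(w for w in data.split() if 3 <= len(w) <= 255)
    # Estimated saving: each occurrence shrinks to one byte,
    # and each dictionary entry costs len(pattern) + 2 header bytes.
    scored = sorted(
        ((n * (len(w) - 1) - (len(w) + 2), w) for w, n in counts.items()),
        reverse=True,
    )
    patterns = [w for saving, w in scored if saving > 0]
    limit = min(max_entries, 255)
    table = list(zip(free_symbols, patterns[:limit]))

    body = data
    header = bytearray([len(table)])
    for symbol, pattern in table:
        body = body.replace(pattern, bytes([symbol]))
        header += bytes([symbol, len(pattern)]) + pattern
    return bytes(header) + body


def decode(blob: bytes) -> bytes:
    """Rebuild the dictionary from the header, then expand each symbol."""
    entry_count = blob[0]
    pos = 1
    table = {}
    for _ in range(entry_count):
        symbol, length = blob[pos], blob[pos + 1]
        table[symbol] = blob[pos + 2 : pos + 2 + length]
        pos += 2 + length
    out = bytearray()
    for b in blob[pos:]:
        out += table.get(b, bytes([b]))
    return bytes(out)


sample = b"the quick brown fox jumps over the lazy dog; " * 20
packed = encode(sample)
print(len(sample), len(packed), decode(packed) == sample)
```

On highly repetitive text like the sample, this toy shrinks the input. On short or non-repetitive inputs the header can outweigh the savings, which is the same effect a negative reduction figure reports.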

Spiral-inspired mapping. The full OneCharacterCode design treats the symbol space as a navigable structure (think of an index that spirals outward by frequency) so that the carrier file references a much larger external lexicon rather than carrying its own dictionary. That design is an experimental architecture and is not demonstrated here. This page demonstrates only the simpler prototype substring-dictionary step that the production engine builds on.

Local reconstruction. Every test on this page is end-to-end verified: the encoded file is decoded back into bytes, and those bytes are SHA-256 hashed and compared against the SHA-256 hash of the original input. A compression number with a failed reconstruction is meaningless and is reported as FAIL.
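The round-trip check described above can be sketched in a few lines of Python. gzip stands in for the encoder here, since the OCC prototype is not reproduced on this page. The function names and the result dictionary are hypothetical and do not reflect the layout of the page's PowerShell script.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_round_trip(original: bytes, encode, decode) -> dict:
    """Encode, decode, and compare SHA-256 hashes of original and restored bytes."""
    encoded = encode(original)
    restored = decode(encoded)
    original_hash = sha256_hex(original)
    restored_hash = sha256_hex(restored)
    return {
        "raw_bytes": len(original),
        "encoded_bytes": len(encoded),
        "sha256_original": original_hash,
        "sha256_restored": restored_hash,
        "reconstruction": "PASS" if original_hash == restored_hash else "FAIL",
    }


sample = b"example input for the round-trip check\n" * 10
result = verify_round_trip(sample, gzip.compress, gzip.decompress)
print(result["reconstruction"], result["raw_bytes"], result["encoded_bytes"])
```

Any encoder and decoder pair with the same bytes-in, bytes-out shape can be passed in place of the gzip functions. A decoder that loses even one byte produces a different hash and a FAIL.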

Benchmark limitations. The three inputs here are short (kilobyte scale). Compression behavior changes substantially on longer inputs and on different content types. Brotli support ships with .NET Core 2.1 and later and is not present in Windows PowerShell 5.1 / .NET Framework 4.x, so the Brotli column reads n/a when the runtime does not provide it. Future runs should include zstd, xz, and lzma, as well as larger and more diverse corpora.
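The n/a fallback for a missing compressor can be sketched as follows. This is a Python analogue, not the page's PowerShell logic. It assumes the third-party `brotli` package may or may not be installed, and it uses the standard-library `lzma` module (which writes the xz container by default) as an example of an additional compressor. The function name is hypothetical.

```python
import gzip
import lzma

try:
    import brotli  # third-party package; may be absent on this machine
except ImportError:
    brotli = None


def compressed_sizes(data: bytes) -> dict:
    """Byte counts per compressor, with "n/a" where the runtime lacks one."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "xz": len(lzma.compress(data)),
        "brotli": len(brotli.compress(data)) if brotli is not None else "n/a",
    }


sample = b"a short, repetitive, English-like test input. " * 50
for name, size in compressed_sizes(sample).items():
    print(f"{name:>6}: {size}")
```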

Independent testing. The point of this page is reproducibility. Any reader can download README_REPRODUCE.txt, run the same PowerShell script on the same inputs, and confirm the same hashes and the same byte counts.