
Commit c63f0af

Add block visualizer (#32)
* Add block visualizer
* doc tweaks.
* Add stats.
* Remove Len and annoying scrollbar
* More stats
1 parent 5a44d11 commit c63f0af

3 files changed: +1684 −16 lines changed

README.md

Lines changed: 25 additions & 16 deletions
@@ -1,14 +1,19 @@
 # MinLZ

-MinLZ is a LZ77-type compressor with a fixed byte-aligned encoding, in the similar class to Snappy and LZ4.
+MinLZ is a LZ77-type compressor with a fixed byte-aligned encoding,
+in the similar class to Snappy and LZ4.

-The goal of MinLZ is to provide a fast, low memory compression algorithm that can be used for fast compression of data,
-where encoding and/or decoding speed is the primary concern.
+The goal of MinLZ is to provide a fast, low memory compression algorithm
+that can be used for fast compression of data, where encoding
+and/or decoding speed is the primary concern.

-MinLZ is designed to operate *faster than IO* for both compression and decompression and be a viable "always on"
-option even if some content already is compressed.
-If slow compression is acceptable, MinLZ can be configured to produce a high compression ratio,
-while retaining high decompression speed.
+MinLZ is designed to operate *faster than IO* for both
+compression and decompression.
+
+It is a viable "always on" option, even if some content already is compressed.
+
+If slow compression is acceptable, MinLZ can be configured to produce a high
+compression ratio while retaining a high decompression speed.

 * Best in class compression
 * Block or Streaming interfaces
@@ -18,8 +23,8 @@ while retaining high decompression speed.
 * Adjustable Compression (4 levels)
 * Concurrent stream Compression
 * Concurrent stream Decompression
-* Skip forward in compressed stream via independent blocks
-* Random seeking with optional indexes
+* Skip forward in compressed streams via independent blocks
+* Random seeking with optional index
 * Stream EOF validation
 * Automatic stream size padding
 * Custom encoders for small blocks
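The feature list in the hunk above mentions separate block and streaming interfaces. As a rough Go sketch of block usage only: the `github.com/minio/minlz` import path, the `Encode`/`Decode` signatures, and the `LevelBalanced` constant are assumptions and may not match the actual package API.

```go
package main

import (
	"bytes"
	"fmt"
	"log"

	// Assumed import path; check the repository for the canonical one.
	"github.com/minio/minlz"
)

func main() {
	src := bytes.Repeat([]byte("hello, minlz block interface! "), 100)

	// Assumed block API: Encode compresses src as a single block at the
	// given level; a nil dst lets the encoder allocate the output.
	comp, err := minlz.Encode(nil, src, minlz.LevelBalanced)
	if err != nil {
		log.Fatal(err)
	}

	// Assumed block API: Decode decompresses a single block.
	dec, err := minlz.Decode(nil, comp)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("original=%d compressed=%d roundtrip ok=%v\n",
		len(src), len(comp), bytes.Equal(src, dec))
}
```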
@@ -356,12 +361,12 @@ but for completeness also test with `-tags=purego`.
 ## BLOCKS

 Individual block benchmarks should be considered carefully - and can be hard to generalize,
-since they tend to over-emphasize specific characteristics of the content.
+since they tend to overemphasize specific characteristics of the content.

 Therefore, it will be easy to find counter-examples to the benchmarks, where specific patterns suit a
 specific compressor better than others.
 We present a few examples from the [Snappy benchmark set](https://github.com/google/snappy/tree/main/testdata).
-As a benchmark this set has an over-emphasis on text files.
+As a benchmark, this set has an over-emphasis on text files.

 Blocks are compressed/decompress using 16 concurrent threads on an AMD Ryzen 9 3950X 16-Core Processor.
 Click below to see some sample benchmarks compared to Snappy and LZ4:
@@ -484,16 +489,20 @@ We encourage you to do your own testing with realistic blocks.

 You can use `λ mz c -block -bench=10 -verify -cpu=16 -1 file.ext` with our commandline tool to test speed of block encoding/decoding.

+### Visualizer
+
+You can visualize individual blocks using our [block visualizer](https://minlz.klauspost.com/).
+
 ## STREAMS

 For fair stream comparisons, we run each encoder at its maximum block size
-or max 4MB, while maintaining independent blocks where it is an option.
+or max 4MB, while maintaining independent blocks where it is an option.
 We use the concurrency offered by the package.

 This means there may be further speed/size tradeoffs possible for each,
-so experiment with fine tuning for your needs.
+so experiment with fine-tuning for your needs.

-Blocks are compressed/decompress using 16 core AMD Ryzen 9 3950X 16-Core Processor.
+Blocks are compressed/decompressed using 16 core AMD Ryzen 9 3950X 16-Core Processor.

 ### JSON Stream

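The stream comparisons in the hunk above rely on the concurrency offered by the package. As a rough Go sketch of streaming usage: `NewWriter`, `NewReader`, and the `WriterConcurrency` option are assumed names borrowed from similar Go compression packages and may differ from the real MinLZ API.

```go
package main

import (
	"bytes"
	"io"
	"log"

	// Assumed import path; check the repository for the canonical one.
	"github.com/minio/minlz"
)

func main() {
	// Some repetitive JSON-like input to compress as a stream.
	payload := bytes.Repeat([]byte("{\"key\":\"value\",\"n\":1}\n"), 10000)

	// Assumed stream API: NewWriter wraps an io.Writer; the
	// WriterConcurrency option name is hypothetical.
	var comp bytes.Buffer
	w := minlz.NewWriter(&comp, minlz.WriterConcurrency(16))
	if _, err := w.Write(payload); err != nil {
		log.Fatal(err)
	}
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}

	// Assumed stream API: NewReader wraps an io.Reader for decompression.
	r := minlz.NewReader(&comp)
	out, err := io.ReadAll(r)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("in=%d compressed=%d out=%d roundtrip ok=%v",
		len(payload), comp.Len(), len(out), bytes.Equal(payload, out))
}
```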
@@ -624,10 +633,10 @@ Source file: https://mattmahoney.net/dc/10gb.html

 Our conclusion is that the new compression algorithm provides a good compression increase,
 while retaining the ability to saturate pretty much any IO either with compression or
-decompression given a moderate amount of CPU cores.
+decompression given a moderate number of CPU cores.


-## Why is concurrent block and stream speed so different?
+## Why are concurrent block and stream speeds so different?

 In most cases, MinLZ will be limited by memory bandwidth.
