@@ -227,7 +227,8 @@ the `WriteTo` functionality.
227227```
228228
229229` DecompressConcurrent ` has similar functionality to ` WriteTo `, but allows specifying the concurrency.
230- By default ` WriteTo ` uses ` runtime.NumCPU() ` concurrent decompressors.
230+ By default ` WriteTo ` uses ` runtime.NumCPU() ` concurrent decompressors, up to a maximum of 8.
231+ Besides offering higher throughput, ` DecompressConcurrent ` also makes input reads asynchronous.
231232
232233For memory-sensitive systems, the maximum block size can be set below 8MB. For this use the ` ReaderMaxBlockSize(int) `
233234option.
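
As a minimal sketch of the default described above, the snippet below caps the CPU count at 8. The helper name `defaultConcurrency` and the constant are hypothetical and are not part of the package API; the lower bound of 1 is likewise an assumption added for safety.

```go
package main

import (
	"fmt"
	"runtime"
)

// maxDefaultConcurrency mirrors the cap of 8 mentioned in the text.
// The constant name is illustrative, not taken from the package.
const maxDefaultConcurrency = 8

// defaultConcurrency is a hypothetical helper that returns the number of
// concurrent decompressors implied by "runtime.NumCPU() or at most 8".
func defaultConcurrency(numCPU int) int {
	if numCPU < 1 {
		// Assumption: never go below a single decompressor.
		return 1
	}
	if numCPU > maxDefaultConcurrency {
		return maxDefaultConcurrency
	}
	return numCPU
}

func main() {
	fmt.Println("default decompressors:", defaultConcurrency(runtime.NumCPU()))
}
```

On a 16-core machine this sketch yields 8, while a 4-core machine yields 4. Passing an explicit value to ` DecompressConcurrent ` is the way to go beyond the default.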
@@ -309,6 +310,14 @@ The following build tags can be used to control which speed improvements are use
309310
310311Using assembly/non-assembly versions will often produce slightly different output.
311312
313+ We will support the 2 releases prior to the current Go release version.
314+
315+ This package has been extensively fuzz tested to ensure that no data input can cause
316+ crashes or excessive memory usage.
317+
318+ When doing fuzz testing, use ` -tags=nounsafe ` . Non-assembly functions will also be tested,
319+ but for completeness also test with ` -tags=purego ` .
320+
312321# Performance
313322
314323## BLOCKS
@@ -324,90 +333,106 @@ As a benchmark this set has an over-emphasis on text files.
324333Blocks are compressed/decompressed using 16 concurrent threads on an AMD Ryzen 9 3950X 16-Core Processor.
325334Click below to see some sample benchmarks compared to Snappy and LZ4:
326335
336+ ### Protobuf Sample
337+
327338<details >
328- <summary >Protobuf (118,588 bytes input)</summary >
339+ <summary >Click To See Data + Charts (118,588 bytes input)</summary >
329340
330341| Compressor | Size | Comp MB/s | Decomp MB/s | Reduction % |
331- | --------------| --------| ----------- | -------------| -------------|
342+ | --------------| --------| ----------: | -------------| -------------|
332343| MinLZ 1 | 17,613 | 27,837 | 116,762 | 85.15% |
333344| MinLZ 1 (Go) | 17,479 | 22,036 | 61,652 | 85.26% |
334345| MinLZ 2 | 16,345 | 12,797 | 103,100 | 86.22% |
335346| MinLZ 2 (Go) | 16,345 | 9,732 | 52,964 | 86.22% |
336347| MinLZ 3 | 14,766 | 210 | 126,385 | 87.55% |
337348| MinLZ 3 (Go) | 14,766 | | 68,411 | 87.55% |
338349| Snappy | 23,335 | 24,052 | 61,002 | 80.32% |
339- | Snappy (Go) | 23,335 | | 35,699 | 80.32% |
350+ | Snappy (Go) | 23,335 | 10,055 | 35,699 | 80.32% |
340351| LZ4 0 | 18,766 | 12,649 | 137,553 | 84.18% |
341352| LZ4 0 (Go) | 18,766 | | 64,092 | 84.18% |
342353| LZ4 9 | 15,844 | 12,649 | 139,801 | 86.64% |
343354| LZ4 9 (Go) | 15,844 | | 66,904 | 86.64% |
344355
356+ ![ Compression vs Size] ( img/pb-block.png )
357+
345358Source file: https://github.com/google/snappy/blob/main/testdata/geo.protodata
346359
347360</details >
348361
362+ ### HTML Sample
363+
349364<details >
350- <summary >HTML (102,400 bytes input)</summary >
365+ <summary >Click To See Data + Charts (102,400 bytes input)</summary >
351366
352367| Compressor | Size | Comp MB/s | Decomp MB/s | Reduction % |
353- | --------------| --------| ----------- | -------------| -------------|
368+ | --------------| --------| ----------: | -------------| -------------|
354369| MinLZ 1 | 20,184 | 17,558 | 82,292 | 80.29% |
355370| MinLZ 1 (Go) | 19,849 | 15,035 | 32,327 | 80.62% |
356371| MinLZ 2 | 17,831 | 9,260 | 58,432 | 82.59% |
357372| MinLZ 2 (Go) | 17,831 | 7,524 | 25,728 | 82.59% |
358373| MinLZ 3 | 16,025 | 180 | 80,445 | 84.35% |
359374| MinLZ 3 (Go) | 16,025 | | 33,382 | 84.35% |
360375| Snappy | 22,843 | 17,469 | 44,765 | 77.69% |
361- | Snappy (Go) | 22,843 | | 21,082 | 77.69% |
376+ | Snappy (Go) | 22,843 | 8,161 | 21,082 | 77.69% |
362377| LZ4 0 | 21,216 | 9,452 | 101,490 | 79.28% |
363378| LZ4 0 (Go) | 21,216 | | 40,674 | 79.28% |
364379| LZ4 9 | 17,139 | 1,407 | 95,706 | 83.26% |
365380| LZ4 9 (Go) | 17,139 | | 39,709 | 83.26% |
366381
382+ ![ Compression vs Size] ( img/html-block.png )
383+
367384Source file: https://github.com/google/snappy/blob/main/testdata/html
368385
369386</details >
370387
388+ ### URL List Sample
389+
371390<details >
372- <summary >URLs. (702,087 bytes input)</summary >
391+ <summary >Click To See Data + Charts (702,087 bytes input)</summary >
373392
374393| Compressor | Size | Comp MB/s | Decomp MB/s | Reduction % |
375- | --------------| ---------| ----------- | -------------| -------------|
394+ | --------------| ---------| ----------: | -------------| -------------|
376395| MinLZ 1 | 268,803 | 9,774 | 30,961 | 61.71% |
377396| MinLZ 1 (Go) | 260,937 | 7,935 | 17,362 | 62.83% |
378397| MinLZ 2 | 230,280 | 5,197 | 26,871 | 67.20% |
379398| MinLZ 2 (Go) | 230,280 | 4,280 | 13,926 | 67.20% |
380399| MinLZ 3 | 207,303 | 226 | 28,716 | 70.47% |
381400| MinLZ 3 (Go) | 207,303 | | 15,256 | 70.47% |
382401| Snappy | 335,492 | 9,398 | 24,207 | 52.22% |
383- | Snappy (Go) | 335,492 | | 12,359 | 52.22% |
402+ | Snappy (Go) | 335,492 | 4,683 | 12,359 | 52.22% |
384403| LZ4 0 | 299,342 | 4,462 | 51,220 | 57.36% |
385404| LZ4 0 (Go) | 299,342 | | 23,242 | 57.36% |
386405| LZ4 9 | 252,182 | 638 | 45,295 | 64.08% |
387406| LZ4 9 (Go) | 252,182 | | 16,240 | 64.08% |
388407
408+ ![ Compression vs Size] ( img/urls-block.png )
409+
389410Source file: https://github.com/google/snappy/blob/main/testdata/urls.10K
390411
391412</details >
392413
414+ ### Serialized GEO data Sample
415+
393416<details >
394- <summary >Serialized binary. (184,320 bytes input)</summary >
417+ <summary >Click To See Data + Charts (184,320 bytes input)</summary >
395418
396419| Compressor | Size | Comp MB/s | Decomp MB/s | Reduction % |
397- | --------------| --------| ----------- | -------------| -------------|
420+ | --------------| --------| ----------: | -------------| -------------|
398421| MinLZ 1 | 63,595 | 8,319 | 26,170 | 65.50% |
399422| MinLZ 1 (Go) | 62,087 | 7,601 | 12,118 | 66.32% |
400423| MinLZ 2 | 54,688 | 5,932 | 24,688 | 70.33% |
401424| MinLZ 2 (Go) | 52,752 | 4,690 | 10,566 | 71.38% |
402425| MinLZ 3 | 46,002 | 230 | 28,083 | 75.04% |
403426| MinLZ 3 (Go) | 46,002 | | 12,877 | 75.04% |
404427| Snappy | 69,526 | 10,198 | 19,754 | 62.28% |
405- | Snappy (Go) | 69,526 | | 8,712 | 62.28% |
428+ | Snappy (Go) | 69,526 | 5,031 | 8,712 | 62.28% |
406429| LZ4 0 | 66,506 | 5,355 | 45,305 | 63.92% |
407430| LZ4 0 (Go) | 66,506 | | 15,757 | 63.92% |
408431| LZ4 9 | 50,439 | 88 | 52,877 | 72.64% |
409432| LZ4 9 (Go) | 50,439 | | 18,171 | 72.64% |
410433
434+ ![ Compression vs Size] ( img/geo-block.png )
435+
411436Source file: https://github.com/google/snappy/blob/main/testdata/kppkn.gtb
412437
413438</details >
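
The `Reduction %` column in the block tables above appears consistent with `(1 - compressed/input) * 100`. The sketch below reproduces a few table values under that assumption; the function names are illustrative only.

```go
package main

import "fmt"

// reductionPercent returns how much smaller the compressed output is
// relative to the input, as a percentage. This formula is inferred from
// the table values, not quoted from the benchmark tooling.
func reductionPercent(inputSize, compressedSize int64) float64 {
	if inputSize <= 0 {
		return 0
	}
	return (1 - float64(compressedSize)/float64(inputSize)) * 100
}

// formatReduction formats the reduction with two decimals, as in the tables.
func formatReduction(inputSize, compressedSize int64) string {
	return fmt.Sprintf("%.2f%%", reductionPercent(inputSize, compressedSize))
}

func main() {
	// Protobuf sample: 118,588 bytes input.
	fmt.Println("MinLZ 1:", formatReduction(118588, 17613))
	fmt.Println("Snappy: ", formatReduction(118588, 23335))
	// Serialized GEO sample: 184,320 bytes input.
	fmt.Println("MinLZ 3:", formatReduction(184320, 46002))
}
```

The same calculation can be applied to your own blocks when comparing against the numbers reported here.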
@@ -417,22 +442,160 @@ In overall terms, we typically observe that:
417442* The fastest mode typically beats LZ4 both in speed and output size.
418443* The fastest mode is typically equal to Snappy in speed, but significantly smaller.
419444* The "balanced" mode typically beats the best possible LZ4 compression, but much faster.
445+ * Without assembly, MinLZ is mostly the fastest option for compression.
420446* LZ4 is decompression speed king.
421447* Snappy decompression is usually slowest — especially without assembly.
422448
423449We encourage you to do your own testing with realistic blocks.
424450
425- You can use ` λ mz c -block -bench=10 -verify -cpu=16 -1 file.ext ` with our commandline tool.
451+ You can use ` λ mz c -block -bench=10 -verify -cpu=16 -1 file.ext ` with our command-line tool to test the speed of block encoding/decoding.
426452
427453## STREAMS
428454
429455For fair stream comparisons, we run each encoder at its maximum block size
430- while maintaining independent blocks where it is an option.
456+ or at most 4MB, while maintaining independent blocks where it is an option.
431457We use the concurrency offered by the package.
432458
433459This means there may be further speed/size tradeoffs possible for each,
434460so experiment with fine tuning for your needs.
435461
462+ Blocks are compressed/decompressed using a 16-core AMD Ryzen 9 3950X processor.
463+
464+ ### JSON Stream
465+
466+ <details >
467+ <summary >Click To See Data + Charts</summary >
468+
469+ Input Size: 6,273,951,764 bytes
470+
471+ | Compressor | Speed MiB/s | Size | Reduction | Dec MiB/s |
472+ | -------------| ------------:| --------------:| ----------:| ----------:|
473+ | MinLZ 1 | 14,921 | 974,656,419 | 84.47% | 3,204 |
474+ | MinLZ 2 | 8,877 | 901,171,279 | 85.64% | 3,028 |
475+ | MinLZ 3 | 576 | 742,067,802 | 88.17% | 3,835 |
476+ | S2 Default | 15,501 | 1,041,700,255 | 83.40% | 2,378 |
477+ | S2 Better | 9,334 | 944,872,699 | 84.94% | 2,300 |
478+ | S2 Best | 732 | 826,384,742 | 86.83% | 2,572 |
479+ | LZ4 Fastest | 5,860 | 1,274,297,625 | 79.69% | 2,680 |
480+ | LZ4 Best | 1,772 | 1,091,826,460 | 82.60% | 2,694 |
481+ | Snappy | 951 | 1,525,176,492 | 75.69% | 1,828 |
482+ | Gzip L5 | 236 | 938,015,731 | 85.05% | 557 |
483+
484+ ![ Compression vs Size] ( img/json-v1-comp.png )
485+ ![ Decompression Speed] ( img/json-v1-decomp.png )
486+
487+ Source file: https://files.klauspost.com/compress/github-june-2days-2019.json.zst
488+
489+ </details >
490+
491+ ### CSV Stream
492+
493+ <details >
494+ <summary >Click To See Data + Charts</summary >
495+
496+ Input Size: 3,325,605,752 bytes
497+
498+ | Compressor | Speed MiB/s | Size | Reduction |
499+ | ------------| -------------| ---------------| -----------|
500+ | MinLZ 1 | 9,193 | 937,136,278 | 72.07% |
501+ | MinLZ 2 | 6,158 | 775,823,904 | 77.13% |
502+ | MinLZ 3 | 338 | 657,162,410 | 80.66% |
503+ | S2 Default | 10,679 | 1,093,516,949 | 67.12% |
504+ | S2 Better | 6,394 | 884,711,436 | 73.40% |
505+ | S2 Best | 400 | 773,678,211 | 76.74% |
506+ | LZ4 Fast | 4,835 | 1,066,961,737 | 67.92% |
507+ | LZ4 Best | 732 | 903,598,068 | 72.83% |
508+ | Snappy | 553 | 1,316,042,016 | 60.43% |
509+ | Gzip L5 | 128 | 767,340,514 | 76.93% |
510+
511+ ![ Compression vs Size] ( img/csv-v1-comp.png )
512+
513+ Source file: https://files.klauspost.com/compress/nyc-taxi-data-10M.csv.zst
514+
515+ </details >
516+
517+ ### Log data
518+
519+ <details >
520+ <summary >Click To See Data + Charts</summary >
521+
522+ Input Size: 2,622,574,440 bytes
523+
524+ | Compressor | Speed MiB/s | Size | Reduction |
525+ | ------------| -------------| -------------| -----------|
526+ | MinLZ 1 | 17,014 | 194,361,157 | 92.59% |
527+ | MinLZ 2 | 12,696 | 174,819,425 | 93.33% |
528+ | MinLZ 3 | 1,351 | 139,449,942 | 94.68% |
529+ | S2 Default | 17,131 | 230,521,260 | 91.21% |
530+ | S2 Better | 12,632 | 217,884,566 | 91.69% |
531+ | S2 Best | 1,687 | 185,357,903 | 92.93% |
532+ | LZ4 Fast | 6,115 | 216,323,995 | 91.75% |
533+ | LZ4 Best | 2,704 | 169,447,971 | 93.54% |
534+ | Snappy | 1,987 | 290,116,961 | 88.94% |
535+ | Gzip L5 | 498 | 142,119,985 | 94.58% |
536+
537+ ![ Compression vs Size] ( img/logs-v1-comp.png )
538+
539+ Source file: https://files.klauspost.com/compress/apache.log.zst
540+
541+ </details >
542+
543+ ### Serialized Data
544+
545+ <details >
546+ <summary >Click To See Data + Charts</summary >
547+
548+ Input Size: 1,862,623,243 bytes
549+
550+ | Compressor | Speed MiB/s | Size | Reduction |
551+ | ------------| -------------| -------------| -----------|
552+ | MinLZ 1 | 10,701 | 604,315,773 | 67.56% |
553+ | MinLZ 2 | 5,712 | 517,472,464 | 72.22% |
554+ | MinLZ 3 | 250 | 480,707,192 | 74.19% |
555+ | S2 Default | 12,167 | 623,832,101 | 66.51% |
556+ | S2 Better | 5,712 | 568,441,654 | 69.48% |
557+ | S2 Best | 324 | 553,965,705 | 70.26% |
558+ | LZ4 Fast | 5,090 | 618,174,538 | 66.81% |
559+ | LZ4 Best | 617 | 552,015,243 | 70.36% |
560+ | Snappy | 929 | 589,837,541 | 68.33% |
561+ | Gzip L5 | 166 | 434,950,800 | 76.65% |
562+
563+ ![ Compression vs Size] ( img/msgp-v1-comp.png )
564+
565+ Source file: https://files.klauspost.com/compress/github-ranks-backup.bin.zst
566+
567+ </details >
568+
569+ ### Backup (Mixed) Data
570+
571+ <details >
572+ <summary >Click To See Data + Charts</summary >
573+
574+ Input Size: 10,065,157,632 bytes
575+
576+ | Compressor | Speed MiB/s | Size | Reduction |
577+ | -------------| -------------| ---------------| -----------|
578+ | MinLZ 1 | 9,356 | 5,859,748,636 | 41.78% |
579+ | MinLZ 2 | 5,321 | 5,256,474,340 | 47.78% |
580+ | MinLZ 3 | 259 | 4,855,930,368 | 51.76% |
581+ | S2 Default | 10,083 | 5,915,541,066 | 41.23% |
582+ | S2 Better | 5,731 | 5,455,008,813 | 45.80% |
583+ | S2 Best | 319 | 5,192,490,222 | 48.41% |
584+ | LZ4 Fastest | 5,065 | 5,850,848,099 | 41.87% |
585+ | LZ4 Best | 287 | 5,348,127,708 | 46.86% |
586+ | Snappy | 732 | 6,056,946,612 | 39.82% |
587+ | Gzip L5 | 171 | 4,916,436,115 | 51.15% |
588+
589+ ![ Compression vs Size] ( img/10gb-v1-comp.png )
590+
591+ Source file: https://mattmahoney.net/dc/10gb.html
592+
593+ </details >
594+
595+ Our conclusion is that the new compression algorithm provides a good compression increase,
596+ while retaining the ability to saturate pretty much any IO either with compression or
597+ decompression given a moderate number of CPU cores.
598+
436599
437600## Why is concurrent block and stream speed so different?
438601
@@ -484,6 +647,9 @@ Speed indications are base 10.
484647
485648### Compressing
486649
650+ <details >
651+ <summary >Click To See Compression Help</summary >
652+
487653```
488654Usage: mz c [options] <input>
489655
@@ -534,9 +700,13 @@ Example:
534700λ mz c apache.log
535701Compressing apache.log -> apache.log.mz 2622574440 -> 170960982 [6.52%]; 4155.2MB/s
536702```
703+ </details >
537704
539706### Decompressing
539706
707+ <details >
708+ <summary >Click To Decompression Help</summary >
709+
540710```
541711Usage: mz d [options] <input>
542712
@@ -583,6 +753,7 @@ Example:
583753λ mz d apache.log.mz
584754Decompressing apache.log.mz -> apache.log 170960982 -> 2622574440 [1534.02%]; 2660.2MB/s
585755```
756+ </details >
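
The bracketed percentage in the example output above appears to be the output size relative to the input size. The sketch below reproduces both example values under that assumption; `ratioPercent` is a hypothetical helper and not part of the `mz` tool.

```go
package main

import "fmt"

// ratioPercent returns the output size as a percentage of the input size,
// formatted with two decimals. The formula is inferred from the example
// output, not taken from the tool's source.
func ratioPercent(inBytes, outBytes int64) string {
	return fmt.Sprintf("%.2f%%", float64(outBytes)/float64(inBytes)*100)
}

func main() {
	// Compressing apache.log -> apache.log.mz
	fmt.Println("compress:  ", ratioPercent(2622574440, 170960982))
	// Decompressing apache.log.mz -> apache.log
	fmt.Println("decompress:", ratioPercent(170960982, 2622574440))
}
```

A value below 100% therefore indicates the output shrank, while a value above 100% is expected when decompressing.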
586757
587758Tail, Offset and Limit can be made to forward to the next newline by adding ` +nl ` .
588759