@fracai fracai commented Oct 15, 2023

I came across a reference to a "slicing-by-2" technique, but I haven't seen this actually implemented anywhere.

This PR implements that algorithm. Its speed slots right in between the single-byte and slicing-by-4 variants.
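For readers unfamiliar with the idea, here is a minimal sketch of what a slicing-by-2 CRC32 can look like. It is an illustration, not the code from this PR: the names `Crc32Tables` and `crc32_2bytes` are hypothetical, the polynomial is assumed to be the standard reflected 0xEDB88320, and input bytes are read one at a time so the sketch does not depend on endianness.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: builds the two 256-entry lookup tables once.
struct Crc32Tables {
    uint32_t t[2][256];
    Crc32Tables() {
        const uint32_t polynomial = 0xEDB88320u;  // assumed: standard reflected CRC32
        for (uint32_t i = 0; i < 256; ++i) {
            uint32_t crc = i;
            for (int bit = 0; bit < 8; ++bit)
                crc = (crc >> 1) ^ ((crc & 1u) ? polynomial : 0u);
            t[0][i] = crc;  // ordinary single-byte table
        }
        // t[1][i] is t[0][i] advanced by one additional zero byte.
        for (uint32_t i = 0; i < 256; ++i)
            t[1][i] = (t[0][i] >> 8) ^ t[0][t[0][i] & 0xFFu];
    }
};

// Hypothetical name: processes two input bytes per loop iteration.
uint32_t crc32_2bytes(const void* data, size_t length, uint32_t previousCrc32 = 0) {
    static const Crc32Tables tables;
    uint32_t crc = ~previousCrc32;
    const uint8_t* p = static_cast<const uint8_t*>(data);

    while (length >= 2) {
        // XOR two input bytes into the low 16 bits of the running CRC.
        crc ^= uint32_t(p[0]) | (uint32_t(p[1]) << 8);
        // The first byte has two shifts ahead of it, the second byte only one.
        crc = (crc >> 16) ^ tables.t[1][crc & 0xFFu] ^ tables.t[0][(crc >> 8) & 0xFFu];
        p += 2;
        length -= 2;
    }
    // Odd trailing byte: fall back to the single-byte step.
    if (length != 0)
        crc = (crc >> 8) ^ tables.t[0][(crc ^ *p) & 0xFFu];

    return ~crc;
}
```

Each iteration does two table lookups with no dependency between them, whereas the single-byte loop needs the result of the first lookup before it can start the second. That is the usual explanation for a speedup like the one in the numbers below.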

On my system:

$ ./Crc32Test 
Please wait ...
bitwise          : CRC=221F390F, 8.717s, 117.475 MB/s
half-byte        : CRC=221F390F, 5.740s, 178.384 MB/s
tableless (byte) : CRC=221F390F, 6.606s, 155.019 MB/s
tableless (byte2): CRC=221F390F, 9.292s, 110.207 MB/s
  1 byte  at once: CRC=221F390F, 2.707s, 378.304 MB/s
  2 bytes at once: CRC=221F390F, 1.687s, 606.836 MB/s
  4 bytes at once: CRC=221F390F, 0.969s, 1056.677 MB/s
  8 bytes at once: CRC=221F390F, 0.535s, 1914.827 MB/s
4x8 bytes at once: CRC=221F390F, 0.536s, 1909.030 MB/s
 16 bytes at once: CRC=221F390F, 0.281s, 3650.535 MB/s
 16 bytes at once: CRC=221F390F, 0.280s, 3651.705 MB/s (including prefetching)
    chunked      : CRC=221F390F, 0.281s, 3644.816 MB/s

I don't have a way to test the Arduino code, but it "looks right to me".
