Proposal: API Extension #715
Arm Cortex-A76 (Raspberry Pi 5) benchmarks
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 30178cycles | 29432cycles | 1.03 | 
| ML-KEM-512 encaps | 34726cycles | 34675cycles | 1.00 | 
| ML-KEM-512 decaps | 45361cycles | 45262cycles | 1.00 | 
| ML-KEM-768 keypair | 51464cycles | 50035cycles | 1.03 | 
| ML-KEM-768 encaps | 55213cycles | 55249cycles | 1.00 | 
| ML-KEM-768 decaps | 70250cycles | 70288cycles | 1.00 | 
| ML-KEM-1024 keypair | 75244cycles | 73083cycles | 1.03 | 
| ML-KEM-1024 encaps | 81519cycles | 81542cycles | 1.00 | 
| ML-KEM-1024 decaps | 101972cycles | 101561cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 9635cycles | 9480cycles | 1.02 | 
| ML-KEM-512 encaps | 11273cycles | 11097cycles | 1.02 | 
| ML-KEM-512 decaps | 15350cycles | 15155cycles | 1.01 | 
| ML-KEM-768 keypair | 16605cycles | 16336cycles | 1.02 | 
| ML-KEM-768 encaps | 17685cycles | 17796cycles | 0.99 | 
| ML-KEM-768 decaps | 24202cycles | 23519cycles | 1.03 | 
| ML-KEM-1024 keypair | 22373cycles | 21866cycles | 1.02 | 
| ML-KEM-1024 encaps | 24761cycles | 23887cycles | 1.04 | 
| ML-KEM-1024 decaps | 31836cycles | 31557cycles | 1.01 | 
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 4th gen (c7i) (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 28986cycles | 28932cycles | 1.00 | 
| ML-KEM-512 encaps | 34717cycles | 34881cycles | 1.00 | 
| ML-KEM-512 decaps | 43691cycles | 44422cycles | 0.98 | 
| ML-KEM-768 keypair | 49686cycles | 47657cycles | 1.04 | 
| ML-KEM-768 encaps | 55859cycles | 55635cycles | 1.00 | 
| ML-KEM-768 decaps | 69030cycles | 67697cycles | 1.02 | 
| ML-KEM-1024 keypair | 72226cycles | 73141cycles | 0.99 | 
| ML-KEM-1024 encaps | 83760cycles | 85205cycles | 0.98 | 
| ML-KEM-1024 decaps | 100951cycles | 99037cycles | 1.02 | 
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 17504cycles | 17184cycles | 1.02 | 
| ML-KEM-512 encaps | 19545cycles | 18937cycles | 1.03 | 
| ML-KEM-512 decaps | 24487cycles | 24421cycles | 1.00 | 
| ML-KEM-768 keypair | 30100cycles | 29483cycles | 1.02 | 
| ML-KEM-768 encaps | 30900cycles | 30780cycles | 1.00 | 
| ML-KEM-768 decaps | 39086cycles | 39135cycles | 1.00 | 
| ML-KEM-1024 keypair | 44071cycles | 42719cycles | 1.03 | 
| ML-KEM-1024 encaps | 45517cycles | 45712cycles | 1.00 | 
| ML-KEM-1024 decaps | 55994cycles | 56005cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 16515cycles | 16153cycles | 1.02 | 
| ML-KEM-512 encaps | 18678cycles | 18263cycles | 1.02 | 
| ML-KEM-512 decaps | 25196cycles | 24767cycles | 1.02 | 
| ML-KEM-768 keypair | 28339cycles | 27740cycles | 1.02 | 
| ML-KEM-768 encaps | 29361cycles | 29445cycles | 1.00 | 
| ML-KEM-768 decaps | 39061cycles | 39006cycles | 1.00 | 
| ML-KEM-1024 keypair | 38376cycles | 37549cycles | 1.02 | 
| ML-KEM-1024 encaps | 40710cycles | 40557cycles | 1.00 | 
| ML-KEM-1024 decaps | 53418cycles | 53094cycles | 1.01 | 
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 11957cycles | 11570cycles | 1.03 | 
| ML-KEM-512 encaps | 13360cycles | 13356cycles | 1.00 | 
| ML-KEM-512 decaps | 18521cycles | 18228cycles | 1.02 | 
| ML-KEM-768 keypair | 20659cycles | 20133cycles | 1.03 | 
| ML-KEM-768 encaps | 21095cycles | 21112cycles | 1.00 | 
| ML-KEM-768 decaps | 28368cycles | 28834cycles | 0.98 | 
| ML-KEM-1024 keypair | 28059cycles | 27036cycles | 1.04 | 
| ML-KEM-1024 encaps | 29274cycles | 29198cycles | 1.00 | 
| ML-KEM-1024 decaps | 38949cycles | 38762cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 3rd gen (c6a) (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 39633cycles | 38861cycles | 1.02 | 
| ML-KEM-512 encaps | 47059cycles | 47222cycles | 1.00 | 
| ML-KEM-512 decaps | 60529cycles | 60947cycles | 0.99 | 
| ML-KEM-768 keypair | 63789cycles | 63092cycles | 1.01 | 
| ML-KEM-768 encaps | 73659cycles | 73687cycles | 1.00 | 
| ML-KEM-768 decaps | 91199cycles | 91253cycles | 1.00 | 
| ML-KEM-1024 keypair | 95964cycles | 94519cycles | 1.02 | 
| ML-KEM-1024 encaps | 107920cycles | 108285cycles | 1.00 | 
| ML-KEM-1024 decaps | 131122cycles | 131266cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Graviton4
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 18313cycles | 17773cycles | 1.03 | 
| ML-KEM-512 encaps | 21134cycles | 21031cycles | 1.00 | 
| ML-KEM-512 decaps | 27657cycles | 27698cycles | 1.00 | 
| ML-KEM-768 keypair | 31618cycles | 30678cycles | 1.03 | 
| ML-KEM-768 encaps | 33609cycles | 33530cycles | 1.00 | 
| ML-KEM-768 decaps | 43097cycles | 43120cycles | 1.00 | 
| ML-KEM-1024 keypair | 45922cycles | 44326cycles | 1.04 | 
| ML-KEM-1024 encaps | 49668cycles | 49592cycles | 1.00 | 
| ML-KEM-1024 decaps | 62598cycles | 62596cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Intel Xeon 3rd gen (c6i) (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 46363cycles | 46301cycles | 1.00 | 
| ML-KEM-512 encaps | 54353cycles | 54609cycles | 1.00 | 
| ML-KEM-512 decaps | 69621cycles | 69950cycles | 1.00 | 
| ML-KEM-768 keypair | 76805cycles | 75183cycles | 1.02 | 
| ML-KEM-768 encaps | 86887cycles | 86323cycles | 1.01 | 
| ML-KEM-768 decaps | 107145cycles | 106227cycles | 1.01 | 
| ML-KEM-1024 keypair | 112712cycles | 110866cycles | 1.02 | 
| ML-KEM-1024 encaps | 125664cycles | 124926cycles | 1.01 | 
| ML-KEM-1024 decaps | 151403cycles | 150400cycles | 1.01 | 
This comment was automatically generated by workflow using github-action-benchmark.
AMD EPYC 4th gen (c7a) (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 36373cycles | 35953cycles | 1.01 | 
| ML-KEM-512 encaps | 42749cycles | 42696cycles | 1.00 | 
| ML-KEM-512 decaps | 55716cycles | 55600cycles | 1.00 | 
| ML-KEM-768 keypair | 59436cycles | 59291cycles | 1.00 | 
| ML-KEM-768 encaps | 67600cycles | 67893cycles | 1.00 | 
| ML-KEM-768 decaps | 84768cycles | 85108cycles | 1.00 | 
| ML-KEM-1024 keypair | 88410cycles | 87394cycles | 1.01 | 
| ML-KEM-1024 encaps | 99840cycles | 99602cycles | 1.00 | 
| ML-KEM-1024 decaps | 121506cycles | 121043cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Graviton2
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 30161cycles | 29388cycles | 1.03 | 
| ML-KEM-512 encaps | 34693cycles | 34636cycles | 1.00 | 
| ML-KEM-512 decaps | 45368cycles | 45238cycles | 1.00 | 
| ML-KEM-768 keypair | 51419cycles | 50110cycles | 1.03 | 
| ML-KEM-768 encaps | 55248cycles | 55310cycles | 1.00 | 
| ML-KEM-768 decaps | 70313cycles | 70193cycles | 1.00 | 
| ML-KEM-1024 keypair | 75179cycles | 73113cycles | 1.03 | 
| ML-KEM-1024 encaps | 81509cycles | 81550cycles | 1.00 | 
| ML-KEM-1024 decaps | 101964cycles | 101575cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Graviton4 (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 36286cycles | 36997cycles | 0.98 | 
| ML-KEM-512 encaps | 42322cycles | 41062cycles | 1.03 | 
| ML-KEM-512 decaps | 52117cycles | 52113cycles | 1.00 | 
| ML-KEM-768 keypair | 60664cycles | 59560cycles | 1.02 | 
| ML-KEM-768 encaps | 66986cycles | 67378cycles | 0.99 | 
| ML-KEM-768 decaps | 81104cycles | 81159cycles | 1.00 | 
| ML-KEM-1024 keypair | 90329cycles | 88492cycles | 1.02 | 
| ML-KEM-1024 encaps | 98610cycles | 98647cycles | 1.00 | 
| ML-KEM-1024 decaps | 117412cycles | 117472cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Graviton2 (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 62258cycles | 59967cycles | 1.04 | 
| ML-KEM-512 encaps | 69660cycles | 67878cycles | 1.03 | 
| ML-KEM-512 decaps | 88983cycles | 86649cycles | 1.03 | 
| ML-KEM-768 keypair | 100123cycles | 98445cycles | 1.02 | 
| ML-KEM-768 encaps | 110013cycles | 110144cycles | 1.00 | 
| ML-KEM-768 decaps | 134805cycles | 134710cycles | 1.00 | 
| ML-KEM-1024 keypair | 151049cycles | 146744cycles | 1.03 | 
| ML-KEM-1024 encaps | 164226cycles | 162656cycles | 1.01 | 
| ML-KEM-1024 decaps | 196168cycles | 194511cycles | 1.01 | 
This comment was automatically generated by workflow using github-action-benchmark.
Graviton3
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 19370cycles | 18891cycles | 1.03 | 
| ML-KEM-512 encaps | 22352cycles | 22381cycles | 1.00 | 
| ML-KEM-512 decaps | 29622cycles | 29596cycles | 1.00 | 
| ML-KEM-768 keypair | 33266cycles | 32333cycles | 1.03 | 
| ML-KEM-768 encaps | 35787cycles | 35759cycles | 1.00 | 
| ML-KEM-768 decaps | 46150cycles | 46117cycles | 1.00 | 
| ML-KEM-1024 keypair | 47840cycles | 46414cycles | 1.03 | 
| ML-KEM-1024 encaps | 51996cycles | 52074cycles | 1.00 | 
| ML-KEM-1024 decaps | 65945cycles | 65931cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Graviton3 (no-opt)
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 39352cycles | 39577cycles | 0.99 | 
| ML-KEM-512 encaps | 46079cycles | 44688cycles | 1.03 | 
| ML-KEM-512 decaps | 56791cycles | 56440cycles | 1.01 | 
| ML-KEM-768 keypair | 65247cycles | 64260cycles | 1.02 | 
| ML-KEM-768 encaps | 71693cycles | 72864cycles | 0.98 | 
| ML-KEM-768 decaps | 87621cycles | 87845cycles | 1.00 | 
| ML-KEM-1024 keypair | 97088cycles | 95809cycles | 1.01 | 
| ML-KEM-1024 encaps | 106917cycles | 106649cycles | 1.00 | 
| ML-KEM-1024 decaps | 126962cycles | 127051cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
Bananapi bpi-f3 benchmarks
| Benchmark suite | Current: 4c5b57e | Previous: 3dc9642 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 329727cycles | 331334cycles | 1.00 | 
| ML-KEM-512 encaps | 308683cycles | 439896cycles | 0.70 | 
| ML-KEM-512 decaps | 472974cycles | 588436cycles | 0.80 | 
| ML-KEM-768 keypair | 546924cycles | 548599cycles | 1.00 | 
| ML-KEM-768 encaps | 433011cycles | 688259cycles | 0.63 | 
| ML-KEM-768 decaps | 648468cycles | 880050cycles | 0.74 | 
| ML-KEM-1024 keypair | 812647cycles | 814706cycles | 1.00 | 
| ML-KEM-1024 encaps | 564190cycles | 988517cycles | 0.57 | 
| ML-KEM-1024 decaps | 830107cycles | 1222676cycles | 0.68 | 
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 61875cycles | 59335cycles | 1.04 | 
| ML-KEM-512 encaps | 67252cycles | 66971cycles | 1.00 | 
| ML-KEM-512 decaps | 85958cycles | 86021cycles | 1.00 | 
| ML-KEM-768 keypair | 108779cycles | 101262cycles | 1.07 | 
| ML-KEM-768 encaps | 112490cycles | 112143cycles | 1.00 | 
| ML-KEM-768 decaps | 139101cycles | 139861cycles | 0.99 | 
| ML-KEM-1024 keypair | 163183cycles | 152966cycles | 1.07 | 
| ML-KEM-1024 encaps | 171680cycles | 174266cycles | 0.99 | 
| ML-KEM-1024 decaps | 209538cycles | 210791cycles | 0.99 | 
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A72 (Raspberry Pi 4) benchmarks
| Benchmark suite | Current: 3898888 | Previous: 3756ba9 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 54741cycles | 53509cycles | 1.02 | 
| ML-KEM-512 encaps | 61491cycles | 61512cycles | 1.00 | 
| ML-KEM-512 decaps | 78019cycles | 77960cycles | 1.00 | 
| ML-KEM-768 keypair | 93583cycles | 90628cycles | 1.03 | 
| ML-KEM-768 encaps | 98001cycles | 98387cycles | 1.00 | 
| ML-KEM-768 decaps | 122484cycles | 122120cycles | 1.00 | 
| ML-KEM-1024 keypair | 142001cycles | 135148cycles | 1.05 | 
| ML-KEM-1024 encaps | 147592cycles | 148348cycles | 0.99 | 
| ML-KEM-1024 decaps | 180978cycles | 181704cycles | 1.00 | 
This comment was automatically generated by workflow using github-action-benchmark.
@hanno-becker - any thoughts on this?
@mkannwischer What stands in the way of keeping the old API and providing the new one as an (optional) addition?
Force-pushed from 3ca90c7 to 1f56ed1.
This is going in a good direction I think. You may defer updating mlkem_native.h for the time being, keeping that to the core API, and using kem.h directly from the test exercising the new API.
Force-pushed from 0ab73bb to c1e1a00.
Signed-off-by: Matthias J. Kannwischer <[email protected]>
    #define mlk_indcpa_secret_key MLK_NAMESPACE_K(mlk_indcpa_secret_key)
    typedef struct
    {
      mlk_polyvec skpv;
Have you considered caching the mulcache? This would be useful in decapsulation.
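For context, here is a minimal sketch of what a mulcache saves, assuming plain modular arithmetic instead of the Montgomery arithmetic a real implementation would use. In the NTT domain, ML-KEM multiplies degree-1 polynomials modulo X^2 - zeta, and the product b1 * zeta depends on only one operand, so it can be computed once and reused. The names `basemul_plain`, `mulcache_compute` and `basemul_cached` are hypothetical and are not the functions in this PR.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_Q 3329 /* ML-KEM modulus */

/* (a0 + a1*X) * (b0 + b1*X) mod (X^2 - zeta), coefficients mod TOY_Q.
 * Inputs are assumed to lie in [0, TOY_Q). Recomputes b1*zeta every call. */
static void basemul_plain(int32_t r[2], const int32_t a[2],
                          const int32_t b[2], int32_t zeta)
{
  int32_t b1zeta = (b[1] * zeta) % TOY_Q;
  r[0] = (a[0] * b[0] + a[1] * b1zeta) % TOY_Q;
  r[1] = (a[0] * b[1] + a[1] * b[0]) % TOY_Q;
}

/* Hypothetical mulcache entry: the one product that depends only on b. */
static int32_t mulcache_compute(const int32_t b[2], int32_t zeta)
{
  return (b[1] * zeta) % TOY_Q;
}

/* Same product as basemul_plain, but reuses the precomputed b1*zeta. */
static void basemul_cached(int32_t r[2], const int32_t a[2],
                           const int32_t b[2], int32_t b_cache)
{
  r[0] = (a[0] * b[0] + a[1] * b_cache) % TOY_Q;
  r[1] = (a[0] * b[1] + a[1] * b[0]) % TOY_Q;
}
```

If the cache were stored next to `skpv`, decapsulation could skip the `mulcache_compute` step on every call. Whether that justifies the extra memory is a question for the benchmarks.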
     * - We use a different implementation of `gen_matrix()` which
     *   uses x4-batched Keccak-f1600 (see `mlk_gen_matrix()` above).
     * - We use a mulcache to speed up matrix-vector multiplication.
     * - We include buffer zeroization.
All `Reference:` comments need extending.
    MLK_CT_TESTING_DECLASSIFY(publicseed, MLKEM_SYMBYTES);

    mlk_gen_matrix(a, publicseed, 0 /* no transpose */);
    mlk_gen_matrix(pk->at, publicseed, 0 /* no transpose */);
Can you leave a comment that the matrix will be transposed later? Otherwise, it's confusing why the variable is named at.
    mlk_polyvec_add(pkpv, e);
    mlk_polyvec_reduce(pkpv);
    mlk_polyvec_reduce(skpv);
    mlk_transpose_matrix(pk->at);
Can you leave a comment explaining why the matrix is transposed at this point?
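To make the operation under discussion concrete, here is a minimal sketch of an in-place transpose on a flat, row-major K x K matrix. It assumes `mlk_polymat` is an array of `MLKEM_K * MLKEM_K` polynomials, which the `pk->at[x]` contract suggests but this thread does not confirm. The `toy_` types and sizes are illustrative stand-ins, not the PR's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_K 3 /* stand-in for MLKEM_K */
#define TOY_N 4 /* stand-in for MLKEM_N, kept tiny for the example */

typedef struct
{
  int16_t coeffs[TOY_N];
} toy_poly;

/* Assumed layout: entry (i, j) lives at index i * TOY_K + j. */
typedef toy_poly toy_polymat[TOY_K * TOY_K];

/* Swap every entry above the diagonal with its mirror below it. */
static void toy_transpose_matrix(toy_polymat a)
{
  unsigned i, j;
  for (i = 0; i < TOY_K; i++)
  {
    for (j = i + 1; j < TOY_K; j++)
    {
      toy_poly t = a[i * TOY_K + j];
      a[i * TOY_K + j] = a[j * TOY_K + i];
      a[j * TOY_K + i] = t;
    }
  }
}
```

In ML-KEM, key generation multiplies by the matrix and encapsulation multiplies by its transpose, which is presumably why the cached public key ends up holding `at`.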
    requires(memory_no_alias(seed, MLKEM_SYMBYTES))
    requires(transposed == 0 || transposed == 1)
    assigns(object_whole(a))
    assigns(memory_slice(a, sizeof(mlk_poly) * MLKEM_K * MLKEM_K))
Nit: sizeof(mlk_polymat)?
    requires(forall(k0, 0, MLKEM_K,
      array_bound(pk->pkpv[k0].coeffs, 0, MLKEM_N, 0, MLKEM_UINT12_LIMIT)))
    requires(forall(x, 0, MLKEM_K * MLKEM_K,
      array_bound(pk->at[x].coeffs, 0, MLKEM_N, 0, MLKEM_Q)))
| array_bound(pk->at[x].coeffs, 0, MLKEM_N, 0, MLKEM_Q))) | |
| array_bound(pk->at[x].coeffs, 0, MLKEM_N, 0, MLKEM_Q))) | 
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-1024 encaps | 24761cycles | 23887cycles | 1.04 | 
This comment was automatically generated by workflow using github-action-benchmark.
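For reading the alerts, here is a minimal sketch of the check they appear to apply. It assumes the comparison uses the unrounded ratio current / previous with a strict greater-than against the threshold. That assumption is inferred from the tables in this thread (rows displayed as 1.03 are flagged in some suites and not in others), not taken from the tool's documentation. The function names are hypothetical.

```c
#include <assert.h>

/* Ratio of the new cycle count to the old one; above 1.0 means slower. */
static double bench_ratio(unsigned long current, unsigned long previous)
{
  return (double)current / (double)previous;
}

/* Assumed rule: flag when the unrounded ratio exceeds the threshold. */
static int bench_is_regression(unsigned long current, unsigned long previous,
                               double threshold)
{
  return bench_ratio(current, previous) > threshold;
}
```

With a threshold of 1.03, `bench_is_regression(24761, 23887, 1.03)` flags the ML-KEM-1024 encaps row above. The Cortex-A76 ML-KEM-512 keypair row (30178 vs. 29432 cycles) is also displayed as 1.03 but stays just below the threshold.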
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Intel Xeon 4th gen (c7i) (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-768 keypair | 49686cycles | 47657cycles | 1.04 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'AMD EPYC 4th gen (c7a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 11957cycles | 11570cycles | 1.03 | 
| ML-KEM-1024 keypair | 28059cycles | 27036cycles | 1.04 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'AMD EPYC 3rd gen (c6a)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 encaps | 19545cycles | 18937cycles | 1.03 | 
| ML-KEM-1024 keypair | 44071cycles | 42719cycles | 1.03 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Graviton4'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 18313cycles | 17773cycles | 1.03 | 
| ML-KEM-768 keypair | 31618cycles | 30678cycles | 1.03 | 
| ML-KEM-1024 keypair | 45922cycles | 44326cycles | 1.04 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Graviton3'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-1024 keypair | 47840cycles | 46414cycles | 1.03 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Arm Cortex-A55 (Snapdragon 888) benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 61875cycles | 59335cycles | 1.04 | 
| ML-KEM-768 keypair | 108779cycles | 101262cycles | 1.07 | 
| ML-KEM-1024 keypair | 163183cycles | 152966cycles | 1.07 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Graviton4 (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 encaps | 42322cycles | 41062cycles | 1.03 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Graviton3 (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 encaps | 46079cycles | 44688cycles | 1.03 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Graviton2 (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 62258cycles | 59967cycles | 1.04 | 
This comment was automatically generated by workflow using github-action-benchmark.
    ensures(array_bound(data, 0, MLKEM_N, 0, MLKEM_Q))) { ((void)data); }
    #endif /* !MLK_USE_NATIVE_NTT_CUSTOM_ORDER */

    static void mlk_transpose_matrix(mlk_polymat a)
Can you add this to the component benchmarks so we get a sense of the performance cost?
Mac Mini (M1, 2020) benchmarks
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 12686cycles | 12245cycles | 1.04 | 
| ML-KEM-512 encaps | 14877cycles | 15104cycles | 0.98 | 
| ML-KEM-512 decaps | 19629cycles | 19571cycles | 1.00 | 
| ML-KEM-768 keypair | 21872cycles | 21036cycles | 1.04 | 
| ML-KEM-768 encaps | 23731cycles | 23653cycles | 1.00 | 
| ML-KEM-768 decaps | 30584cycles | 30519cycles | 1.00 | 
| ML-KEM-1024 keypair | 31317cycles | 29972cycles | 1.04 | 
| ML-KEM-1024 encaps | 34603cycles | 34477cycles | 1.00 | 
| ML-KEM-1024 decaps | 43472cycles | 43694cycles | 0.99 | 
This comment was automatically generated by workflow using github-action-benchmark.
⚠️  Performance Alert ⚠️ 
Possible performance regression was detected for benchmark 'Mac Mini (M1, 2020) benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
| Benchmark suite | Current: 7875c83 | Previous: ae2afe5 | Ratio | 
|---|---|---|---|
| ML-KEM-512 keypair | 12686cycles | 12245cycles | 1.04 | 
| ML-KEM-768 keypair | 21872cycles | 21036cycles | 1.04 | 
| ML-KEM-1024 keypair | 31317cycles | 29972cycles | 1.04 | 
This comment was automatically generated by workflow using github-action-benchmark.
     * copy over indcpa pk and H(pk) to public key
     * pk is NULL during parsing before decaps as the pk is not needed
     **/
    if (pk != NULL)
The CBMC contract says pk must be valid.
This PR rebases an old proposal for extending the API to operate on validated and expanded keys. Serialization and deserialization are split out into separate functions, which also contain the input validation.
The benefit is much better performance whenever the keys can be kept in expanded form. This is particularly useful for ephemeral use cases (e.g., TLS), where the secret key never has to leave memory.
The benchmarks below show why we should consider doing this: decapsulation gets up to 3x faster if the expanded secret key from key generation can be cached. (Encapsulation gets up to 5x faster with caching, but I don't think this is useful for any major use case of ML-KEM.)
See pq-code-package/tsc#4 (comment) for the overview
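To illustrate the shape of the split described above, here is a toy sketch. It is not ML-KEM and not the API in this PR: every `toy_` name, the 4-byte key format and the arithmetic are hypothetical, chosen only to show parsing and validation happening once while the operation runs on the expanded form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SK_BYTES 4
#define TOY_Q 3329

/* Hypothetical expanded key: validated coefficients plus derived data. */
typedef struct
{
  uint16_t coeffs[TOY_SK_BYTES / 2];
  uint32_t cache; /* stand-in for precomputed data such as a mulcache */
} toy_sk_expanded;

static unsigned toy_parse_calls = 0; /* instrumentation for the demo only */

/* Deserialize and validate. Returns 0 on success, -1 on invalid input. */
static int toy_sk_parse(toy_sk_expanded *sk, const uint8_t bytes[TOY_SK_BYTES])
{
  size_t i;
  toy_parse_calls++;
  for (i = 0; i < TOY_SK_BYTES / 2; i++)
  {
    uint16_t c = (uint16_t)(bytes[2 * i] | ((uint16_t)bytes[2 * i + 1] << 8));
    if (c >= TOY_Q)
    {
      return -1; /* coefficient out of range: reject the key */
    }
    sk->coeffs[i] = c;
  }
  sk->cache = ((uint32_t)sk->coeffs[0] * sk->coeffs[1]) % TOY_Q;
  return 0;
}

/* Works on the expanded key: no parsing and no validation on this path. */
static uint32_t toy_decaps_expanded(const toy_sk_expanded *sk, uint32_t ct)
{
  return (ct + sk->cache) % TOY_Q;
}

/* Byte-oriented wrapper in the style of the existing API: parses each call. */
static int toy_decaps(uint32_t *out, const uint8_t bytes[TOY_SK_BYTES],
                      uint32_t ct)
{
  toy_sk_expanded sk;
  if (toy_sk_parse(&sk, bytes) != 0)
  {
    return -1;
  }
  *out = toy_decaps_expanded(&sk, ct);
  return 0;
}
```

An ephemeral caller would run `toy_sk_parse` once and then call `toy_decaps_expanded` repeatedly. The byte-oriented wrapper stays available for callers that store serialized keys, which matches the suggestion in this thread to keep the old API and offer the new one as an optional addition.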