aws · jakemas · Dec 12, 2025
@@ -0,0 +1,5 @@
+name: mldsa-native
+source: pq-code-package/mldsa-native.git
+branch: main
+commit: b61e84f0c73d4ed612ffcaea4282a9d682de3f46
+imported-at: 2026-01-16T13:12:01-0800
@@ -0,0 +1,149 @@
+# ML-DSA
+
+The source code in this directory implements ML-DSA as defined in
+the [FIPS 204 Module-Lattice-Based Digital Signature Standard](https://csrc.nist.gov/pubs/fips/204/final).
+It is imported from [mldsa-native](https://github.com/pq-code-package/mldsa-native)
+using [importer.sh](importer.sh); see [META.yml](META.yml) for import details.
+
+## Running the importer
+
+To re-run the importer, do
+
+```bash
+rm -rf mldsa # Remove old mldsa source
+./importer.sh
+```
+
+By default, the importer will not run if [mldsa](mldsa) already/still exists. To force removal of any existing [mldsa](mldsa), use `./importer.sh --force`.
+
+The repository and branch to be used for the import can be configured through the environment variables `GITHUB_REPOSITORY` and `GITHUB_SHA`, respectively. The default is equivalent to
+
+```bash
+GITHUB_REPOSITORY=pq-code-package/mldsa-native.git GITHUB_SHA=main ./importer.sh
+```
+
+That is, by default importer.sh will clone and install the latest [main](https://github.com/pq-code-package/mldsa-native/tree/main) of mldsa-native.
+
+After a successful import, [META.yml](META.yml) will reflect the source, branch, commit and timestamp of the import.
+
+### Import Scope
+
+mldsa-native has a C-only version as well as native 'backends' in AVX2 and
+Neon for high performance. At present, [importer.sh](importer.sh) imports only
+the C-only version.
+
+mldsa-native offers its own FIPS-202 implementation, including fast
+versions of batched FIPS-202. [importer.sh](importer.sh) does _not_ import those.
+Instead, glue-code around AWS-LC's own FIPS-202 implementation is provided in
+[fips202_glue.h](fips202_glue.h) and [fips202x4_glue.h](fips202x4_glue.h).
+
+## Configuration and compatibility layer
+
+mldsa-native is used with a custom configuration file [mldsa_native_config.h](mldsa_native_config.h). This file includes
+a compatibility layer between AWS-LC/OpenSSL and mldsa-native, covering:
+
+* FIPS/PCT: If `AWSLC_FIPS` is set, `MLD_CONFIG_KEYGEN_PCT` is
+  enabled to include a PCT.
+* FIPS/PCT: If `BORINGSSL_FIPS_BREAK_TESTS` is set,
+  `MLD_CONFIG_KEYGEN_PCT_BREAKAGE_TEST` is set and `mld_break_pct`
+  defined via `boringssl_fips_break_test("MLDSA_PWCT")`, to include
+  runtime-breakage of the PCT for testing purposes.
+* CT: If `BORINGSSL_CONSTANT_TIME_VALIDATION` is set, then
+  `MLD_CONFIG_CT_TESTING_ENABLED` is set to enable valgrind testing.
+* Zeroization: `MLD_CONFIG_CUSTOM_ZEROIZE` is set and `mld_zeroize`
+  mapped to `OPENSSL_cleanse` to use OpenSSL's zeroization function.
+* Randombytes: `MLD_CONFIG_CUSTOM_RANDOMBYTES` is set and `mld_randombytes`
+  mapped to `RAND_bytes` to use AWS-LC's randombytes function.
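
The mapping above could be sketched roughly as follows. This is a minimal illustration, not the contents of `mldsa_native_config.h`: the macro names come from the list above, while `sketch_cleanse` is a hypothetical stand-in for `OPENSSL_cleanse` so that the fragment builds on its own, and the function form of `mld_zeroize` is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FIPS builds: request a pairwise consistency test after key generation. */
#if defined(AWSLC_FIPS)
#define MLD_CONFIG_KEYGEN_PCT
#endif

/* Constant-time validation builds: enable the valgrind annotations. */
#if defined(BORINGSSL_CONSTANT_TIME_VALIDATION)
#define MLD_CONFIG_CT_TESTING_ENABLED
#endif

/* Zeroization is always routed through a project-provided function. */
#define MLD_CONFIG_CUSTOM_ZEROIZE

/* Hypothetical stand-in for OPENSSL_cleanse (assumption, not AWS-LC code).
 * Volatile writes keep the compiler from dropping the stores. */
static void sketch_cleanse(void *ptr, size_t len) {
  volatile uint8_t *p = (volatile uint8_t *)ptr;
  size_t i;
  for (i = 0; i < len; i++) {
    p[i] = 0;
  }
}

/* Assumed shape of the mapping: mld_zeroize forwards to the cleanse call. */
static void mld_zeroize(void *ptr, size_t len) { sketch_cleanse(ptr, len); }

/* Usage: wipe a small buffer and check that it was cleared. */
static void sketch_config_demo(void) {
  uint8_t secret[4] = {1, 2, 3, 4};
  mld_zeroize(secret, sizeof(secret));
  assert(secret[0] == 0 && secret[3] == 0);
}
```

Keeping the mapping in one configuration header means the imported sources can stay unmodified across re-imports.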
+
+## Build process
+
+At the core, mldsa-native is a 'single-level' implementation of ML-DSA:
+A build of the main source tree provides an implementation of
+exactly one of ML-DSA-44/65/87, depending on the `MLD_CONFIG_PARAMETER_SET`
+parameter. All source files for a single build of mldsa-native are bundled in
+[mldsa_native_bcm.c](mldsa/mldsa_native_bcm.c), which is also imported from
+mldsa-native.
+
+To build all security levels, [mldsa_native_bcm.c](mldsa/mldsa_native_bcm.c)
+is included three times into [ml_dsa.c](ml_dsa.c), once per security level.
+Level-independent code is included only once and shared across the levels;
+this is controlled through the configuration options
+`MLD_CONFIG_MULTILEVEL_WITH_SHARED` and `MLD_CONFIG_MULTILEVEL_NO_SHARED`
+set prior to including the instances of [mldsa_native_bcm.c](mldsa/mldsa_native_bcm.c) in [ml_dsa.c](ml_dsa.c).
+
+Note that the multilevel build process is entirely internal to `ml_dsa.c`,
+and does not affect the AWS-LC build otherwise.
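
To illustrate the idea of one source instantiated three times with shared level-independent code, here is a small self-contained sketch. It swaps the repeated `#include` for a macro so that it fits in one file; all `sketch_*` and `SKETCH_*` names are hypothetical. The parameters used (k = 4, 6, 8 and the resulting public key sizes) are the FIPS 204 values.

```c
#include <assert.h>
#include <stddef.h>

/* Level-independent code: defined once and shared by all instantiations. */
#define SKETCH_SEEDBYTES 32
#define SKETCH_POLYT1_PACKEDBYTES 320

static size_t sketch_pk_bytes_generic(size_t k) {
  return SKETCH_SEEDBYTES + k * SKETCH_POLYT1_PACKEDBYTES;
}

/* Level-specific code: each expansion stands in for one inclusion of the
 * bundled source with a different parameter set. The level is pasted into
 * the function name so the three copies do not clash. */
#define SKETCH_INSTANTIATE(level, k)                    \
  static size_t sketch_mldsa##level##_pk_bytes(void) { \
    return sketch_pk_bytes_generic(k);                  \
  }

SKETCH_INSTANTIATE(44, 4)
SKETCH_INSTANTIATE(65, 6)
SKETCH_INSTANTIATE(87, 8)

/* Usage: all three levels coexist in one translation unit. */
static void sketch_multilevel_demo(void) {
  assert(sketch_mldsa44_pk_bytes() == 1312);
  assert(sketch_mldsa65_pk_bytes() == 1952);
  assert(sketch_mldsa87_pk_bytes() == 2592);
}
```

Because every symbol is `static`, nothing leaks out of the translation unit, which mirrors the point that the multilevel build stays internal to a single file.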
+
+## Formal Verification
+
+All C-code imported by [importer.sh](importer.sh) is formally verified using the
+C Bounded Model Checker ([CBMC](https://github.com/diffblue/cbmc/)) to be free of
+various classes of undefined behaviour, including out-of-bounds memory accesses and
+arithmetic overflow; the latter is of particular interest for ML-DSA because of
+the use of lazy modular reduction for improved performance.
+
+The heart of the CBMC proofs is the set of function contracts and loop
+annotations added to the C code. Function contracts are written as
+`__contract__(...)` clauses and appear at the function declaration, while loop
+contracts are written as `__loop__` clauses and follow the `for` statement.
+
+The function contract and loop statements are kept in the source, but
+removed by the preprocessor so long as the CBMC macro is undefined. Keeping
+them simplifies the import, and care has been taken to make them readable
+to the non-expert, and thereby serve as precise documentation of
+assumptions and guarantees upheld by the code.
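
A minimal sketch of how such annotations can disappear in a normal build is shown below. It assumes the macros simply expand to nothing when `CBMC` is undefined. The clause vocabulary (`requires`, `ensures`, `forall`, `invariant`) and the function `sketch_reduce` are illustrative and not copied from mldsa-native; only the modulus 8380417 is the ML-DSA value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In a normal build the annotations are discarded by the preprocessor. */
#ifndef CBMC
#define __contract__(x)
#define __loop__(x)
#endif

#define SKETCH_Q 8380417 /* ML-DSA modulus q */

/* Declaration with a contract: readable as documentation of the
 * assumptions (requires) and guarantees (ensures). */
void sketch_reduce(int32_t *a, size_t n)
__contract__(
  requires(n <= 256)
  ensures(forall(k, 0, n, a[k] >= 0 && a[k] < SKETCH_Q))
);

void sketch_reduce(int32_t *a, size_t n) {
  size_t i;
  for (i = 0; i < n; i++)
  __loop__(invariant(i <= n))
  {
    /* C's % keeps the sign of the dividend, so fix up negatives. */
    int32_t r = a[i] % SKETCH_Q;
    a[i] = (r < 0) ? r + SKETCH_Q : r;
  }
}

/* Usage: the annotated function behaves like plain C when CBMC is unset. */
static void sketch_reduce_demo(void) {
  int32_t a[3] = {-1, SKETCH_Q, 5};
  sketch_reduce(a, 3);
  assert(a[0] == SKETCH_Q - 1 && a[1] == 0 && a[2] == 5);
}
```

Since the macro argument is never expanded outside of CBMC runs, the clauses cost nothing at compile time or run time.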
+
+## Testing
+
+We test ML-DSA with Known Answer Test (KAT) vectors obtained from https://github.com/post-quantum-cryptography/KAT within `PQDSAParameterTest.KAT`. We select the KATs for the signing mode `hedged`, which derives the signing private random seed (rho) pseudorandomly from the signer's private key, the message to be signed, and a 256-bit string `rnd` which is generated at random. The `pure` variant of these KATs is used, as it provides test vector inputs for "pure", i.e., non-pre-hashed, messages. The KAT files have been modified to insert linebreaks between test vector sets.
+
+We also run the ACVP test vectors obtained from https://github.com/usnistgov/ACVP-Server within the three functions `PerMLDSATest.ACVPKeyGen`, `PerMLDSATest.ACVPSigGen` and `PerMLDSATest.ACVPSigVer`. These correspond to the tests found at [ML-DSA-keyGen-FIPS204](https://github.com/usnistgov/ACVP-Server/tree/master/gen-val/json-files/ML-DSA-keyGen-FIPS204), [ML-DSA-sigGen-FIPS204](https://github.com/usnistgov/ACVP-Server/tree/master/gen-val/json-files/ML-DSA-sigGen-FIPS204), and [ML-DSA-sigVer-FIPS204](https://github.com/usnistgov/ACVP-Server/tree/master/gen-val/json-files/ML-DSA-sigVer-FIPS204).
+To test ML-DSA pure, non-deterministic mode, we use `tgId = 19, 21, 23` of sigGen and `tgId = 7, 9, 11` of sigVer.
+To test ML-DSA ExternalMu, non-deterministic mode, we use `tgId = 20, 22, 24` of sigGen and `tgId = 8, 10, 12` of sigVer.
+
+The test suite includes:
+
+* Known Answer Tests (KAT) for all three parameter sets (ML-DSA-44/65/87)
+* Functional tests for key generation, signing, and verification
+* ExtMu (External Mu) variant tests for pre-hash modes
+* ACVP (Automated Cryptographic Validation Protocol) test vectors
+* Pairwise Consistency Test (PCT) validation when FIPS mode is enabled
+* Key consistency tests including public key derivation from secret key
+
+## Side-channels
+
+mldsa-native's CI uses a patched version of valgrind to check, across various
+compilers and compile flags, that there are no secret-dependent memory
+accesses, branches, or divisions. The relevant assertions are kept
+and used if `MLD_CONFIG_CT_TESTING_ENABLED` is set, which is the case
+if and only if `BORINGSSL_CONSTANT_TIME_VALIDATION` is set.
+
+mldsa-native uses value barriers to block
+potentially harmful compiler reasoning and optimization. Where standard
+gcc/clang inline assembly is not available, mldsa-native falls back to a
+slower 'opt blocker' based on a volatile global -- both are described in
+[ct.h](https://github.com/pq-code-package/mldsa-native/blob/main/mldsa/ct.h).
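
The two barrier flavours described above can be sketched as follows. This illustrates the general technique under stated assumptions and is not the code in `ct.h`: all `sketch_*` names are hypothetical, and the inline-assembly form assumes a gcc/clang-compatible compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Fallback 'opt blocker': a volatile global that always holds 0. The
 * compiler must reload it on every use and cannot assume its value. */
static volatile uint32_t sketch_opt_blocker = 0;

/* Returns x unchanged while hiding its value from the optimizer. */
static uint32_t sketch_value_barrier_u32(uint32_t x) {
#if defined(__GNUC__) || defined(__clang__)
  /* Empty asm that claims to read and write x in a register. */
  __asm__("" : "+r"(x));
  return x;
#else
  return x ^ sketch_opt_blocker;
#endif
}

/* Returns a if cond != 0 and b otherwise, without branching on cond. */
static uint32_t sketch_ct_select_u32(uint32_t a, uint32_t b, uint32_t cond) {
  /* nz is 1 if cond != 0 and 0 otherwise. */
  uint32_t nz = (cond | (0u - cond)) >> 31;
  /* mask is all ones if nz == 1 and zero otherwise; the barrier keeps the
   * compiler from recognising the mask and reintroducing a branch. */
  uint32_t mask = sketch_value_barrier_u32(0u - nz);
  return b ^ (mask & (a ^ b));
}

/* Usage: select between two values based on a secret condition. */
static void sketch_ct_demo(void) {
  assert(sketch_ct_select_u32(7, 9, 1) == 7);
  assert(sketch_ct_select_u32(7, 9, 0) == 9);
}
```

The assembly variant is typically cheaper because it adds no memory access, which is presumably why the volatile global is described as the slower fallback.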
+
+## Comparison to reference implementation
+
+mldsa-native is a fork of the ML-DSA [reference
+implementation](https://github.com/pq-crystals/dilithium) (Dilithium).
+
+The following gives an overview of the major changes:
+
+- CBMC and debug annotations, and minor code restructurings or signature
+  changes to facilitate the CBMC proofs. For example, functions are structured
+  to make loop bounds and memory access patterns explicit for formal verification.
+- Introduction of 4x-batched versions of some functions from the reference
+  implementation. This is to leverage 4x-batched Keccak-f1600 implementations
+  if present. The batching happens at the C level even if no native backend
+  for FIPS 202 is present.
+- FIPS 204 compliance: Introduced optional PCT (FIPS 204, Section 4.4, Pairwise
+  Consistency) and zeroization of stack buffers as required by (FIPS 204, 
+  Section 3.6.3, Destruction of intermediate values).
+- Restructuring of files to separate level-specific from level-generic
+  functionality. This is needed to enable a multi-level build of mldsa-native
+  where level-generic code is shared between levels.
+- More pervasive use of value barriers to harden constant-time primitives,
+  even when Link-Time-Optimization (LTO) is enabled. The use of LTO can lead
+  to insecure compilation in case of the reference implementation.
@@ -0,0 +1,122 @@
+// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+// SPDX-License-Identifier: Apache-2.0 OR ISC
+
+#ifndef MLD_AWSLC_FIPS202_GLUE_H
+#define MLD_AWSLC_FIPS202_GLUE_H
+#include <stddef.h>
+#include <stdint.h>
+
+#include "../sha/internal.h"
+
+#define SHAKE128_RATE 168
+#define SHAKE256_RATE 136
+#define SHA3_256_RATE 136
+#define SHA3_512_RATE 72
+
+#define mld_shake128ctx KECCAK1600_CTX
+#define mld_shake256ctx KECCAK1600_CTX
+
+static MLD_INLINE void mld_shake128_init(mld_shake128ctx *state) {
+  // Return code checks can be omitted
+  // SHAKE_Init always returns 1 when called with correct block size value.
+  (void) SHAKE_Init(state, SHAKE128_BLOCKSIZE);
+}
+
+static MLD_INLINE void mld_shake128_release(mld_shake128ctx *state) {
+  (void) state;
+}
+
+static MLD_INLINE void mld_shake128_absorb_once(mld_shake128ctx *state,
+						const uint8_t *input, size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Absorb(state, input, inlen);
+}
+
+static MLD_INLINE void mld_shake128_absorb(mld_shake128ctx *state,
+					   const uint8_t *input, size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Absorb(state, input, inlen);
+}
+
+static MLD_INLINE void mld_shake128_finalize(mld_shake128ctx *state) {
+  // Finalization is implicit in AWS-LC's implementation
+  // The state is ready for squeezing after absorb
+  (void) state;
+}
+
+static MLD_INLINE void mld_shake128_squeeze(uint8_t *output, size_t outlen,
+					    mld_shake128ctx *state) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Squeeze(output, state, outlen);
+}
+
+static MLD_INLINE void mld_shake128_squeezeblocks(uint8_t *output, size_t nblocks,
+						  mld_shake128ctx *state) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Squeeze(output, state, nblocks * SHAKE128_RATE);
+}
+
+static MLD_INLINE void mld_shake256_init(mld_shake256ctx *state) {
+  // Return code checks can be omitted
+  // SHAKE_Init always returns 1 when called with correct block size value.
+  (void) SHAKE_Init(state, SHAKE256_BLOCKSIZE);
+}
+
+static MLD_INLINE void mld_shake256_release(mld_shake256ctx *state) {
+  (void) state;
+}
+
+static MLD_INLINE void mld_shake256_absorb_once(mld_shake256ctx *state,
+						const uint8_t *input, size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Absorb(state, input, inlen);
+}
+
+static MLD_INLINE void mld_shake256_absorb(mld_shake256ctx *state,
+					   const uint8_t *input, size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Absorb(state, input, inlen);
+}
+
+static MLD_INLINE void mld_shake256_finalize(mld_shake256ctx *state) {
+  // Finalization is implicit in AWS-LC's implementation
+  // The state is ready for squeezing after absorb
+  (void) state;
+}
+
+static MLD_INLINE void mld_shake256_squeeze(uint8_t *output, size_t outlen,
+					    mld_shake256ctx *state) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Squeeze(output, state, outlen);
+}
+
+static MLD_INLINE void mld_shake256_squeezeblocks(uint8_t *output, size_t nblocks,
+						  mld_shake256ctx *state) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE_Squeeze(output, state, nblocks * SHAKE256_RATE);
+}
+
+static MLD_INLINE void mld_shake256(uint8_t *output, size_t outlen,
+				    const uint8_t *input, size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE256(input, inlen, output, outlen);
+}
+
+static MLD_INLINE void mld_sha3_256(uint8_t *output, const uint8_t *input,
+				    size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHA3_256(input, inlen, output);
+}
+
+static MLD_INLINE void mld_sha3_512(uint8_t *output, const uint8_t *input,
+				    size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHA3_512(input, inlen, output);
+}
+
+#endif // MLD_AWSLC_FIPS202_GLUE_H
@@ -0,0 +1,106 @@
+// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+// SPDX-License-Identifier: Apache-2.0 OR ISC
+
+//
+// This is a shim establishing the FIPS-202 API required by
+// mldsa-native from the API exposed by AWS-LC.
+//
+
+#ifndef MLD_AWSLC_FIPS202X4_GLUE_H
+#define MLD_AWSLC_FIPS202X4_GLUE_H
+
+#include <stddef.h>
+#include <stdint.h>
+
+#include "fips202_glue.h"
+
+// Use AWS-LC's existing KECCAK1600_CTX_x4 structure for SHAKE128
+#define mld_shake128x4ctx KECCAK1600_CTX_x4
+
+// For SHAKE256 x4, we need a custom structure since AWS-LC only has batched SHAKE128
+typedef struct mld_shake256x4ctx_s {
+  KECCAK1600_CTX s[4];
+} mld_shake256x4ctx;
+
+static MLD_INLINE void mld_shake128x4_absorb_once(mld_shake128x4ctx *state,
+						  const uint8_t *in0,
+						  const uint8_t *in1,
+						  const uint8_t *in2,
+						  const uint8_t *in3, size_t inlen) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE128_Absorb_once_x4(state, in0, in1, in2, in3, inlen);
+}
+
+static MLD_INLINE void mld_shake128x4_squeezeblocks(uint8_t *out0, uint8_t *out1,
+						    uint8_t *out2, uint8_t *out3,
+						    size_t nblocks,
+						    mld_shake128x4ctx *state) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE128_Squeezeblocks_x4(out0, out1, out2, out3, state, nblocks);
+}
+
+static MLD_INLINE void mld_shake128x4_init(mld_shake128x4ctx *state) {
+  // Return code check can be omitted
+  // since mldsa-native adheres to call discipline
+  (void) SHAKE128_Init_x4(state);
+}
+
+static MLD_INLINE void mld_shake128x4_release(mld_shake128x4ctx *state) {
+  (void) state;
+}
+
+// AWS-LC doesn't have SHAKE256 x4 batched operations like it does for SHAKE128
+// We provide serial implementations that process each instance separately
+static MLD_INLINE void mld_shake256x4_absorb_once(mld_shake256x4ctx *state,
+						  const uint8_t *in0,
+						  const uint8_t *in1,
+						  const uint8_t *in2,
+						  const uint8_t *in3, size_t inlen) {
+  // Process four independent SHAKE256 operations serially
+  mld_shake256_init(&state->s[0]);
+  mld_shake256_absorb_once(&state->s[0], in0, inlen);
+  mld_shake256_init(&state->s[1]);
+  mld_shake256_absorb_once(&state->s[1], in1, inlen);
+  mld_shake256_init(&state->s[2]);
+  mld_shake256_absorb_once(&state->s[2], in2, inlen);
+  mld_shake256_init(&state->s[3]);
+  mld_shake256_absorb_once(&state->s[3], in3, inlen);
+}
+
+static MLD_INLINE void mld_shake256x4_squeezeblocks(uint8_t *out0, uint8_t *out1,
+						    uint8_t *out2, uint8_t *out3,
+						    size_t nblocks,
+						    mld_shake256x4ctx *state) {
+  // Process four independent squeeze operations serially
+  mld_shake256_squeezeblocks(out0, nblocks, &state->s[0]);
+  mld_shake256_squeezeblocks(out1, nblocks, &state->s[1]);
+  mld_shake256_squeezeblocks(out2, nblocks, &state->s[2]);
+  mld_shake256_squeezeblocks(out3, nblocks, &state->s[3]);
+}
+
+static MLD_INLINE void mld_shake256x4_init(mld_shake256x4ctx *state) {
+  // Initialize four independent states
+  mld_shake256_init(&state->s[0]);
+  mld_shake256_init(&state->s[1]);
+  mld_shake256_init(&state->s[2]);
+  mld_shake256_init(&state->s[3]);
+}
+
+static MLD_INLINE void mld_shake256x4_release(mld_shake256x4ctx *state) {
+  (void) state;
+}
+
+static MLD_INLINE void mld_shake256x4(uint8_t *out0, uint8_t *out1, uint8_t *out2,
+				      uint8_t *out3, size_t outlen, uint8_t *in0,
+				      uint8_t *in1, uint8_t *in2, uint8_t *in3,
+				      size_t inlen) {
+  // Process four independent SHAKE256 operations serially
+  mld_shake256(out0, outlen, in0, inlen);
+  mld_shake256(out1, outlen, in1, inlen);
+  mld_shake256(out2, outlen, in2, inlen);
+  mld_shake256(out3, outlen, in3, inlen);
+}
+
+#endif // MLD_AWSLC_FIPS202X4_GLUE_H