ML-KEM: usage of seed for key pair generation

1,867 views
Skip to first unread message

Stephan Mueller

unread,
Aug 28, 2024, 10:20:09 AMAug 28
to pqc-forum
Hi,

FIPS 203 section 3.3 allows the storing of the seed data used for the ML-
KEM.KeyGen_internal in lieu of the actual key pair itself. The ML-KEM key pair
can be re-constructed from that seed using the ML-KEM.KeyGen_internal at a
later time.

Algorithm 19 defines ML-KEM.KeyGen requiring the generation of 2 random
numbers for d and z, each having 32 bytes in lines 1 and 2. I.e. this is the
seed that is allowed to be stored in lieu of the actual key pair (or the
decapsulation key at least).

Section 3.3 defines that the term "m <- B^32" as used in lines 1 and 2 of
Algorithm 19 specifies the generation of 32 bytes generated from the RBG.

This implies that the standard requires the seed to be ultimately requiring 64
bytes of data from the RBG.

The question I would have is the following: Considering section 3.3 the RBG
security strength is 128/192/256 bits for ML-KEM 512/768/1024, can the
aforementioned generation of 64 bytes of data from an RBG be replaced with
generating the security strength size data from the RBG and using an approved
KDF (SP800-108 / SP800-56C) to enlarge the data to the requested size of 64
bytes?

If yes, that would imply that the actual seed to be stored is only of the
security strength size of the used ML-KEM instead of 64 bytes.

The basis for the argument is that the output of the RBG is min(security
strength, generated data size). Considering that for ML-KEM the seed data is
always strictly larger than the security strength of the RBG, the resulting
data is still only having the RBG security strength. Thus, the data
"stretching" from the security strength to the ML-KEM seed size
cryptographically is requivalent if that stretching comes from the RBG or from
a KDF. Naturally, this implies that the used KDF has the same or higher
security strength than the used RBG.

Ciao
Stephan


Tony Arcieri

unread,
Sep 6, 2024, 8:29:32 AMSep 6
to pqc-forum, Stephan Mueller
On Wednesday, August 28, 2024 at 8:20:09 AM UTC-6 Stephan Mueller wrote:
The question I would have is the following: Considering section 3.3 the RBG
security strength is 128/192/256 bits for ML-KEM 512/768/1024, can the
aforementioned generation of 64 bytes of data from an RBG be replaced with
generating the security strength size data from the RBG and using an approved
KDF (SP800-108 / SP800-56C) to enlarge the data to the requested size of 64
bytes?

I believe that was explored in section 3.3 of https://eprint.iacr.org/2024/523.pdf

"The serialization format of the private key can in general be seen as a cache of the output of KeyGen under a given seed. If deserialization instead recomputed the whole private key via the initial seed, computing a malformed private key would be impossible. This option is also the only option to obtain MAL-BIND-K-PK, as long as z is not drawn independently of the rest of the seed. Unfortunately, this would require further modification of ML-KEM, computing σ, ρ, and z from a single seed being used in a KDF instead of, as is currently done, using two separate seeds."

Stephan Mueller

unread,
Sep 6, 2024, 4:10:38 PMSep 6
to Tony Arcieri, Phillip Hallam-Baker, pqc-forum
Am Freitag, 6. September 2024, 14:53:44 GMT-5 schrieb Phillip Hallam-Baker:

Hi Phillip,

> OK so here is the attack I am considering:
>
> 1) Consider a sealed HSM that generates a series of public keypairs for
> different purposes.
>
> 2) HSM has a 256 bit internal primary seed s which is used to generate
> seeds in combination with a counter sn = KDF (seed, count++)
>
> 3) HSM generates the private key first and then creates a series of public
> keys from random noise seeds until it finds one that matches the criteria
> that if you take an HMAC of the public key using a 'recovery secret' value
> r, the first 8 bits of the result match one byte in the seed s, and the
> next 8 are the byte 0_000xxxxx where xxxxx gives the location of the byte
> in the seed. This will typically take 65K attempts.
>
> So the attacker simply watches and waits for public keys to be published.
> Each time a key is published, they first check to see if it has the signal
> flag saying it is a recovery secret and if so, they have a candidate byte
> in the seed.
>
> Each byte leaked from the seed reduces the search space by 8 bits. If the
> HSM is being used to generate 50 keys, the search space will be
> sufficiently reduced to make an attack possible.

Thank you for sharing this. But my concern is slightly different, considering
that FIPS 203 explicitly says the seed must come from an RBG and not from a
plain KDF which provides seed data for different ML-KEM keys.

My concern is more along the lines what FIPS 203 says: a fresh random number
is to be generated from an RBG each time a new key pair is created. However,
instead of wanting to store the key pair or the private key, storing the seed
is allowed. All I am asking is whether instead of storing 64 bytes from an
RBG, I can store 32 bytes from an RBG and expand the 32 bytes to 64 bytes
using a KDF to regenerate the one given key pair the seed applies to.

Ciao
Stephan


Phillip Hallam-Baker

unread,
Sep 6, 2024, 4:12:50 PMSep 6
to Stephan Mueller, Tony Arcieri, pqc-forum
Yes, and I just improved on my attack so I can do the same thing if there is a 32 byte key expanded to 64 bytes..!

Deirdre Connolly

unread,
Sep 6, 2024, 4:51:32 PMSep 6
to Stephan Mueller, Tony Arcieri, Phillip Hallam-Baker, pqc-forum
> I am asking is whether instead of storing 64 bytes from an
RBG, I can store 32 bytes from an RBG and expand the 32 bytes to 64 bytes
using a KDF to regenerate the one given key pair the seed applies to.

I would also really like this to be a supported/compliant way to store seeds and expand the keys from them.

--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.
To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/2098567.mTc0TySc1m%40tauon.atsec.com.

Blumenthal, Uri - 0553 - MITLL

unread,
Sep 6, 2024, 5:14:25 PMSep 6
to Deirdre Connolly, Stephan Mueller, Tony Arcieri, Phillip Hallam-Baker, pqc-forum
Count me in on this. 
Regards,
Uri

Secure Resilient Systems and Technologies
MIT Lincoln Laboratory

On Sep 6, 2024, at 16:51, Deirdre Connolly <durumcr...@gmail.com> wrote:


> I am asking is whether instead of storing 64 bytes from an RBG, I can store 32 bytes from an RBG and expand the 32 bytes to 64 bytes using a KDF to regenerate the one given key pair the seed applies to. I would also really like this to
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside the Laboratory.
 
ZjQcmQRYFpfptBannerEnd

Tony Arcieri

unread,
Sep 6, 2024, 5:22:13 PMSep 6
to Deirdre Connolly, Stephan Mueller, Phillip Hallam-Baker, pqc-forum
On Fri, Sep 6, 2024 at 2:51 PM Deirdre Connolly <durumcr...@gmail.com> wrote:
I would also really like this to be a supported/compliant way to store seeds and expand the keys from them.

It seems like the one tricky detail here is the exact DRBG used to expand a "short seed".

--
Tony Arcieri

Deirdre Connolly

unread,
Sep 6, 2024, 5:27:21 PMSep 6
to Tony Arcieri, Stephan Mueller, Phillip Hallam-Baker, pqc-forum
KDF? 

Deirdre Connolly

unread,
Sep 6, 2024, 5:29:54 PMSep 6
to Tony Arcieri, Stephan Mueller, Phillip Hallam-Baker, pqc-forum

One of the SHAKEs is probably good

Orie Steele

unread,
Sep 6, 2024, 5:34:52 PMSep 6
to Deirdre Connolly, Tony Arcieri, Stephan Mueller, Phillip Hallam-Baker, pqc-forum
HKDF with sha256? 


Is there some known attack when d / z are expanded from each other? 

Is that what's being discussed here?

--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

Phillip Hallam-Baker

unread,
Sep 6, 2024, 6:03:33 PMSep 6
to Tony Arcieri, Deirdre Connolly, Stephan Mueller, pqc-forum
I don't see that as being at all hard. It pretty much has to be SHAKE256 because that is what we are using in the rest of the code.

[

Phillip Hallam-Baker

unread,
Sep 6, 2024, 6:11:28 PMSep 6
to Orie Steele, Deirdre Connolly, Tony Arcieri, Stephan Mueller, pqc-forum
On Fri, Sep 6, 2024 at 5:34 PM Orie Steele <or...@transmute.industries> wrote:
HKDF with sha256? 


Is there some known attack when d / z are expanded from each other? 

Is that what's being discussed here?

Kind of. I started thinking up some kleptographic attacks and I could keep on doing that. But 'PHB can't think of an attack right now' is not exactly a great test. I am being a bit distracted by a series of indictments dropping on some very very bad people I played a part in identifying as such seven years ago.

We could argue about just how likely it is someone finds a way to manipulate the noise mask or we can just shut them down completely.

Stopping a key generator leaking internal state through the public key is much harder. But minimizing the number of degrees of frob is probably going to make attempts to partition the system so as to block the attacks a lot easier.


Of course, this whole class of attacks is one that I address for ECC algorithms in the Mesh through use of threshold key generation.It is quite easy to generate two separate private keys and combine them so that the resulting combination public key does not leak internal state. Which means two devices have to be compromized.

 

Phillip Hallam-Baker

unread,
Sep 10, 2024, 8:54:59 AMSep 10
to Tony Arcieri, pqc-forum, Stephan Mueller
I think we need to go for rigid construction of all private keys with a one way digest between a single input seed and the private key values. Not just for ML-KEM, ML-DSA and such, we should go back and fix RSA which has some really fun channels for kleptography.

I am not sure that allowing the use of two seeds instead of one gives an attacker sufficient leverage to make an attack feasible. But it certainly allows them to leak information from inside the HSM and if multiple public keys are being generated, they can leak 16 bits at a time without too much overhead.

Lets just stamp it out.



--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

Phillip Hallam-Baker

unread,
Sep 10, 2024, 8:55:00 AMSep 10
to Tony Arcieri, pqc-forum, Stephan Mueller
OK so here is the attack I am considering:

1) Consider a sealed HSM that generates a series of public keypairs for different purposes. 

2) HSM has a 256 bit internal primary seed s which is used to generate seeds in combination with a counter sn = KDF (seed, count++)

3) HSM generates the private key first and then creates a series of public keys from random noise seeds until it finds one that matches the criteria that if you take an HMAC of the public key using a 'recovery secret' value r, the first 8 bits of the result match one byte in the seed s, and the next 8 are the byte 0_000xxxxx where xxxxx gives the location of the byte in the seed. This will typically take 65K attempts.

So the attacker simply watches and waits for public keys to be published. Each time a key is published, they first check to see if it has the signal flag saying it is a recovery secret and if so, they have a candidate byte in the seed.

Each byte leaked from the seed reduces the search space by 8 bits. If the HSM is being used to generate 50 keys, the search space will be sufficiently reduced to make an attack possible.


So generate the seeds using an extra SHAKE, its the obvious solution.

Tony Arcieri

unread,
Sep 10, 2024, 9:27:35 AMSep 10
to Stephan Mueller, pqc-forum
On Fri, Sep 6, 2024 at 2:10 PM Stephan Mueller <smue...@chronox.de> wrote:
But my concern is slightly different, considering
that FIPS 203 explicitly says the seed must come from an RBG and not from a
plain KDF which provides seed data for different ML-KEM keys.

My concern is more along the lines what FIPS 203 says: a fresh random number
is to be generated from an RBG each time a new key pair is created. However,
instead of wanting to store the key pair or the private key, storing the seed
is allowed. All I am asking is whether instead of storing 64 bytes from an
RBG, I can store 32 bytes from an RBG and expand the 32 bytes to 64 bytes
using a KDF to regenerate the one given key pair the seed applies to.

Can NIST offer any guidance here? Is there any approved construction for expanding a 32-byte "short seed" into 64-bytes? Particularly in a way where the 32-byte seed can be persisted and expanded as needed? Would this be against the usage requirements of a DRBG due to the need to persist the "short seed", and if so, what would be an approved construction to use?

--
Tony Arcieri

Tony Arcieri

unread,
Sep 13, 2024, 12:12:22 PMSep 13
to pqc-forum
I now see this was already proposed using SHAKE-256 to perform the expansion: https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/5CT4NC_6zRI/m/lpifFrpWAwAJ
--
Tony Arcieri

Phillip Hallam-Baker

unread,
Sep 13, 2024, 3:25:19 PMSep 13
to Tony Arcieri, pqc-forum
I think it is important to have the domain separation as specified later in the thread. The way I would do this is to bind the OID of the algorithm ID into the hash which is consistent with approach elsewhere in ML-DSA.

So:

Seed = 32 bytes
d, k = SHAKE-256 ( oid + seed, 512 bits);

I believe that as a general matter, no cryptographic operation should use the output of any RNG without pre-processing it through a digest precisely to prevent a repeat of the DUAL-ECC RNG situation.



--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.

Alagic, Gorjan (Assoc)

unread,
Sep 17, 2024, 8:44:40 PMSep 17
to Phillip Hallam-Baker, Tony Arcieri, pqc-forum

Using only a 32-byte seed for ML-KEM is not allowed by FIPS 203, and we have no plans to allow it. It’s true that having such an option might be advantageous for some applications. However, we do not see this as a sufficient reason to change FIPS 203, either directly or indirectly. Making a change at this stage is no small task, and could have unintended consequences. – Gorjan (for NIST PQC team).

Filippo Valsorda

unread,
Sep 18, 2024, 12:42:44 PMSep 18
to Alagic, Gorjan (Assoc), pqc-forum
Thank you Gorjan for the reply.

Can you clarify if that's a general prohibition to use SP 800-108 KDFs to derive seeds, despite SP 800-133 and FIPS 140-3 IG D.H mentioning "asymmetric seeds"?

That would be especially unfortunate for hybrids, as it makes plenty of sense to derive both decapsulation keys from the same source key material in a rigid, well-specified, testable way, instead of storing them separately.

Filippo Valsorda

unread,
Sep 18, 2024, 12:46:31 PMSep 18
to Alagic, Gorjan (Assoc), pqc-forum
2024-09-18 18:42 GMT+02:00 Filippo Valsorda <fil...@ml.filippo.io>:
Can you clarify if that's a general prohibition to use SP 800-108 KDFs to derive seeds [...]

I should specify, to derive ML-KEM seeds. Sorry for the double-posting.

Blumenthal, Uri - 0553 - MITLL

unread,
Sep 18, 2024, 1:03:30 PMSep 18
to Alagic, Gorjan (Assoc), Phillip Hallam-Baker, Tony Arcieri, pqc-forum

ZjQcmQRYFpfptBannerEnd

Using only a 32-byte seed for ML-KEM is not allowed by FIPS 203, and we have no plans to allow it.

 

I suggest the NIST PQC team reconsiders.

 

It’s true that having such an option might be advantageous for some applications. However, we do not see this as a sufficient reason to change FIPS 203, either directly or indirectly.

 

We differ here.

 

Making a change at this stage is no small task, and could have unintended consequences. – Gorjan (for NIST PQC team).

 

Would you mind sharing your concerns? Being specific would be preferred.

 

Thanks

Tony Arcieri

unread,
Sep 18, 2024, 1:18:52 PMSep 18
to Alagic, Gorjan (Assoc), Phillip Hallam-Baker, pqc-forum
On Tue, Sep 17, 2024 at 6:44 PM Alagic, Gorjan (Assoc) <gorjan...@nist.gov> wrote:

Using only a 32-byte seed for ML-KEM is not allowed by FIPS 203, and we have no plans to allow it. 


Expanding a 32-byte seed into a 64-byte expanded private key is quite similar to the method described for EdDSA in FIPS 186-5 Appendix 2.3:

Process:
1. Obtain a string of b bits from an approved DRBG (as specified in SP 800-90A [16]) with a security strength of requested_security_strength or more. The private key d is this string of b bits.
2. Compute the hash of the private key d, H(d) = (h0, h1, ..., h2b-1) using SHA-512 for Ed25519 and SHAKE256 for Ed448 (H(d)= SHAKE256(d, 912)). H(d) may be precomputed. Note H(d) is also used in the EdDSA signature generation; see Section 7.6. FIPS 186-5 DIGITAL SIGNATURE STANDARD (DSS) 47 
3. The first half of H(d), (i.e. hdigest1 = (ℎ0, ℎ1, … , ℎ𝑏𝑏−1)) is used to generate the public key.

It seems perhaps this was "grandfathered in" as Ed25519 was already widely deployed in this manner, but I'm wondering if there are reasons other than that why this is allowed for EdDSA but not for ML-KEM.

--
Tony Arcieri
Reply all
Reply to author
Forward
0 new messages