ML-KEM / ML-DSA expanded key formats and trapdoors


Samuel Lee

Mar 24, 2025, 7:08:25 PM
to pqc-forum
Hey folks,
I was reasoning around malicious keypair generation for ML-DSA, and realized that there is a pretty trivial attack one can carry out to introduce a trapdoor from public keys to private keys for both ML-KEM and ML-DSA.
Suppose I am a malicious HSM vendor, perhaps with some government leaning on me to introduce a vulnerability.
I expose ML-KEM / ML-DSA interfaces. According to all of these interfaces my implementation is indistinguishable from being completely correct, but I have a trapdoor-ed key generation method.

I maintain an AES key, trapdoor_key, known only to me, encoded in my HSM. On key generation I tweak the methods (i.e. line 1 of Algorithm 13 of FIPS 203, line 1 of Algorithm 6 of FIPS 204) from the standards with the following:

evil_seed <- random( 24 )
# set the public rho value to the encryption of my evil seed with some marker values
ρ <- AES-ECB-ENC_trapdoor_key( "trap" || evil_seed[0..11] || "door" || evil_seed[12..23] )
# then derive the private key seeds (σ for ML-KEM; ρ' and K for ML-DSA)
# from evil_seed and do the rest of keygen as normal

A scheme like this will generate keypairs that are essentially indistinguishable from correctly generated keys (other than for the lack of a valid private seed), as 12 bytes of randomness in each 16-byte section of ρ is unlikely to ever collide, and the mathematical relationships between the regions of key material are maintained.
However, anyone with possession of trapdoor_key can decrypt observed public ρ values, and when they see the markers, they can derive the private key material from evil_seed for keypairs generated in their trapdoor-ed modules.
Obviously, one might go further and use asymmetric encryption to generate the ρ value, so that someone who reverse engineers my malicious HSM cannot recover the key needed to decrypt it.
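A minimal sketch of the construction described above, for the ML-KEM case. Several things here are assumptions rather than part of the original post: Python's standard library has no AES, so a 4-round Feistel permutation built from HMAC-SHA256 stands in for AES-ECB (it permutes the whole 32-byte value instead of two independent 16-byte blocks), and `derive_sigma` is a hypothetical derivation, since the post only says the private seeds are derived from evil_seed. The honest path follows line 1 of Algorithm 13 of FIPS 203, where G is SHA3-512.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the stand-in permutation (an assumption, NOT AES).
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:16]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def prp_encrypt(key: bytes, block: bytes) -> bytes:
    # 4-round Feistel network over a 32-byte block.
    left, right = block[:16], block[16:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def prp_decrypt(key: bytes, block: bytes) -> bytes:
    # Inverse of prp_encrypt: undo the rounds in reverse order.
    left, right = block[:16], block[16:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def honest_rho_sigma(d: bytes, k: int) -> Tuple[bytes, bytes]:
    # FIPS 203, Algorithm 13, line 1: (rho, sigma) <- G(d || k).
    g = hashlib.sha3_512(d + bytes([k])).digest()
    return g[:32], g[32:]


def derive_sigma(evil_seed: bytes) -> bytes:
    # Hypothetical derivation of the private seed from evil_seed.
    return hashlib.shake_256(b"sigma" + evil_seed).digest(32)


def trapdoored_rho_sigma(trapdoor_key: bytes) -> Tuple[bytes, bytes]:
    # Malicious replacement for line 1: rho encrypts evil_seed plus markers.
    evil_seed = secrets.token_bytes(24)
    plaintext = b"trap" + evil_seed[:12] + b"door" + evil_seed[12:]
    return prp_encrypt(trapdoor_key, plaintext), derive_sigma(evil_seed)


def recover_sigma(trapdoor_key: bytes, rho: bytes) -> Optional[bytes]:
    # Attacker side: decrypt a public rho and look for the markers.
    plaintext = prp_decrypt(trapdoor_key, rho)
    if plaintext[:4] != b"trap" or plaintext[16:20] != b"door":
        return None  # not a key from a trapdoored module
    return derive_sigma(plaintext[4:16] + plaintext[20:32])
```

Calling `recover_sigma(trapdoor_key, rho)` on a ρ taken from an observed public key returns the private seed for trapdoored keys and `None` for honestly generated ones, apart from a negligible chance that a random ρ happens to decrypt to both markers.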

If a module allows export of the private key, then it becomes extremely suspicious if the module only allows export of the expanded private key format (as there is no way to verify correct keygen in another module).
If a module never allows export of the private key, then there is nothing you can do other than inspect the internals of the module and trust in your supply chain.

To me, this thought experiment makes the expanded private key formats very dubious to use in any situation, as once you export to this format there is no way to verify correct key generation. Even importing an expanded private key into a legitimate module makes that module suspicious as it cannot export the seed to indicate correct key generation.

Do folks agree that this is a bad (if not unique) property of the expanded private key formats?
Do we think there should be a stronger push for no support of expanded private key formats vs. private seeds given this attack model?

Best,
Sam

Simon Hoerder

Mar 25, 2025, 4:20:20 AM
to Samuel Lee, pqc-forum
Hi,

I believe you have the potential for this issue in any kind of key that contains a uniform random string. How do you distinguish a 128-bit AES key from a 128-bit AES ciphertext?

Going a step further: What guarantee do you have that your RNG doesn’t contain covert channels — your supposed TRNG could, in reality, be a hidden DRBG exfiltrating its initial state and reseed values to allow recovery of every key and nonce you ever generate.

The defense against such attacks is not cryptographic but organizational. You use certification schemes, open source SW or both to ensure that hopefully independent people have looked over the code and checked for naughty bits. That gives you an initial basis of trustworthiness. You use device binding, key wrapping and certificates to ensure that all keys are bound to something trusted once you’ve got an initial trust basis. And the more valuable your secrets are, the more you invest into reviewing crypto implementations yourself.

Best,
Simon

--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pqc-forum+...@list.nist.gov.
To view this discussion visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/783377e2-68d8-4668-9428-08cb14b47d84n%40list.nist.gov.

Samuel Lee (ENS/Crypto)

Mar 25, 2025, 12:48:50 PM
to Simon Hoerder, pqc-forum
Totally agree that having a random free parameter which can be used as a covert channel is not a unique property of ML-KEM / ML-DSA public keys. Folks have demonstrated similar attacks on RSA public moduli and ECDSA signature generation in the past. You do ultimately need to trust any code which generates private keys you will own.
I also found after posting this that there is a paper for malicious keygen in Kyber from 2022 with a different kleptographic construction: https://eprint.iacr.org/2022/1681.pdf


What ML-KEM / ML-DSA do have in their construction is an obvious way to prove that key generation was done according to the standard, and that is to export the initial private seed.
Given this private seed, an independent module can verify the public key is the same.
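A partial sketch of that check, using only the first line of each key generation algorithm (line 1 of Algorithm 13 of FIPS 203 and line 1 of Algorithm 6 of FIPS 204). It confirms only that the public ρ is the one the exported seed produces. A real verifier would rerun the whole of key generation and compare the complete public key. The function names are illustrative, not from any library.

```python
import hashlib

# Module ranks per parameter set (k for ML-KEM, (k, l) for ML-DSA).
ML_KEM_K = {"ML-KEM-512": 2, "ML-KEM-768": 3, "ML-KEM-1024": 4}
ML_DSA_KL = {"ML-DSA-44": (4, 4), "ML-DSA-65": (6, 5), "ML-DSA-87": (8, 7)}


def ml_kem_rho_from_seed(d: bytes, param_set: str) -> bytes:
    # (rho, sigma) <- G(d || k) with G = SHA3-512; rho is the first half.
    k = ML_KEM_K[param_set]
    return hashlib.sha3_512(d + bytes([k])).digest()[:32]


def ml_dsa_rho_from_seed(xi: bytes, param_set: str) -> bytes:
    # (rho, rho', K) <- H(xi || k || l, 128) with H = SHAKE256.
    k, l = ML_DSA_KL[param_set]
    return hashlib.shake_256(xi + bytes([k, l])).digest(128)[:32]


def ml_kem_rho_matches(d: bytes, ek: bytes, param_set: str) -> bool:
    # The encapsulation key ends with rho (its last 32 bytes).
    return ek[-32:] == ml_kem_rho_from_seed(d, param_set)


def ml_dsa_rho_matches(xi: bytes, pk: bytes, param_set: str) -> bool:
    # The ML-DSA public key starts with rho (its first 32 bytes).
    return pk[:32] == ml_dsa_rho_from_seed(xi, param_set)
```

A module that chose ρ by any means other than hashing the seed cannot produce a seed that passes this check without finding a preimage of the hash, which is why exporting the seed serves as evidence of correct key generation.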

Hence the question about removing support for the expanded private key format. It seems like we are in a strictly better position w.r.t. malicious keygen if we drop support for expanded private key formats in favour of only private-seed formats.


A malicious module with broken TRNG for generating the private seed is still a risk, but the attack is weaker.
Specifically, I do not see a construction where an attacker can observe a single public key generated from their module in the wild and compute the private key from it (unless the module only generates a finite number of pre-computed keys, which should be visible in black-box testing).


FWIW, SLH-DSA key generation is strictly worse in this attack model as the public key seed is generated randomly and independently from the private key seed, so there is no way to prove you actually chose it randomly.
I wonder whether there would be appetite for deriving the public key seed from the private key seed in a modified externally verifiable key generation function.
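To make the idea concrete, here is a purely hypothetical sketch of such a derivation. Nothing in it comes from FIPS 205: the use of SHAKE256, the domain-separation label, and the function names are all assumptions chosen for illustration. It only shows that once the public seed is a fixed function of the private seeds, an external party holding the private key can recompute it.

```python
import hashlib

# Hypothetical domain-separation label; not defined by any standard.
LABEL = b"hypothetical-slh-dsa-pk-seed"


def derive_pk_seed(sk_seed: bytes, sk_prf: bytes, n: int) -> bytes:
    # Derive the n-byte public seed from the two private seeds
    # instead of sampling it independently.
    return hashlib.shake_256(LABEL + sk_seed + sk_prf).digest(n)


def pk_seed_is_derived(pk_seed: bytes, sk_seed: bytes, sk_prf: bytes) -> bool:
    # External check: recompute the public seed and compare.
    return pk_seed == derive_pk_seed(sk_seed, sk_prf, len(pk_seed))
```

With a derivation along these lines, a public seed carrying an encrypted payload would fail `pk_seed_is_derived` unless the module could also find private seeds that hash to it.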

Best,
Sam


Francisco

Mar 25, 2025, 1:26:57 PM
to Samuel Lee (ENS/Crypto), Simon Hoerder, pqc-forum
If I'm reading this right, I don't see the need for dropping the expanded private key elements from the format. Ultimately, these are the values which identify a key at the lattice level and are used in module lattice equations. Sure, one could argue that (𝜌, 𝜌′, 𝐾) = H(seed || ...) is also an equation of the cryptosystem, but it is not really related to module lattices.

For instance, what if in the future there is a need to update the transformation from seed to expanded key but keep the rest intact? Having the expanded format avoids backcompat issues and captures the fact that such a change was not related to lattices at all.

--Francisco

Blumenthal, Uri - 0553 - MITLL

Mar 25, 2025, 2:03:05 PM
to Francisco, Samuel Lee (ENS/Crypto), pqc-forum

> If I'm reading this right, I don't see the need for dropping the expanded private key elements from the format.

I don’t see the need to keep in the format something that is unnecessary.

The simpler (and shorter/smaller) – the better.

Samuel Lee (ENS/Crypto)

Mar 25, 2025, 2:07:52 PM
to Francisco, Simon Hoerder, pqc-forum
To be clear, the problem with the expanded private key formats for ML-KEM and ML-DSA is that they do not contain the private seed which was used to generate the expanded private key, making it impossible to verify that they were generated with the correct key generation process.

While one could consider introducing a new expanded private key format which also contains the private seed to enable external verification of the key generation process, now you have an even more redundant expanded key format, and I agree with Uri, the shorter/smaller the better.


It seems unlikely that we would want to update the private seed -> keypair generation process in future, but if we do need to do this there's no compatibility benefit to keeping around old expanded private keys (with an unverifiable import expanded private key API) vs. keeping around old private seeds (with an import private key from old seed API).

Sam



Orie Steele

Mar 25, 2025, 2:26:49 PM
to Samuel Lee (ENS/Crypto), Francisco, Simon Hoerder, pqc-forum
I've not been able to follow the discussion of this in the IETF LAMPS WG, but my understanding is that they are preparing to enable private key expressions that contain the seed, the expanded key, or both.

I've prepared a similar PR for the COSE and JOSE key representations for ML-DSA:

https://github.com/cose-wg/draft-ietf-cose-dilithium/pull/16

I added some language about validation, which we will probably tune when the LAMPS draft is updated.

In particular, I added a comment about it being not recommended to have multiple private information class parameters for a given algorithm. 

Should I add a comment that says when an expanded private key is present, seed is recommended to be present, so that the expanded key can be confirmed?

Regards,

OS

D. J. Bernstein

Mar 25, 2025, 4:29:47 PM
to pqc-...@list.nist.gov
'Samuel Lee (ENS/Crypto)' via pqc-forum writes:
> While one could consider introducing a new expanded private key format
> which also contains the private seed to enable external verification
> of the key generation process, now you have an even more redundant
> expanded key format, and I agree with Uri, the shorter/smaller the
> better.

Let's call that format seed+expanded. The three choices compared here
are (1) seed+expanded; (2) expanded; (3) seed.

Quantitatively, post-quantum systems typically have kilobytes of data in
expanded private keys. It's plausible that some applications will care
about an improvement from kilobytes for expanded to 32 bytes for seed,
but it's not plausible that applications will care about the difference
between expanded and seed+expanded, adding 32 bytes on top of kilobytes.
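The arithmetic behind this comparison can be sketched as follows, assuming the expanded private key sizes from FIPS 203 and FIPS 204. Note one assumption that differs from the round figure above: the ML-KEM seed is counted as 64 bytes (d || z), while the ML-DSA seed is 32 bytes. The function names are illustrative.

```python
# Expanded private key sizes in bytes, per FIPS 203 and FIPS 204.
EXPANDED_BYTES = {
    "ML-KEM-512": 1632, "ML-KEM-768": 2400, "ML-KEM-1024": 3168,
    "ML-DSA-44": 2560, "ML-DSA-65": 4032, "ML-DSA-87": 4896,
}
# Seed sizes in bytes: d || z for ML-KEM, xi for ML-DSA.
SEED_BYTES = {"ML-KEM": 64, "ML-DSA": 32}


def size_report(param_set: str):
    # Return (seed, expanded, seed+expanded) sizes in bytes.
    seed = SEED_BYTES[param_set.rsplit("-", 1)[0]]
    expanded = EXPANDED_BYTES[param_set]
    return seed, expanded, seed + expanded


def relative_overhead(param_set: str) -> float:
    # Fraction by which seed+expanded is larger than expanded alone.
    seed, expanded, _ = size_report(param_set)
    return seed / expanded


if __name__ == "__main__":
    for name in EXPANDED_BYTES:
        seed, expanded, both = size_report(name)
        print(f"{name}: seed={seed} expanded={expanded} "
              f"seed+expanded={both} (+{relative_overhead(name):.1%})")
```

Under these figures, adding the seed to the expanded key costs under 4% for every parameter set, while going from expanded to seed alone shrinks the key by a factor of at least 25.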

Regarding "new": The expanded format was used consistently in Kyber v1,
Kyber v2, Kyber v3, the ML-KEM standard, the reference Kyber code, and
further Kyber/ML-KEM implementations derived from the reference code; so
there's certainly an ecosystem-complexity argument against deviating
from that format. But it's important to note that the same argument says

* "don't add seed+expanded as another ML-KEM format" _and_
* "don't add seed as another ML-KEM format".

I remain baffled by the recent push to add formats to the ML-KEM
ecosystem. From a software perspective, that's a self-inflicted wound,
so there would have to be a really good reason to do it---e.g., clear
evidence of applications that really need the smallness of a seed.

Anyway, the ease-of-testing argument favors seed+expanded over expanded,
and over seed. Compressibility also favors seed+expanded over expanded,
but one should think of compressibility as a "just in case we really
need it" benefit, since actually supporting compression makes testing
more difficult. See also

https://groups.google.com/a/list.nist.gov/g/pqc-forum/c/lMTIIPu9yDY/m/BeQXJi2qAAAJ

for further software-security advantages of expanded and seed+expanded
compared to seed.

As a data point illustrating the tradeoffs, Classic McEliece switched
its spec and all software from expanded to seed+expanded five years ago.
The engineering cost of switching formats was certainly a consideration,
but the benefits were clear and a complete switch was feasible. Today
https://mceliece.org lists many more implementations and applications of
Classic McEliece; this expansion of the ecosystem means that there are
correspondingly high benefits to stability of every detail, including
the private-key format.

---D. J. Bernstein

Samuel Lee (ENS/Crypto)

Mar 25, 2025, 5:29:37 PM
to D. J. Bernstein, pqc-...@list.nist.gov
If we're talking about specifying a new expanded ML-KEM / ML-DSA private key format which is seed+expanded, then we should expand the public matrix A in that format too!
I would not be opposed to a seed+expanded format which is useful, but the existing expanded format is not expanded in a way that makes it significantly cheaper to import. And I would suggest that most of the time at import, you would want to check the seed+expanded key is consistent anyway, so I do not see that there is a lot to gain.

So, IMO, the externally visible private key format for exchange between modules may as well be small and not have redundancy (i.e. just seed). Similarly, we do not standardize the way to serialize AES key schedules between modules.
Implementations that want to serialize expanded keys for their own operation and reuse can make module-specific tradeoffs for how much to expand or not.

Best,
Sam

bruno

Mar 31, 2025, 9:27:08 AM
to Simon Hoerder, Samuel Lee, pqc-forum

Hi, I think I agree that this is a new threat vector, and I also think that we rely on certification and the source code. I think this also applies to HSMs, in particular to an ML-DSA RoT.

This is likely also a supply chain attack risk discussion.

May I add to that point that there is a concern about the diversity of HSMs in the cloud, since it looks like one actor (namely Cavium/Marvell) is doing most of the cryptographic operations?

I apologize for sharing one of my old concerns, but maybe it is relevant here? I did not see any controls to mitigate that.

I am happy to be wrong.

Thanks

[1] https://www.marvell.com/company/newsroom/marvell-enables-enterprise-data-center-and-private-cloud-security-with-innovative-liquidsecurity-network-hsm.html

[2] https://cloud.google.com/kms/docs/attest-key

[3] https://docs.aws.amazon.com/cloudhsm/latest/userguide/cloudhsm_mgmt_util-getHSMInfo.html

[4] https://docs.microsoft.com/en-us/azure/key-vault/managed-hsm/overview

[5] https://futurumresearch.com/research-notes/marvell-industry-analyst-day-2020-marvell-is-securely-sailing-cloud-high-on-liquidsecurity-portfolio/

[6] https://www.marvell.com/company/newsroom/microsoft-integrates-marvel-nist-fips-140-3-level-3-compliant-liquidsecurity-hsms.html?utm_source=li&utm_medium=pr&utm_campaign=microsoft-ls2
