Input Checking in PQC


-

Dec 8, 2025, 11:40:54 AM
to pqc-...@list.nist.gov

Dear all,

I would like to request clarification regarding the role and necessity of input checking in the recently published PQC standards.

Why is input checking considered so important? More specifically, in the absence of such checks, how can “strong unforgeability” be broken? I am also curious whether similar reasoning applies to ML-KEM. As far as I know, the submissions did not explicitly include such requirements (even though they claimed strong properties such as SUF-CMA and IND-CCA); these requirements appear to have been added later. In addition, I could not find input-checking requirements in the ISO standard for FrodoKEM. Is input checking considered a relative or implementation-dependent requirement, or is it in fact a strict necessity?

For reference, FIPS 204 states:

"3.6.2 Public-Key and Signature Length Checks
Algorithm 3, implementing verification for ML-DSA, and Algorithm 5, implementing verification for HashMLDSA, specify the length of the signature 𝜎 and the public key 𝑝𝑘 in terms of the parameters described in
Table 1. If an implementation of ML-DSA can accept inputs for 𝜎 or 𝑝𝑘 of any other length, it shall return
false whenever the lengths of either of these inputs differ from their lengths specified in this standard.
Failing to check the length of 𝑝𝑘 or 𝜎 may interfere with the security properties that ML-DSA is designed
to have, like strong unforgeability."

FIPS 203 states that:

"Input checking. The algorithms ML-KEM.Encaps and ML-KEM.Decaps require input checking.
Implementers shall ensure that ML-KEM.Encaps and ML-KEM.Decaps are only executed on
inputs that have been checked, as described in Section 7"

NIST.SP.800-227 says that:

"Input checking. The correct and secure operation of cryptographic operations depends
crucially on the validity of the provided inputs. Even relatively benign faults, such as accepting an input that is too long or too short, can have serious security consequences.
KEM implementations need to perform input checking in an appropriate manner for all
KEM algorithms (i.e., KeyGen, Encaps, and Decaps). The exact form of the required input
checking is described in the FIPS or SP that specifies the relevant KEM."


Best.

Sophie Schmieg

Dec 8, 2025, 2:54:28 PM
to -, pqc-...@list.nist.gov
Most of the input checks are more necessary for interoperability than for security. For ML-KEM, the biggest question, other than the size of inputs, is how to handle integers out of range (i.e. >= 3329). There are three options an implementation can choose:
  • Reduce them mod 3329, quietly correct the public key for hashing purposes
  • Reduce them mod 3329, but do not change the public key bytes for hashing purposes
  • Fail
All three options have pros and cons, but these are relatively minor; having consistent behavior across implementations is much more important than any of the pros and cons I'm aware of.
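The three options above can be sketched roughly as follows. This is only an illustration, not code from any ML-KEM implementation: the function names `decode12`, `encode12`, and `parse_coefficients` and the policy strings are made up for this example, and the input is assumed to be just the 12-bit packed coefficient bytes of the key (without the matrix seed). The modulus check described in Section 7 of FIPS 203 appears to correspond to the third option, so that is the default here.

```python
Q = 3329  # ML-KEM modulus


def decode12(data: bytes) -> list[int]:
    """Unpack 12-bit little-endian values (two per three bytes), no reduction."""
    if len(data) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    coeffs = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        coeffs.append(b0 | ((b1 & 0x0F) << 8))
        coeffs.append((b1 >> 4) | (b2 << 4))
    return coeffs


def encode12(coeffs: list[int]) -> bytes:
    """Inverse of decode12 for an even number of values below 4096."""
    out = bytearray()
    for i in range(0, len(coeffs), 2):
        a, b = coeffs[i], coeffs[i + 1]
        out += bytes([a & 0xFF, (a >> 8) | ((b & 0x0F) << 4), b >> 4])
    return bytes(out)


def parse_coefficients(data: bytes, policy: str = "reject"):
    """Return (coefficients, bytes to feed into the hash) under a hypothetical policy."""
    raw = decode12(data)
    if policy == "reject":
        # Option 3: fail on any value >= Q.
        if any(c >= Q for c in raw):
            raise ValueError("coefficient out of range")
        return raw, data
    reduced = [c % Q for c in raw]
    if policy == "reduce_and_rewrite":
        # Option 1: reduce, and hash the corrected (canonical) encoding.
        return reduced, encode12(reduced)
    if policy == "reduce_keep_bytes":
        # Option 2: reduce, but hash the bytes exactly as received.
        return reduced, data
    raise ValueError("unknown policy")
```

Two implementations that pick different policies agree on the coefficients in the first two cases but can disagree on the hashed bytes, and therefore on the derived shared secret, which is the interoperability concern.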
For ML-DSA, the public key, due to compression, is always in the correct range, and the same goes for most components of the signature, so you don't quite have the same problem, save for length or invalid hints. But you can very easily violate strong unforgeability if you allow for signatures that are too long, assuming that an implementation just ignores the superfluous bytes: given a valid signature s, the string s || r, with r arbitrary, would be another valid signature of the same message, violating strong unforgeability.
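A minimal sketch of the s || r problem, under loud assumptions: `toy_core_verify` is a stand-in that is NOT ML-DSA and is not even a secure signature (anyone can compute it); it exists only so that the difference between a lenient and a strict length policy can be run. The lengths are the ML-DSA-44 public-key and signature sizes; all function names are hypothetical.

```python
import hashlib

PK_LEN = 1312   # ML-DSA-44 public key length in bytes
SIG_LEN = 2420  # ML-DSA-44 signature length in bytes


def toy_core_verify(pk: bytes, msg: bytes, sig: bytes) -> bool:
    """Stand-in for the real verification routine (NOT ML-DSA, not secure)."""
    return sig == hashlib.shake_256(pk + msg).digest(SIG_LEN)


def lenient_verify(pk: bytes, msg: bytes, sig: bytes) -> bool:
    """Silently ignores superfluous bytes, so s || r verifies for any r."""
    return toy_core_verify(pk, msg, sig[:SIG_LEN])


def strict_verify(pk: bytes, msg: bytes, sig: bytes) -> bool:
    """Returns False unless both lengths match exactly before verifying."""
    if len(pk) != PK_LEN or len(sig) != SIG_LEN:
        return False
    return toy_core_verify(pk, msg, sig)
```

With a valid pair (msg, s), the lenient verifier also accepts (msg, s + b"anything"), which is a new signature on an already-signed message; the strict verifier rejects it.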

For both ML-KEM and ML-DSA, accepting inputs that are too short is ill-defined, with the missing bytes potentially being read from random memory regions, which could then leak via the implementation. If the missing bytes are instead filled with zeroes, you could get other issues. In particular, for both ML-KEM and ML-DSA, the public key consists of a matrix (A, given in the form of a seed) and a vector (t), which relate to each other as t = As + e. If the public key only has the seed of the matrix, but substitutes zero for the vector t, you would know that 0, 0 is a valid choice for s and e, as 0 is usually considered fairly small :). I'm not aware of any issues with ML-KEM with inputs that are too long, except that a bunch of binding properties go out of the window.
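The zero-vector observation can be checked numerically with a toy model. As a simplifying assumption, the sketch below uses plain integers mod 3329 in place of the polynomial ring elements the real schemes use, and a small random matrix in place of one expanded from a seed; `mat_vec` and `lwe_public_vector` are hypothetical helper names.

```python
import random

Q = 3329  # modulus used for this toy example
K = 3     # toy dimension


def mat_vec(A, s):
    """Matrix-vector product mod Q."""
    return [sum(a * x for a, x in zip(row, s)) % Q for row in A]


def lwe_public_vector(A, s, e):
    """Compute t = A*s + e mod Q."""
    return [(u + v) % Q for u, v in zip(mat_vec(A, s), e)]


rng = random.Random(0)  # fixed seed so the example is reproducible
A = [[rng.randrange(Q) for _ in range(K)] for _ in range(K)]
zero = [0] * K

# A truncated key whose t part was zero-filled has the trivial secret s = e = 0,
# regardless of what the matrix A is.
t = lwe_public_vector(A, zero, zero)
```

Here `t` is the all-zero vector for every choice of `A`, so anyone who sees a zero-padded key knows a valid (and maximally small) secret for it.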




--

Sophie Schmieg |
 Information Security Engineer | ISE Crypto | ssch...@google.com
