Edge-case test vectors for ML-DSA / Dilithium

900 views
Skip to first unread message

Guillaume Endignoux

unread,
Apr 5, 2024, 11:35:31 AM4/5/24
to pqc-forum
Dear forum,

I have published test vectors that exercise edge cases in ML-DSA signatures, available at: https://github.com/C2SP/wycheproof/pull/112

These tests cover both the round 3 proposal of Dilithium and the current draft of the FIPS 204 standard, and aim to cover the following cases:

- Baseline (signing a "Hello world" message).
- Keys and signatures of the wrong length.
- Signatures with bit flips.
- Signature hints with a non-canonical encoding:
  - hints that aren't sorted,
  - hints are not strictly sorted (i.e. the same hint is repeated),
  - too many hints (potentially causing a buffer overflow),
  - non-zero padding after the hints.
- Secret keys with the `s1` or `s2` vectors out of the `[-eta, eta]` range.
- Public key with the `t1` component set to zero (allowing trivial forgeries, but the verification algorithm should still accept signatures for this key).
- Boundary conditions in the signature rejection loop (aiming to detect incorrect comparisons):
  - `z_max` equals `gamma1 - beta - 1` or `gamma1 - beta`,
  - `r0_max` equals `gamma2 - beta - 1` or `gamma2 - beta`,
  - `h_ones` equals `omega` or `omega + 1`,
  - in the case of ML-DSA-44 (a.k.a. Dilithium 2), `|ct0|_max` equals `gamma2 - 1` or `gamma2` (*).
- A "large" number of SHAKE bytes and of SHAKE blocks generated in the `expand_a`, `expand_s`, `rej_ntt_poly` and `rej_bounded_poly` functions.
- Boundary conditions in arithmetic functions:
  - `power_2_round` function: when the remainder (found in `t0`) is equal to 4096 or -4095,
  - `decompose` (via `high_bits` or `low_bits`): when the condition `r_plus - r_0 = q - 1` happens.

(*) For the `|ct0|_max` condition, because there is a very low probability of obtaining a signature that exercises it by brute-force (see g/pqc-forum/c/eLaw4fgjaMg and g/pqc-forum/c/G8Zf0hC-uu0), I have used the following method, "cheating" on the key generation procedure to craft test vectors that pass the condition.

1. Starting with all-zeroes `s1` and `s2` vectors and an arbitrary matrix A, generate random changes in `s1[0]` so that the resulting `t0[0]` has a maximal L2 norm. Note that in comparison to g/pqc-forum/c/G8Zf0hC-uu0, I used the L2 norm rather than the L1 norm, because (central limit theorem) when tau is large the coefficients of `ct0` tend to follow a normal distribution which only depends on the mean (here zero) and the variance (i.e. L2 norm).

After that `s1 = [x, 0, 0, 0]`, `s2 = [0, 0, 0, 0]` and `t0 = [a, b, c, d]` where `a` has a large L2 norm. I repeated this procedure on 50 matrices and kept the best one (largest `L2(t0[0])`) to increase the probability of success.

2. Sign messages in a loop until we find one where `|ct0|_max` is close to `gamma2` and adjust `s2` to obtain the desired value of `|ct0|_max`. More precisely:
- Fail if `|ct0|_max` isn't between `gamma2 - tau*eta` and `gamma2 + tau*eta - 1`.
- In case of success, adjust the coefficients of `s2[0]` to obtain the desired `|ct0|_max` (both `gamma2 - 1` and `gamma2`). This works in the vast majority of cases because `c` contains `tau` ones / minus ones and `s2` can have coefficients within `[-eta, eta]`.
- Finish the signing procedure, accepting the message only if the other conditions succeed (on `z_max`, `r0_max` and `h_ones`), so that the test is useful.

Choosing `s2` after the fact works because it isn't a dependency of the public key (as long as the rounded value `t1` stays the same). Of course the generated key pair isn't honest as there is no known random seed to generate it, but it works for the purpose of functionally testing the signing procedure, as the keys are serialized correctly.

With all that, I needed to generate in the order of O(10^7) messages to find a match, or about 20 minutes of CPU time with some AVX2 optimizations (just a few minutes with several threads).
The main speedup compared to g/pqc-forum/c/G8Zf0hC-uu0 is that `|ct0|_max` doesn't need to fall exactly on the desired value, because `s2` is adjusted after the fact. With ML-DSA-44 parameters where tau=39 and eta=2, this gives a 155x speed-up. Some computations are also saved by not running the full signing procedure on each message (but a simplified one to test |ct0|_max upfront).

Best,
Guillaume

Guillaume Endignoux

unread,
Aug 14, 2024, 10:54:24 AM8/14/24
to pqc-forum, Guillaume Endignoux
I've created updated tests based on my reading of the changes advertised in Appendix D of https://nvlpubs.nist.gov/nistpubs/fips/nist.fips.204.pdf.
They should cover the "external" APIs, with the context parameter and without pre-hashing.

Of course, take these with a grain of salt given that as far as I'm aware no reference source code nor official test vectors have been published yet for the final standard.
Reply all
Reply to author
Forward
0 new messages