CryptAttackTester numbers for BIKE and HQC

D. J. Bernstein

Sep 10, 2024, 1:47:06 AM
to pqc-...@list.nist.gov
We have run CAT to obtain high-assurance predictions for the cost of
various non-quasi-cyclic ISD attacks against the following parameters:

* (24646,134), as in attacking BIKE-1 messages.
* (24646,142), as in attacking BIKE-1 keys.
* (35338,132), as in attacking HQC128.

In particular, CAT predicts the following costs (and CAT's search for
attack parameters did not find any lower costs for these attacks), all
measured in CAT's fully defined model of bit operations:

* (24646,134): 2^153.46 for isd2 with p'=2, p''=1, C=0.
* (24646,142): 2^161.22 for isd2 with p'=2, p''=1, C=0.
* (35338,132): 2^152.39 for isd2 with p'=2, p''=1, C=1.

These numbers differ somewhat from the estimates in previous literature.
The rest of this message analyzes the differences and explains how to
reproduce CAT's numbers.


1. Comparison to previous estimates

Qualitatively, one expects CAT's ISD numbers to be different from
pre-CAT estimates for the following reasons:

* CAT's numbers account for all steps in ISD, for all bit operations
in each step, and for various probability-analysis issues that are
usually neglected. In general, pre-CAT estimates should simply be
discarded once numbers from CAT are available for the same attacks.

* As noted above, CAT's numbers are for attacks without quasi-cyclic
speedups. The literature estimates that those speedups save nearly
14 bits for BIKE-1 key recovery, and between 6 and 9 bits in the
other cases.
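As a back-of-the-envelope aside (not CAT output), one can combine the CAT
costs listed above with the literature's estimated quasi-cyclic savings to see
where the resulting attack costs would land:

```python
# CAT costs (log2 scale, no quasi-cyclic speedups), from the list above:
cat = {"BIKE-1 messages": 153.46, "BIKE-1 keys": 161.22, "HQC128": 152.39}

# Literature's estimated quasi-cyclic savings in bits: nearly 14 bits for
# BIKE-1 key recovery, between 6 and 9 bits in the other cases.
savings = {"BIKE-1 messages": (6, 9), "BIKE-1 keys": (14, 14), "HQC128": (6, 9)}

for name, lg in cat.items():
    lo, hi = savings[name]
    # Larger saving => lower cost, so the range is [lg - hi, lg - lo].
    print(f"{name}: 2^{lg - hi:.2f} .. 2^{lg - lo:.2f}")
```

All of the resulting exponents are below 150, i.e., below the cost of attacks
against mceliece348864 messages.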

It would be interesting to include quasi-cyclic speedups in CAT,
obtaining high assurance for the result of those speedups. One source of
quasi-cyclic speedups is having multiple targets, but it would also be
interesting to investigate speedups from replacing matrix operations
with polynomial operations.

_If_ the actual quasi-cyclic speedups match the literature's estimates
then the resulting attacks against BIKE-1 and HQC128 are

* more expensive than AES-128 brute-force search, but
* less expensive than 2^150 bit operations, the cost of attacks
against mceliece348864 messages.

This last conclusion is contrary to claims in papers at PKC 2022 and
Eurocrypt 2022, namely that BIKE-1 and HQC128 had >10 bits of extra
security compared to mceliece348864 when quasi-cyclic speedups were
ignored (e.g., 2^153 for BIKE-1 vs. 2^141), and still 4 or 5 bits of
extra security considering quasi-cyclic speedups (e.g., 2^146 for BIKE-1
vs. 2^141), also affecting precise comparisons to AES. Here are some
reasons for discrepancies:

* Those papers underestimated the number of bit operations used in
sorting. (See Appendix J.9 of the CAT paper for details.) This
tilted the comparisons between ISD and AES.

* Those papers missed various linear-algebra speedups that CAT
includes. This tilted the ISD comparisons between different
parameter sets: the linear-algebra speedups have more impact when
p' is smaller and n is larger, the situation for BIKE and HQC.

The PKC 2022 paper said that "to keep the analysis and formulas
comprehensible" it avoids optimizing linear algebra. The paper did not
observe that this omission tilts the resulting comparisons; the paper
even seems to claim the contrary ("for more recent ISD algorithms the
complexity of the Gaussian elimination procedure does not dominate").
The Eurocrypt 2022 paper similarly claimed that "modern ISD" was
"enumeration-dominated" rather than "permutation-dominated".


2. Reproducing the CAT numbers for specific attack parameters

Here is how to reproduce the numbers 2^153.46, 2^161.22, 2^152.39 above
for specific attack parameters. First download and unpack CAT:

wget https://cat.cr.yp.to/cryptattacktester-20231020.tar.gz
tar -xf cryptattacktester-20231020.tar.gz
cd cryptattacktester-20231020

Then edit uniformmatrix.cpp as follows:

* After the line saying "while (q < N) { ++lgq; q *= 2; }", add
"bigint K = N/2; if (!selection_allows(S,"K",K.get_str())) continue;"

* Remove four lines from "bigint K = N-lgq*W;" through
"if (!selection_allows(S,"K",K.get_str())) continue;".

Then compile:

make -j8

Then run the following (this takes a few minutes):

./predictedcp N=24646 W=134 \
attack=isd2 PI=2 PIJ=1 CP=1 CS=0 FW=1 \
I=137438953472 RE=2097152 X=96 YX=44 L0=10 L1=34 \
D=90 Z=0 QU0=7 QF0=4 WI0=14 QU1=1 QF1=131072 WI1=1

./predictedcp N=24646 W=142 \
attack=isd2 PI=2 PIJ=1 CP=1 CS=0 FW=1 \
I=137438953472 RE=2097152 X=96 YX=44 L0=10 L1=34 \
D=90 Z=0 QU0=7 QF0=4 WI0=14 QU1=1 QF1=131072 WI1=1

./predictedcp N=35338 W=132 \
attack=isd2 PI=2 PIJ=1 CP=0 CS=1 FW=1 \
I=274877906944 RE=4194304 X=128 YX=46 L0=10 L1=36 \
D=85 Z=0 QU0=8 QF0=4 WI0=19 QU1=1 QF1=294912 WI1=1

The output is as follows:

predictedcp problem=uniformmatrix N=24646,K=12323,W=134 attack=isd2
I=137438953472,RE=2097152,X=96,YX=44,PIJ=1,PI=2,L0=10,L1=34,CP=1,CS=0,D=90,Z=0,QU0=7,QF0=4,WI0=14,QU1=1,QF1=131072,WI1=1,FW=1
lgratio [153.459967,153.459968]
lgratio2 [153.459967,153.459968]
cost 52628388475214842007709 lgcost [75.4782593,75.4782594]
prob [3.35093939e-24,3.3509394e-24] lgprob [-77.9817087,-77.9817086]
prob2 [3.35093939e-24,3.3509394e-24] lgprob2 [-77.9817087,-77.9817086]

predictedcp problem=uniformmatrix N=24646,K=12323,W=142 attack=isd2
I=137438953472,RE=2097152,X=96,YX=44,PIJ=1,PI=2,L0=10,L1=34,CP=1,CS=0,D=90,Z=0,QU0=7,QF0=4,WI0=14,QU1=1,QF1=131072,WI1=1,FW=1
lgratio [161.216353,161.216354]
lgratio2 [161.216353,161.216354]
cost 52628388475214842007709 lgcost [75.4782593,75.4782594]
prob [1.54975063e-26,1.54975064e-26] lgprob [-85.7380944,-85.7380943]
prob2 [1.54975063e-26,1.54975064e-26] lgprob2 [-85.7380944,-85.7380943]

predictedcp problem=uniformmatrix N=35338,K=17669,W=132 attack=isd2
I=274877906944,RE=4194304,X=128,YX=46,PIJ=1,PI=2,L0=10,L1=36,CP=0,CS=1,D=85,Z=0,QU0=8,QF0=4,WI0=19,QU1=1,QF1=294912,WI1=1,FW=1
lgratio [152.391459,152.39146]
lgratio2 [152.391459,152.39146]
cost 222803032906266248103186 lgcost [77.560115,77.5601151]
prob [2.97523076e-23,2.97523077e-23] lgprob [-74.8313447,-74.8313446]
prob2 [2.97523076e-23,2.97523077e-23] lgprob2 [-74.8313447,-74.8313446]

The "lgratio" entries show CAT's prediction for log_2(cost/probability).
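These lgratio values can be cross-checked against the printed cost and prob
fields, since lgratio is simply lgcost - lgprob. A quick check in Python,
using the first output above:

```python
from math import log2

# Values printed by predictedcp for (24646,134) above:
cost = 52628388475214842007709   # bit operations
prob = 3.35093939e-24            # success probability

lgratio = log2(cost) - log2(prob)   # log_2(cost/probability)
print(f"lgratio ~ {lgratio:.6f}")   # falls in [153.459967,153.459968]
```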


3. Reproducing the CAT search for attack parameters

To reproduce the search that found the attack parameters shown above,
start with the same steps to download CAT, edit uniformmatrix.cpp, etc.
Then edit the list of problems and attacks in isdpredict1.py as follows:

problems = (
'N=24646,W=134',
'N=24646,W=142',
'N=35338,W=132',
)

attacks = (
'attack=isd0,P=0,L=0,FW=1',
'attack=isd0,P=1,L=0,FW=1',
'attack=isd0,P=2,L=0,FW=1',
'attack=isd0,P=3,L=0,FW=1',
'attack=isd0,P=4,L=0,FW=1',
'attack=isd0,P=1,FW=1',
'attack=isd0,P=2,FW=1',
'attack=isd0,P=3,FW=1',
'attack=isd0,P=4,FW=1',
'attack=isd1,PI=1,FW=1',
'attack=isd1,PI=2,FW=1',
'attack=isd1,PI=3,FW=1',
'attack=isd1,PI=4,FW=1',
'attack=isd1,PI=5,FW=1',
'attack=isd1,PI=6,FW=1',
'attack=isd2,PI=2,PIJ=1,CP=1,CS=0,FW=1',
'attack=isd2,PI=4,PIJ=2,CP=1,CS=0,FW=1',
'attack=isd2,PI=6,PIJ=3,CP=1,CS=0,FW=1',
'attack=isd2,PI=8,PIJ=4,CP=1,CS=0,FW=1',
'attack=isd2,PI=10,PIJ=5,CP=1,CS=0,FW=1',
'attack=isd2,PI=12,PIJ=6,CP=1,CS=0,FW=1',
'attack=isd2,PI=2,PIJ=2,CP=1,CS=0,FW=1',
'attack=isd2,PI=4,PIJ=3,CP=1,CS=0,FW=1',
'attack=isd2,PI=6,PIJ=4,CP=1,CS=0,FW=1',
'attack=isd2,PI=8,PIJ=5,CP=1,CS=0,FW=1',
'attack=isd2,PI=10,PIJ=6,CP=1,CS=0,FW=1',
'attack=isd2,PI=2,PIJ=1,CP=0,CS=1,FW=1',
'attack=isd2,PI=4,PIJ=2,CP=0,CS=1,FW=1',
'attack=isd2,PI=6,PIJ=3,CP=0,CS=1,FW=1',
'attack=isd2,PI=8,PIJ=4,CP=0,CS=1,FW=1',
'attack=isd2,PI=10,PIJ=5,CP=0,CS=1,FW=1',
'attack=isd2,PI=12,PIJ=6,CP=0,CS=1,FW=1',
)

Then run the parameter search (this took 2487 minutes real time on a
dual EPYC 7742 with overclocking disabled):

./isdpredict1.py > isdpredict1.out
./isdpredict2.py < isdpredict1.out > isdpredict2.out
./isdpredict-table.py < isdpredict2.out > isdpredict.tex

Copies of the output files are available here:

https://cat.cr.yp.to/isdpredict1-bikehqc-20240908.out.gz
https://cat.cr.yp.to/isdpredict2-bikehqc-20240908.out.gz
https://cat.cr.yp.to/isdpredict-bikehqc-20240908.tex

The lowest costs found are marked in blue in isdpredict.tex, and full
details of the attack parameters are in isdpredict2.out.

---D. J. Bernstein (on behalf of the CryptAttackTester team)

Andre

Sep 16, 2024, 1:29:50 AM
to pqc-forum, D. J. Bernstein, pqc-...@list.nist.gov
Dear Dan and CryptAttackTester team,

Thank you very much for your efforts in designing this new tool and sharing those numbers.

It is great to see that such tremendous efforts are put into increasing our confidence in the
security estimates for those potential code-based PQC standards.

# Confirmation of Bit Security
The numbers you shared are a testament to our well-founded understanding of the security of BIKE and
HQC: they closely match the numbers obtained in the mentioned PKC 2022 and EC 2022 papers and also
coincide with the numbers of a more recent EC 2023 paper on the subject. The results of all those papers
are integrated into the CryptographicEstimators (CE) library, whose output for the mentioned parameter sets (n, k, w) is

* (24646, 12323, 134): 2^152.9 (CE) vs 2^153.46 (CAT)
* (24646, 12323, 142): 2^160.4 (CE) vs 2^161.22 (CAT)
* (35338, 17669, 132): 2^152.4 (CE) vs 2^152.39 (CAT)
(For comparability, these CE numbers disregard cyclicity speedups, as the CAT numbers do.)

Thus, even though there are subtle differences between the tools with respect to cost models, fine-tuning,
and exclusion of techniques, the end result of both tools is the same (to within less than one bit).
This is great news, and for the BIKE and HQC submissions it means that everything is still well on track.

# Mentioned Discrepancy
Since the numbers for BIKE and HQC closely match between CE and CAT, the mentioned discrepancy
of 10 bits between McEliece and BIKE / HQC (without cyclicity speedups), observed by previous estimates
but not by CAT, comes from a difference in the *McEliece estimate* between CE and CAT. This is explained
by the fact that the CAT cost model penalizes memory-intensive algorithms, while the RAM model used
by CE does not. Therefore, for McEliece parameters, the optimal algorithms found by CE and CAT differ
(contrary to BIKE / HQC instances, where both tools find the same algorithms optimal, mainly because
of the small error weight used by those schemes). If, on the other hand, we restrict CE to the version of the
algorithm that CAT finds optimal, the estimates coincide:

* (3488, 2720, 64): 2^140.8 (CE, unrestricted) vs 2^150.59 (CAT)
* (3488, 2720, 64): 2^151.4 (CE, restricted to Stern) vs 2^150.59 (CAT)

Thus, the discrepancy is not related to the estimates for BIKE and HQC, but rather to those for
McEliece. It boils down to the controversial discussion of which algorithms are optimal for McEliece
parameters. Alternatively, CryptographicEstimators also allows introducing a memory penalty;
with such a penalty, the numbers coincide even without an algorithm restriction:

* (3488, 2720, 64): 2^150.2 (CE, cube-root memory penalty) vs 2^150.59 (CAT)

# Regeneration of Numbers
The numbers shared above (labeled CE) can be regenerated using the web-interface or by installing the 
CryptographicEstimators library locally.

* Web-interface
The web-interface is available at https://estimators.crypto.tii.ae/configuration?id=SDEstimator .
To recompute the numbers for a given parameter set, provide n, k, and w (e.g., n=3488, k=2720, w=64
in the McEliece case) in the respective input fields. Additionally, you might want to set the "Decimal precision",
located under the collapsible "Estimators parameters", to "1". Then click the "ESTIMATE" button
to launch the computation. Once finished, a table with the results will be displayed. To regenerate the result
using the memory penalty, set the "Memory access cost", also found under "Estimators parameters", to
"Cube root" and click "ESTIMATE" again.

* Local computation
To install the library locally, or to use the prepared Docker container, follow the information
provided at https://github.com/Crypto-TII/CryptographicEstimators . Once installed, the following
commands can be run in the Sage / Python environment to recompute the above numbers:

from cryptographic_estimators.SDEstimator import SDEstimator
SDEstimator(24646, 12323, 134).table()
SDEstimator(24646, 12323, 142).table()
SDEstimator(35338, 17669, 132).table()
SDEstimator(3488, 2720, 64).table()
SDEstimator(3488, 2720, 64, memory_access=3).table()

The computations take approximately 10-20 minutes in the web-interface or when executed locally on
a standard laptop.

Best regards,
Andre (on behalf of the CryptographicEstimators team)

D. J. Bernstein

Sep 16, 2024, 11:41:48 AM
to pqc-...@list.nist.gov
Andre writes:
> Thus, even though there are subtle differences between the tools with
> respect to cost models, fine tuning and exclusion of techniques, the
> end result of both tools is the same (up to a tolerance of less than
> one bit).

The bottom-line numbers are within 1 bit for these cases, but that's
because the pre-CAT estimates have two larger errors that in these cases
happen to nearly cancel (6 bits one way, 7 bits the other; see below).
Other cases sometimes have one error being much larger than the other
(see example below).

This isn't a subtle difference. The pre-CAT estimates should be thrown
away. Confidence in CAT's numbers comes from the assurance mechanisms
built into CAT (especially the testing of complete algorithms against
predictions in the same cost metric), not from the accidental alignment
that occurs in some cases between CAT's numbers and pre-CAT estimates.

Here are the two errors mentioned above:

(1) These pre-CAT estimates miss speedups in linear algebra, even
skipping speedups explained in earlier literature.

(2) These pre-CAT estimates count the number of inputs and outputs
for collision-finding rather than counting the actual gates used
for that computation.

#1 is an algorithm-optimization error that increases estimates. #2 is an
algorithm-analysis error that decreases estimates.

To disentangle the effects of #1 and #2, it helps to carry out not just
a correct analysis of a fully defined algorithm designed to minimize
costs, but also a correct analysis of a fully defined algorithm that
_doesn't_ optimize linear algebra. Both parts of this are already done
by CAT's ISD-parameter-search script.

Example: For (24646,12323,134), the aforementioned 2^153.46 is the
result of the script looking for attack parameters to minimize costs,
but the script also prints the result of leaving out various
linear-algebra speedups, namely 2^159.51. This is a 6-bit difference,
contradicting the suggestions at PKC 2022 and Eurocrypt 2022 that linear
algebra doesn't matter.

To see these 2^153.46 and 2^159.51 numbers and to see the attack
parameters for double-checks, first download the file

https://cat.cr.yp.to/isdpredict2-bikehqc-20240908.out.gz

(or reproduce that from CAT as explained earlier) and look for the line
"lgratio [153.459967,153.459968]" with the following attack parameters:

I=137438953472,RE=2097152,X=96,YX=44,PIJ=1,PI=2,L0=10,L1=34,CP=1,CS=0,D=90,Z=0,QU0=7,QF0=4,WI0=14,QU1=1,QF1=131072,WI1=1,FW=1

Then look for a line "lgratio [159.507991,159.507992]" differing only in
having "I=65536,RE=1,X=1,YX=1".

As one can check from CAT's parameter definitions for this attack, the
RE=1 leading to 2^159.51 says that there's a full new matrix reduction
in every iteration. The 2^153.46 instead uses random walks (larger RE)
and, on each step, works with limited portions of the matrix (specified
by X and YX). See the CAT paper for more information.

How should one compare these CAT results to the pre-CAT estimate of
2^152.9? The pre-CAT estimate isn't coming from a fully defined
algorithm, but it's assuming a full matrix reduction in each iteration
(adding "_gaussian_elimination_complexity" for each iteration), so it
should be compared to CAT's 2^159.51. The gap between 2^152.9 and
2^159.51 comes from error #2.

Overall this means that the 2^152.9 is a 7-bit underestimate (error #2)
of the cost of an algorithm that's missing 6 bits of speedups (error
#1). Correcting _both_ errors gives CAT's 2^153.46. Yes, that's within 1
bit of 2^152.9, but it's not that these are very close analyses of very
close algorithms.
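The near-cancellation can be written out explicitly. This is just arithmetic
on the numbers already given above, not new data:

```python
# CAT results for (24646,12323,134):
optimized   = 153.46   # full linear-algebra speedups
unoptimized = 159.51   # full matrix reduction per iteration (I=65536,RE=1,X=1,YX=1)
pre_cat     = 152.9    # CE estimate, also assuming a full reduction per iteration

error1 = unoptimized - optimized   # missed linear-algebra speedups (error #1)
error2 = unoptimized - pre_cat     # undercounted collision-finding gates (error #2)
net    = optimized - pre_cat       # what remains after the errors nearly cancel

print(f"error #1: {error1:.2f} bits; error #2: {error2:.2f} bits; net: {net:.2f} bits")
```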

Let's look more closely at the algorithm analysis. CAT fully defines all
algorithms that it considers---whether that's a simple brute-force
search against AES-128, linear algebra in ISD, or the rest of ISD---all
the way down through a simple set of constant-size gates (in short: cost
1 for any bit operation on <=2 input bits). Any way to improve the
attacks to use fewer of these gates can be expressed and tested in the
CAT framework. CAT tests cost predictions against attack simulations for
a range of input sizes, so it'll catch a typical "oops, forgot to count
that step!" algorithm-analysis error, whereas pre-CAT estimates won't.

Until we have quantum computers, all of our computations are in fact
done with bit operations. This includes abstractions such as "RAM": a
real RAM bank is implemented on top of many physical gates, all of which
are occupied by each RAM lookup. See, e.g., Shimeng Yu, "Semiconductor
Memory Devices and Circuits", Section 1.3.

Structurally, CAT makes sure that all of these bit operations are
counted. The subroutines in CAT are then designed to reduce bit
operations, reflecting what serious attackers will do. For example, a
naive series of lookups in a RAM bank would have quadratic cost (number
of lookups times number of RAM entries), but it's easy to see that for
tasks such as collision-finding one can do much better. As the first
step, splitting a RAM bank into two smaller banks allows two parallel
accesses to the two banks for about the same total cost, after a minor
cost of sorting the two accesses. For finding collisions among many
items, sorting ends up as the main bottleneck, and there's a long
literature on algorithms to sort many inputs efficiently in hardware,
such as Batcher sorting. CAT accounts for all of this, fully specifies
algorithms all the way down through bit operations, analyzes the
algorithm cost in detail, and tests the analyses.

For comparison, the pre-CAT estimate under discussion doesn't do any of
this work. It simply declares that the cost of finding collisions in two
L-entry lists is

2 * L + L ** 2 // 2 ** l

(to quote "_list_merge_complexity"). This is counting the number of
inputs plus an average number of output collisions. That's nowhere near
the number of gates for a circuit to actually _find_ collisions.
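To get a feel for the scale of the gap, here is a small sketch comparing that
input/output count with a gate count for a sorting-based collision search. The
list size L, window width l, and entry width b below are illustrative choices,
not parameters from the attacks above, and the comparator count uses the
standard figure for Batcher odd-even merge sort, roughly (n/4)(lg n)(lg n - 1)
comparators for n items:

```python
from math import log2

def list_merge_estimate(L, l):
    # The pre-CAT formula quoted above: inputs plus expected collisions.
    return 2 * L + L**2 // 2**l

def batcher_comparators(n):
    # Comparator count for Batcher odd-even merge sort, n a power of 2:
    # approximately (n/4)*lg(n)*(lg(n) - 1) + n - 1.
    m = log2(n)
    return (n / 4) * m * (m - 1) + n - 1

L, l, b = 2**34, 34, 64   # illustrative: two 2^34-entry lists of 64-bit values
estimate = list_merge_estimate(L, l)
gates = b * batcher_comparators(2 * L)  # sort both lists; ~b gates per comparator

print(f"pre-CAT estimate: 2^{log2(estimate):.1f}")
print(f"sorting gates:    2^{log2(gates):.1f}")
```

Even under these favorable assumptions the gate count is more than 10 bits
above the input/output count, before accounting for any of the other steps.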

This algorithm-analysis error has nothing to do with any controversy
regarding how to account for the cost of long-distance communication,
which can make a much larger difference in exponents but would require
looking beyond gates. CAT explicitly avoids this: it simply counts
gates, and does so consistently across all attacks.

If one switches from (24646,12323,134) to, e.g., (3488,2720,64), then
CAT and the pre-CAT estimates both favor much larger list sizes. This
makes error #1 smaller---linear algebra is much less noticeable---while
making error #2 larger. This removes the coincidental near-cancellation
between the errors: the pre-CAT estimates end up with totals about 10
bits below actual gate counts.

---D. J. Bernstein (speaking for myself)