Dear all,
We’ve gotten a lot of questions from submitters of lattice schemes, regarding what parameters we think meet security strength category 1. The short answer is that we aren’t sure yet – we think there are still scientific questions that need to be resolved. We're interested in having the best possible estimates of the security of each scheme; but we also realize that, to compare different schemes, we have to evaluate them using a common methodology. In round 3, we are aiming to find the right balance between these two goals. As we’ve asked submitters to be ready with tweaks by October 1, and we are aware that several of the teams are working towards better understanding the true costs of lattice reductions, we think it would be good to share our preliminary thoughts at this time.
At this point, we think the submitters are better able than us to precisely quantify the best known attacks along the various dimensions of number of bit operations (gates), total memory used, number of reads and writes, and memory access patterns (local vs nonlocal). (Note here that we expect classical attacks to be the deciding factor for whether any of the Round 3 candidates meets category 1.) One place where we feel we might be able to add a bit to the discussion is how to combine these numbers into a single cost estimate.
There have been various cost metrics in the literature for doing this. The most common metrics we’ve seen are
These models have the virtue of being fairly simple, but based on fundamental physical considerations, there are various aspects of each that suggest they may overestimate or underestimate real world costs of some attacks. An alternative approach, given that a category 1 classical attack is “only” something like 6 to 10 orders of magnitude more expensive than what a nation state adversary might be capable of, is to extrapolate from current technology.
We can start out by assigning a dollar cost to NIST’s estimate that brute force AES key search requires 2^143 gates. We’re pretty sure any attack on that scale would need to use custom hardware, like what’s used in bitcoin mining. By doing a little comparison shopping for bitcoin mining equipment, we estimated that the cost of 2^143 bit operations was about 2^64 US$. Most of this went towards paying for energy. The amount of energy involved is roughly what could be collected over the course of 500 years if the earth’s entire surface was tiled with solar panels.
For $2^64 / 500 years, we estimated that something like 2^95-2^100 bits of memory could be maintained. Most of this cost was the cost of periodically replacing hardware, although the cost of power, e.g. assuming the technology in question was hard drive, was sometimes within an order of magnitude.
Given that heat dissipation considerations and the scale of the attack suggest that computing units making random access queries would be distributed around the globe, we found the limiting factor on memory access would probably be long distance communication, e.g. via fiber optics. The cost was somewhat difficult to estimate, so we had a wide range of estimates regarding how much memory bandwidth could be afforded, but we think it’s fairly safe to say that it’s somewhere in the range of 2^105-2^120 bits worth of random access memory queries, with hardware costs exceeding energy costs by a couple of orders of magnitude. Sources of uncertainty included how much installation costs could be amortized away, how long various components last, and whether we should estimate based on market prices for commercial products (which typically transmit less than 100Gbps on a single fiber optic) or based on wildly guessing the cost, if mass produced, of experimentally demonstrated hardware that can transmit 10s of Tbps on a single core fiber optic cable or 100s of Tbps on a multicore fiber optic cable. Note, at the high end, local memory access speeds start mattering.
This left us fairly confident that, for borderline category 1 attacks where memory costs make up a significant portion of the cost of an attack, the RAM model would almost always underestimate the cost, the time*area model would almost always overestimate the cost, and the time*volume may either be an underestimate or an overestimate, depending on the details. For reference, the RAM model suggests no limit on memory size or memory bandwidth below 2^143 bits for category 1, and the time* area model suggests a limit of 2^95 bits of accesses to a memory of size 2^95 bit or 2^105 accesses to a memory of size 2^76. The time*volume model suggests a limit of 2^107 accesses to a memory of size 2^107 or 2^110 accesses to a memory of size 2^100. Additionally, we expect both time*area and time*volume to overestimate the cost of memory access when the memory size per processing unit is large or the memory is infrequently accessed.
We can also consider the latency cost of random memory access on a global scale. Two random points on earth are about 10000 km apart. Sending a request and a reply 10000 km would take at least about 2^-4 seconds. This means that for random access memory, in 500 years we could compute to a RAM depth of up to 2^38. Let’s say we model depth by treating RAM access as a cubic nearest neighbor circuit. Assuming a memory size of 2^99 cubic nearest neighbor model would model the depth of the request and response as having depth 2^34, yielding a total depth of 2^72.
Overall, we’re open to arguments that a given attack requiring fewer than 2^143 gates when costed in the RAM model might not contradict category 1 security if for example, the depth of the computation (assuming a 3-d nearest neighbor arrangement) exceeds 2^75, or the number of bits written or read from random access memory exceeds 2^120, or the total size of the memory exceeds 2^100. We would of course need to be convinced that the attack was well optimized given the above criteria, e.g. that the attack really requires random access to memory as opposed to a more local memory access pattern. We’re also not sure how these limits should be extrapolated to the higher security strength categories. A rough guess is that everything other than depth should scale linearly in the exponent, but we’re even less sure about that than everything we’ve previously said.
Attached are slides from a recent internal presentation where we discussed these issues. (Labeling has been added to slide 1, references have been added at the end, a commercial disclaimer has been added to slide 16, and a typo in an equation on slide 19 has been corrected. The slides are otherwise identical.)
We hope to be able to get a better understanding of the concrete analysis of lattice attacks, so we can determine how much the real world costs of these attacks are increased by memory considerations. Any comments on our thoughts thus far would be greatly appreciated.
Thanks,
NIST PQC team
Hi Watson,
> I'd hate to see NIST enshrine a signature scheme prematurely
I agree. And I would like to see PQ Signature Finalists to come up with new
Level 1 parameters with ~128 bits classical security.
But it would benefit the community to have a PQ signature scheme at the end of
Round 3 imo.
Below are some issues with your suggestion:
> TLS worked with a RSA in a KEM like way in TLS 1.2, TLS 1.1, etc. It can
> work that way in the future,
PQ KEMs cannot serve in place of RSAEncrypt for PKI certs and cert chains. The
verifier will not be able to decapsulate the ciphertest in the cert.
> there has been discussion of what is necessary to do this because of the
> problems with signature schemes already. We at $DAYJOB have deployed
> mechanisms that can use offline signatures on short lived KEM if for some
> reason certificates cannot be made with KEMs (but again, RSA certs were a
> thing). Furthermore
I am not sure every TLS usecase will want to pick potential mechanisms you
have in mind.
Also, I am not sure about the details of these mechanisms, but on a similar
note https://eprint.iacr.org/2020/534 proposes KEMTLS with static and
ephemeral KEM keys. KEMTLS introduces changes to TLS which can definitely not
be deployed trivially, but even then the cert chain still uses PQ signatures.
Additionally, let's not forget there are other long lived usecases that use
signatures and need PQ ones. One example is signed BIOS that lives without
being upgraded and would rather see PQ signatures standardized sooner than
later.
I think I understand your point better now. You were not saying that NIST should not standardize a PQ Signature scheme for TLS. You were suggesting that NIST should not feel pressure to standardize an insecure PQ Signature scheme with fast signing for TLS because we can introduce KEMs for auth instead (similar to what draft-ietf-tls-semistatic-dh does for Diffie-Hellman).
I don’t think PQ KEMs in TLS are straightforward which I explain more below, but regardless of that I would rephrase your point to “NIST should not feel pressure to standardize any insecure scheme”.
> A new signature scheme is not trivial to deploy either, although slightly easier. We've seen substantial changes to TLS over the past 5 years get deployed. Why must TLS always use online signatures when KEMs could be faster? It doesn't!
I would argue that introducing PQ KEMs in the end entity cert is not as simple as with Diffie-Hellman in draft-ietf-tls-semistatic-dh. PQ KEMs in the leaf cert PK would automatically mean an extra round-trip (for the peer to get the PK before encapsulating something), an altered state machine and potentially a different security analysis for the protocol; all with doubtful benefit. The last paragraph of [1] Section VII-C talks about this briefly. Additionally, changes like the ones in [2] with KEMTLS are even more drastic imo.
I am pretty sure no one wants to jump into standardizing a drastically different or slower TLS handshake that uses KEMs for auth unless we definitely have to. [1] shows that there are a couple of options from the NIST Round 3 Signature candidates that could do satisfactorily in TLS (certs + SCTs + OCSP) (assuming the algorithms are secure of course).
Anyway, we can have this discussion in the IETF TLS WG in due time, no need to address it here.
Dear all and NIST team,
On slide 18 of Ray’s talk (“Cost Models”), it is assumed that an attacker has a budget of $2^64 over a period of 500 years. To calculate the memory budget, this becomes an annual budget of $2^64/500 that leads to having access to up to 2^96.5 bits of HDD memory over the period of 500 years.
However, the analysis of some (if not most) attacks in the literature assumes access to the necessary hardware from the very start, when launching the attack (otherwise the time to carry out an attack successfully would need to take into the addition/upgrade of hardware for CPU and memory as time goes by).
If we fix the goal of breaking AES128 to 500 years, I estimate that one needs
2^128 * (12.7 * 10^-9 sec) / (500 years * 365 * 24 * 3600 sec/year) = 2^67.89 AES engines,
and this number of engines costs about 2^67.89 * 11590 GEs * $6.57*10^-9/GE = $2^54.21, assuming that an AES computation runs in 12.7nsec, occupies 11,590 gate equivalents (GEs) and that a GE costs 6.57*10^-3 microcents each [1]. Therefore, an attacker would need $2^54.21 to launch the attack ($2^64/500 per year was actually not a bad approximation if we assume that’s the starting budget at day 1).
Now let’s look at the cost of attacking some other scheme X with this initial budget ($2^54.21). For simplicity, let’s assume that for breaking X half of the money goes to get computing power and half of the money goes to memory resources (this of course depends on the scheme and the attack). In this case, we get at most
2^53.21 * 2^47 bits / $400 = 2^91.6 bits at launch time (compare to the 2^96.5 bits on slide 18). So there might be (at least) a 5-bit difference here, and this appears to affect all the memory estimates on slide 18.
OTOH, (back to the AES analysis) if we look at the energy costs for the 500-year run, I get
2^128 * 2.02*10^-6 kW *(12.7 * 10^-9 sec)/(3600 sec/hour) * 0.10 cents/kWh = $2^67.7, which exceeds the total budget of $2^64. This doesn’t include other costs that NIST takes into account in the analysis (e.g., the cost of replacing HDD hardware every 5 years, and the energy cost of the memory units).
How is NIST calculating the energy cost for AES128 exactly? It’s somewhat unclear by just looking at slide 17.
(*) The 2.02mW figure for AES was provided by Rei Ueno, based on their implementation [2].
Apparently, some differences in all these calculations can be due to the numbers we use for AES (actual AES results for area, timing and energy from the implementation in [2] versus NIST’s AES gate complexity of 2^143).
Overall, I don’t see huge differences between NIST’s numbers and our analysis ([1] presents an analysis of this nature focused on SIKE). But maybe the relatively small differences are still something to think about, given that only a few bits might make the difference between a parameter set matching a particular security level or not. For some schemes slight differences in the assumptions can change the results of the analysis (this is especially true for attacks that require a large shared memory and whose costs rely heavily on the CPU:memory cost ratio and the budget that are assumed).
Any comments?
Best,
Patrick
[1] P. Longa, W. Wang and J. Szefer. “The Cost to Break SIKE: A Comparative Hardware-Based Analysis with AES and SHA-3”, 2020. https://eprint.iacr.org/2020/1457
[2] R. Ueno, N. Homma, S. Morioka, N. Miura, K. Matsuda, M. Nagata, S. Bhasin, Y. Mathieu, T. Graba, and J.-L. Danger. “High throughput/gate AES hardware architectures based on datapath compression”. IEEE Trans. Computers, 69(4):534–548, 2020.
--
You received this message because you are subscribed to the Google Groups "pqc-forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
pqc-forum+...@list.nist.gov.
To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/SA0PR09MB7193FB1D6B1D59091865C9549C5F0%40SA0PR09MB7193.namprd09.prod.outlook.com.
Hi Patrick.
Thanks for checking my calculations. Overall it seems like we’re getting fairly similar numbers. I’m less convinced that a bit or two really matters here, since what we’re actually worried about is technology a couple of orders of magnitude better than current technology – i.e. technology that might make breaking a category 1 scheme close to feasible. I think we can agree that no one is seriously worried that their keys are going to be cryptanalyzed after 500 years if someone just manages the simple task of blanketing the entire planet, land and sea, with solar panels. Given that we’re really worried about hypothetical future technology, it seems entirely plausible that the cost of computation and the cost of memory could fall by different amounts. Thus I see this exercise mostly as a sanity check to see whether an amount of memory, claimed to be prohibitive for the attacks relevant to a given security strength category, is actually prohibitive.
On specific points: The assumption that all hardware must be purchased in the first year of an attack lasting 500 years seems really weird to me. Also, as you said, the exact portion of the budget which should be allocated to memory as opposed to computation need not be 50%. This doesn’t make as big a difference as one might think, though, since the assumption I used to do my calculation was that hard drives would wear out and need to be replaced after 5 years. Additionally, re-running the numbers, I got something closer to 2^95.5 when I multiplied out the equation on slide 18, so it seems there was a typo there. You also asked about the calculation on slide 17: This was based on the specs for the bitcoin mining platform referenced earlier in the slide and the assumption that a “hash” was a double-SHA2 operation, which was assumed to be about 2^19 gates. This gives:
2^143 gates * 1 hash / 2^19 gates / ( 1.1x10^14 hash / s) * 3.25 kW * $0.1/(kW h) * (1 h / 3600s) = $2^63.92
Finally, I should note that while the small differences in our estimates for things like the energy consumption of a single AES computation on optimized hardware don’t make very much difference, what does make a sizable difference is the methodology for how one deals with the fact that category 1 type attacks are several orders of magnitude beyond both the budget and timeline of real world adversaries, at least with existing technology. For example, the approach in the eprint paper you cited: https://eprint.iacr.org/2020/1457 is to fix the adversary’s hardware budget at something basically on a human scale (a million to a billion dollars) while allowing attack timelines to stretch as long as they need to in order to compensate, even if that number turns out to be unfathomably large. For reference, the shortest attack contemplated by this paper was over 250,000,000 years long, or about 1000 times the amount of time anatomically modern humans have existed. The problem with doing things this way becomes evident when you look at the way the paper combines this model with extrapolations of technological progress. The paper extrapolates technological progress out 20 years, to 2040, or about a ten-millionth the length of the shortest attack contemplated by the paper. This brings the timeline for a billion dollar AES128 attack down from 8 billion years to 250 million years. Obviously the attacker would be better off waiting for technology to improve. (Also, if the adversary had invested the billion dollars in the stock market rather than rushing out to buy hardware, assuming historical rates of return, he would have something like a 4 billion dollar hardware budget in 2040 rather than 1 billion.) But, I see no reason to think 2040 would be an optimal time to start buying hardware either. Most likely, it would be optimal for the attacker to wait for something like 80-100 years if technological improvements and investment returns continue at the same exponential rate, or longer if they slow down, resulting in an attack that takes decades rather than millions of years. This can significantly change the comparative cost of memory-intensive attacks like the classical attacks on SIKE. The paper says SIKE 377 would take 2^7 times as long to attack as AES 128 with technology frozen in 2020, but only 2^4 times as long with technology frozen in 2040. If we blindly extrapolate these exponential trends to 2120, we’d find the time to attack AES with technology frozen at that point with a billion dollar budget would be something like 250 years, while the time to attack SIKE 377 would be something like 1 year. (Note: This is a back of the envelope calculation based on exponents rounded to the nearest whole number. I expect using more precise numbers from the paper will result in slightly different numbers, but I don’t expect it to change the overall conclusion that, barring cryptanalysis improvements, SIKE 377 is likely to be within the reach of an adversary with both a reasonable budget and timeline before AES128.)
Best,
Ray
Hi Ray,
Thanks for the reply.
> The assumption that all hardware must be purchased in the first year of an attack lasting 500 years seems really weird to me.
What I meant was that the hw available on day 1 should be the one used to estimate the time to run the attack. I agree with the idea of taking into account the cost of replacing hw that breaks as time goes by, but if you want to consider the possibility of “upgrades” (future faster hw, more memory, etc.) then the analysis gets much more complicated.
> Finally, I should note that while the small differences in our estimates for things like the energy consumption of a single AES computation
> on optimized hardware don’t make very much difference, what does make a sizable difference is the methodology for how one deals with
> the fact that category 1 type attacks are several orders of magnitude beyond both the budget and timeline of real world adversaries, at least
> with existing technology.
Yes, this is what I meant by “for some schemes slight differences in the assumptions can change the results of the analysis”. Given the divergence of results depending on the assumptions for a scenario that is purely hypothetical, the main question is then on how to decide what those assumptions are.
In our paper, we decided to try something that we think is (slightly) closer to reality than assuming an unrealistic budget with an unrealistic timeline. Basically, the convention was to use “somewhat manageable” budgets (we have evaluated from millions to up to trillions of dollars without too much change in the results) and then determine the time to break AES and SIKE, which gives unrealistic timelines as expected. Of course one can turn this around and decide to assume a reasonably-human timeline and then determine the cost of cryptanalysis.
> Obviously the attacker would be better off waiting for technology to improve…. If we blindly extrapolate these exponential trends to 2120, …
Of course. But then how is that relevant to determine the strength of AES and other algorithms today or during the next few decades?
It should be irrelevant if SIKEp377 is believed to be weaker than AES128 in year 2120 (is it reasonable to try to do a forecast that far in the future anyway?).
The larger issue boils down to the need of reasonably fair metrics for attacks that require very large shared memory. Should teams avoid more realistic models that use the cost of technology to analyze attacks or should they simplify the analysis with potentially imprecise theoretical costs (e.g. using a square root cost for memory accesses)? The risk is to penalize otherwise strong algorithms that have a strong cryptanalytic profile (whose best attacks are forced to use huge amounts of shared memory versus other schemes that fall to attacks that require negligible memory).
Best,
Patrick
Hi Patrick
I take it as axiomatic that an attack which begins in 100 years and ends in 101 is more relevant to practical security than one that begins in 20 years and ends 250 million years from now. I could see you arguing that what we’re really worried about are scenarios where an attack might finish even sooner than 101 years from now, and that’s fair. The only scenarios I can come up with involve a larger starting budget, hijacking computation resources without purchasing them, or cryptanalysis. I would expect the first two scenarios to behave more like my 101 year scenario than your 250 million year scenario in terms of the relative costs of attacking AES128 and SIKE377. Cryptanalysis is probably the biggest concern, but it’s hard to quantify the relative likelihood of a nontrivial attack on SIKE and AES.
Hi Ray,
It seems the discussion turns philosophical at this point 😊. The question for me is not about what scenarios we should worry about the most (because it’s clear that we assume level 1 is infeasible with current technological trends for several decades). From my viewpoint, it is more about determining (relative) security fairly, and for this goal it’s more relevant to evaluate strength as assessed today (or in the next few years), not in 100 years.
I suppose we agree to disagree.
> The larger issue boils down to the need of reasonably fair metrics for attacks that require very large shared memory. Should teams avoid
> more realistic models that use the cost of technology to analyze attacks or should they simplify the analysis with potentially imprecise
> theoretical costs (e.g. using a square root cost for memory accesses)? The risk is to penalize otherwise strong algorithms that have a strong
> cryptanalytic profile (whose best attacks are forced to use huge amounts of shared memory versus other schemes that fall to attacks that
> require negligible memory).
Do you have any additional comments on this aspect, beyond what NIST’s been saying on this forum?
All the best,
Patrick
To view this discussion on the web visit https://groups.google.com/a/list.nist.gov/d/msgid/pqc-forum/DM8PR09MB637497CBA6692241BF2152479CC00%40DM8PR09MB6374.namprd09.prod.outlook.com.