Leo Ducas writes, in email dated 1 Nov 2021 04:02:37 -0700:
> I am afraid that your counter-counter-analysis is not sufficient to
> dismiss the counter-analysis we brought.
Presumably the wording here is referring to the following publications:
* P0, "analysis": my 20 Aug 2021 talk on S-unit attacks.
* P1, "counter-analysis": the 24 Aug 2021 07:45:56 -0700 message
from Dr. Ducas and Dr. Pellet-Mary, 511 words (centerpiece:
claiming that the attacks have "*ridiculously* small" success
probability, dropping exponentially with #S).
* P2, "counter-counter-analysis": my 23 Oct 2021 21:55:07 +0200
message, 1510 words.
* P3, "counter-counter-counter-analysis"?: this 1 Nov 2021 04:02:37
-0700 message from Dr. Ducas, 267 words.
The core problem is that P1 applies standard lattice heuristics to
S-unit lattices without ever bothering to check whether S-unit lattices
are exceptions to the heuristics. The conclusions of P1 are
qualitatively and quantitatively wrong. An erratum for P1 is required.
Application of the same heuristics had already prevented a 2019 paper,
PHS19, from seeing most of the power of S-unit attacks. The research
project that led to P0 discovered, inter alia, that S-unit lattices have
a special analytic structure that PHS19 had missed. P0 announced this,
and full details now appear in
https://cr.yp.to/papers.html#spherical.
There are other important things that PHS19 missed about S-unit attacks,
but this is the most fundamental.
It's understandable that PHS19 missed this. Overconfident application of
the same heuristics was already standard practice in the literature on
lattice-based cryptography, not something new in PHS19. But it's less
understandable to see P1 continuing to mis-model S-unit attacks _after_
P0 announced that S-unit lattices have special analytic features.
P2, published shortly after
https://cr.yp.to/papers.html#spherical, is
structured as point-by-point responses to P1, identifying the points of
agreement (e.g., "It's correct that _if_ one applies standard lattice
heuristics to S-unit lattices _then_ one has to conclude that increasing
S beyond PHS19 makes S-unit attacks exponentially less effective") and
the points of dispute (e.g., P1 says "With ... standard heuristic, we
infer that the proposed algorithm is no better than existing literature
[PHS19]"; P2 says "
https://cr.yp.to/papers.html#spherical collects
extensive evidence that _for S-unit lattices_ the standard heuristics
are highly inaccurate").
The quickest way to see that P1 is wrong is to consider the case K=\Q,
with the primes in S chosen to be exactly the primes <=y, and with
simple reduction using the logs (and minus logs) of those primes, as in
Section 6 of
https://cr.yp.to/papers.html#spherical:
* These S-unit attacks succeed with probability 1 at finding a
shortest nonzero vector in the input ideal.
* P1 instead says "*ridiculously* small" probability for large #S
(i.e., large y), dropping exponentially with #S. That's wrong.
Of course K=\Q can be handled efficiently in other ways, but this has no
relevance to P1's errors. The idea that one can somehow ignore the K=\Q
counterexample by waving vaguely at class groups is debunked by Section
7 of the same paper: for _each_ number field K, as y->infty, with the
finite part of S chosen as the prime ideals of norm <=y, with simple
reduction modulo the y shortest S-units (the tie-breaking mechanism
doesn't matter), the success probability of finding a shortest nonzero
vector in the input ideal converges to 1. This again directly
contradicts P1's claim of "*ridiculously* small" probability.
Note that finding the _shortest_ nonzero vector is overkill for solving
_approximate_ Ideal-SVP, the problem that matters for cryptography. This
ends up letting the attacker take smaller S. There's much more material
in
https://cr.yp.to/papers.html#spherical _quantifying_ how wrong the
standard heuristics are. But taking y->infty is the easiest way to see
how thoroughly P1 botched the analysis of S-unit attacks.
Does P3 address these glaring contradictions between P1 and reality? No.
I think it's fair to summarize P3 as merely (1) briefly spelling out the
thought process that led to P1 and then (2) briefly claiming to be
disputing something, without pinpointing the text under dispute.
I'll now go point by point through P3, again identifying the points of
agreement and the points of dispute, and showing how various errors in
P3 were already debunked by
https://cr.yp.to/papers.html#spherical.
Errata are now required for both P1 and P3.
> Indeed, the following fact trivially
> generalize to arbitrary x over the sphere, and only random y:
> >> in dimension d, if x,y are uniform on spheres of radii a, b respectively
> >> (with a > b), then the probability that y reduces x (i.e., ||x-y|| <
> >> ||x||) is (1-(b/2a)^2)^{Θ(d)}.
As a preliminary note on the order of Theta quantifiers, I presume the
intention here was to fix a and b as d goes to infinity. Without such
restrictions, the statement is incorrect: for example, if b/a goes
rapidly enough to 0 as d increases then (1-(b/2a)^2)^Theta(d) = 1-o(1).
See Theorems 3.13 and 3.18 of
https://cr.yp.to/papers.html#spherical for
statements that allow a and b (and, more to the point, the ratio b/a) to
vary with d. This is useful for the following analysis.
Qualitatively, (1-(b/2a)^2)^{Θ(d)} is close enough for understanding the
relevant phase change in reduction effectiveness in spherical models as
the distances change. I'll refer back to this below, and a small
numerical check appears after this list:
* If the distance ratio b/a is 1/2 then the probability drops
exponentially with the dimension d. (See the upper bound in Theorem
3.18.)
* If the distance ratio b/a is 1/sqrt(d) then the probability is
>=1/poly(d) as d grows. (See the lower bound in Theorem 3.18.)
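As a sanity check on this phase change, here's a small Monte Carlo
sketch (my own illustration, not taken from P1, P3, or the paper). It
fixes x = e_1 (taking a = 1 by scaling), samples y uniformly on the
sphere of radius b, and estimates the probability that y reduces x, for
the two ratios above:

    import numpy as np

    def reduction_probability(d, ratio, trials=10**6, seed=0):
        # Pr[||x-y|| < ||x||] for fixed x of length a and y uniform on
        # the sphere of radius b = ratio*a. By scaling and rotational
        # symmetry take a = 1 and x = e_1; then ||x-y||^2 < ||x||^2
        # exactly when u_1 > b/2, where u_1 is the first coordinate of
        # a uniform random point on the unit sphere.
        rng = np.random.default_rng(seed)
        g1 = rng.standard_normal(trials)
        rest = rng.chisquare(d - 1, trials)  # squared length of the other d-1 Gaussian coordinates
        u1 = g1 / np.sqrt(g1 * g1 + rest)
        return np.mean(u1 > ratio / 2)

    for d in [16, 64, 256, 1024]:
        print(d,
              reduction_probability(d, 0.5),        # b/a = 1/2: drops exponentially with d
              reduction_probability(d, d ** -0.5))  # b/a = 1/sqrt(d): stays near 0.31

The first column collapses toward 0 as d grows; the second stays near a
constant, matching the phase change above.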
Regarding the trivial generalization: Yes, rotational symmetry means
that the probabilities work for one uniform random sphere point and one
fixed sphere point. See, e.g., Theorem 3.13, where mu is a random
variable while nu is fixed.
> This means, that the set S = {Log(u/v)} can be arbitrary, and only the
> target t = Log(g) needs to have a random direction.
Given the context (see "uniform" above), I _presume_ that "random" here
is intended to mean specifically "uniform random". Please avoid
ambiguous terminology.
"Can be" and "needs" indicate that this sentence isn't claiming that
Log(g) _does_ have a uniform random direction, but is instead saying
that P1+P3's desired conclusion (in a nutshell, that S-unit attacks for
large S don't work) _would follow from_ Log(g) having a uniform random
direction.
The reason that it's important to clarify the epistemological status is
that, in fact, the target _doesn't_ have a uniform random direction.
It's not even close, with regard to the relevant metrics. See below.
Anyway, I agree that the statements above regarding distance ratios work
when one of the two directions is uniform random; again, see Theorem
3.13 of
https://cr.yp.to/papers.html#spherical. However, P3's emphasis
on this point is simply tweaking an irrelevant detail of how the
"*ridiculously* small" analysis was stated in P1.
> The lower-bound on the output length of the iterative slicer stands by
> union bound.
Quantification is missing here. If the claimed "lower-bound" is anything
like P1 then it's predicting that the vector length found by S-unit
attacks becomes longer and longer as y increases; but that's simply
wrong. See Sections 6 and 7 of
https://cr.yp.to/papers.html#spherical.
> The spherical model for lattices is only necessary to prove
> upper-bound on this output. In other word: the lattice being
> non-spherical can only makes this particular algorithm worse, not better.
This conclusion is disproven by Z^d, where the shorter-than-predicted
vectors also make reduction work better than predicted; see Section 4
of
https://cr.yp.to/papers.html#spherical. The paper, with more work,
also shows that S-unit lattices have shorter vectors than predicted and
that S-unit attacks are much more effective than predicted.
> So, for the lower bound we brought up, the randomness of the target
> suffices.
Regarding "suffices", see above regarding "needs".
> Because the algorithm fails to reach the desired length for
> random spherical t, it also fails in the worst case.
Here P3 appears to be claiming that the actual log vectors _are_
distributed like "random spherical t" (whereas earlier P3 was merely
saying that P3 _wants_ them to be distributed that way).
I see no justification for this claim anywhere in P3. Let's look at how
one might attempt to justify this claim, and then look at how
https://cr.yp.to/papers.html#spherical already debunks the claim.
A spherical model of the S-unit lattice predicts more than just vector
lengths: it also says that the vectors are pointing in uniform random
directions. This is rotationally symmetric, so _if_ one uses a spherical
model _then_ for a reduction analysis one can freely apply a uniform
random rotation to the actual target vector being reduced, obtaining a
modeled target vector pointing in a uniform random direction.
What happens if one recognizes that this model of the S-unit lattice has
been debunked, and replaces it with the actual S-unit lattice? Suddenly
there are the actual lattice vectors, the logs of S-units, pointing in
their actual directions. This isn't rotationally symmetric. This breaks
the equivalence between
* the actual target vector and
* a uniform random rotation of that vector.
What, then, is the justification for the "[uniform] random spherical t"
hypothesis supposed to be? Forget that it came from spherical models of
lattices, and treat it as a free-floating axiom?
The lack of justification doesn't by itself imply that the hypothesis
is wrong. Let's move on to what the paper says about this hypothesis.
To review the main paper contents first: There are eight core analyses
in
https://cr.yp.to/papers.html#spherical, addressing four types of
lattices (Z^d as a warmup, and then all three directions in the space of
S-unit lattices) times two questions of
(1) vector lengths and
(2) reduction effectiveness,
in each case showing that the standard heuristics are very wrong.
#1 doesn't care about the directions. #2 does care, and the paper has
various comments about this. It's easiest to see the flaw in the "random
spherical t" hypothesis by considering a simplified version of #2.
This brings me to Section 7.3 of the paper. Section 7.3 (like various
other parts of the paper) considers the actual S-unit lattice, not a
model of it. Section 7.3 explicitly addresses a simplification of the
question of reduction effectiveness, namely how close the point being
reduced is to the S-unit lattice, without regard to the cost of finding
a closest lattice vector, but draws an important distinction between
"two different probability
spaces for the random point". The two specified probability spaces are
two different models of a log vector, where
* the second model accounts for ord_P being an integer and
* the first model fails to do so.
Section 7.3 concludes that the second model has bounded distance from
the S-unit lattice as y->infty, while the first one doesn't. The
analysis of the second model applies to the actual log vectors of any
distribution of elements of K^*, so the first model is simply wrong.
Section 7.3 closes with a paragraph noting that the problems with the
first model apply to spherical models: "For comparison, a spherical
model is invariant under rotation, so the point being reduced is
equivalent to a uniformly distributed point nu on a sphere. ... This
leads to the incorrect conclusion that almost all reduced points
are farther and farther from the S-unit lattice as y->infty. This
conclusion relies critically on the inaccurate model of Log v as having
essentially arbitrary real coordinates, a model that fails to account
for the number-theoretic structure of Log K^*."
This is a 58-page paper. I noted in a separate pqc-forum message that it
was fine if the authors of P1 needed time before issuing an erratum. But
what we're seeing here is P3 attempting to defend P1 by adopting a
"random spherical t" model that's specifically debunked in the paper. P3
doesn't dispute what the paper says here; it simply ignores it.
> And that's what we are typically after when we talk about ideal-SVP.
The target under discussion here is indeed worst-case Ideal-SVP.
> A relevant average-case is also defined in [BDPW20] if you care.
As a historical note, Cohen's 1993 textbook on computational algebraic
number theory is already perfectly clear regarding the relevant
directions of randomization. The applicable number-theoretic notions are
older than this (and older than Arakelov's papers, btw). GRH-based
proofs have some value but are weaker than what experiments show to
work.
> At a higher point of view, it seems that your mistake is a confusion
> between SVP and CVP; or purely geometrically, between the minimal
> distance and the covering radius.
This attack fails to pinpoint any text that it claims to be disputing.
> You brought up Z^n as a valuable
> example. Yes, Z^n has a minimal distance of 1. However, it's covering
> radius is still Θ(sqrt(n)).
As Section 4 of the paper says for Z^d, "iterating the reduction process
reduces any lattice point to 0, and reduces a general vector to the box
[−1/2,1/2]^d, with length at most d^{1/2}/2".
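To make this concrete, here's a small sketch of that reduction process
(my own illustration; "simple reduction" here means greedily adding or
subtracting standard basis vectors while that shortens the target):

    import numpy as np

    def simple_reduce_Zd(t):
        # Greedily subtract sign(t_i)*e_i while some |t_i| > 1/2; each
        # such step shortens t, and the loop ends with t in the box
        # [-1/2,1/2]^d, hence with length at most sqrt(d)/2.
        t = np.array(t, dtype=float)
        while np.any(np.abs(t) > 0.5):
            i = np.argmax(np.abs(t))
            t[i] -= np.sign(t[i])
        return t

    d = 100
    rng = np.random.default_rng(1)
    print(np.linalg.norm(simple_reduce_Zd(rng.integers(-10, 10, d))))  # 0.0: lattice points reach 0
    v = rng.standard_normal(d)
    t = 10 * np.sqrt(d) * v / np.linalg.norm(v)  # generic target of length 10*sqrt(d)
    print(np.linalg.norm(simple_reduce_Zd(t)))   # at most sqrt(d)/2 = 5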
Let's now review how to quantitatively and qualitatively botch the
analysis of reduction for Z^d. See below for the analogy to how P1 and
P3 have botched the analysis of S-unit attacks.
First step: Apply standard heuristics to predict that the minimum length
of nonzero vectors is (1+o(1))(d/C)^(1/2) as d goes to infinity, where C
is the constant 2 pi e.
[This prediction is highly inaccurate. The actual minimum length is 1.]
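To put numbers on the gap (my own quick computation): the heuristic
prediction (d/C)^(1/2) with C = 2 pi e, versus the actual minimum
length 1:

    import math

    C = 2 * math.pi * math.e
    for d in [64, 256, 1024, 4096]:
        print(d, math.sqrt(d / C))  # predicted minimum length; the actual minimum length is 1

The prediction is already off by a factor of nearly 2 at d = 64 and by
a factor of more than 15 at d = 4096.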
Second step: Use these predictions of short-vector lengths to draw
conclusions regarding how effectively those vectors reduce a uniform
random target of a particular length.
Specifically, take a vector v of length (1+o(1))(d/C)^(1/2). What's the
chance that v successfully reduces a uniform random target of length
(10+o(1))d^(1/2)? Answer: "*ridiculously* small", namely exp(-Theta(d)),
since the distance ratio b/a is constant+o(1).
How about a target of length Theta(d^(3/4))? This time the distance
ratio b/a is Theta(d^(-1/4)), so the reduction probability is
exp(-Theta(d^(1/2))). To obtain a high reduction probability one would
need exp(Theta(d^(1/2))) vectors of this (1+o(1))(d/C)^(1/2) length.
Even with that many vectors, reduction wouldn't get much _below_ length
Theta(d^(3/4)); it wouldn't have a >=1/d^O(1) chance of reaching
Theta(d^(0.74)), for example. [At least formally, care is required here
regarding how the distribution changes across multiple reduction steps.
It isn't necessary to get into this for the dispute at hand.]
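For concreteness, here's the arithmetic behind these two plug-ins into
the (1-(b/2a)^2)^{Θ(d)} formula quoted above (a worked sketch, with
constants inside Theta suppressed as in P3):

    b/a = (d/C)^(1/2) / (10 d^(1/2)) = 1/(10 C^(1/2)), so
    (1-(b/2a)^2)^Theta(d) = (1 - 1/(400 C))^Theta(d) = exp(-Theta(d));

    b/a = (d/C)^(1/2) / Theta(d^(3/4)) = Theta(d^(-1/4)), so
    (1-(b/2a)^2)^Theta(d) = (1 - Theta(d^(-1/2)))^Theta(d) = exp(-Theta(d^(1/2))).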
To summarize, the heuristics predict that, with exp(Theta(d^(1/2)))
vectors, reduction will obtain length at best roughly Theta(d^(3/4)).
[This prediction is also highly inaccurate. Simple reduction using the d
standard basis vectors and their negatives produces an output of length
at most d^(1/2)/2.]
Third step: Model the actual target vector as pointing in a uniform
random direction---which, again, is effectively guaranteed if one uses a
spherical model of the lattice. Based on this, translate the above
predictions regarding reduction of a uniform random target into
predictions regarding reduction of the actual target.
[This prediction degrades accuracy even more in contexts where the
actual target vector is, for example, a lattice vector: a lattice vector
is reduced to a length that's bounded as d increases, namely length 0.]
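Here's a toy numerical version of this failure (my own sketch, reusing
the greedy Z^d reduction from above): reduce an actual lattice target,
then reduce a uniformly rotated copy of the same target.

    import numpy as np

    def simple_reduce_Zd(t):
        # Same greedy reduction by the vectors +-e_i as in the earlier sketch.
        t = np.array(t, dtype=float)
        while np.any(np.abs(t) > 0.5):
            i = np.argmax(np.abs(t))
            t[i] -= np.sign(t[i])
        return t

    d = 100
    rng = np.random.default_rng(2)
    target = rng.integers(-20, 20, d).astype(float)  # an actual lattice vector

    # A uniform random rotation of target is a uniform random point on
    # the sphere of radius ||target||; sample one directly.
    v = rng.standard_normal(d)
    rotated = np.linalg.norm(target) * v / np.linalg.norm(v)

    print(np.linalg.norm(simple_reduce_Zd(target)))   # 0.0: bounded as d grows
    print(np.linalg.norm(simple_reduce_Zd(rotated)))  # about sqrt(d/12): grows with d

The rotation changes the answer from 0 to something growing as d^(1/2);
modeling the actual target as uniformly rotated is exactly what throws
away the structure.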
The simplest way to see that this botched analysis of Z^d is analogous
to what we've seen for S-unit lattices is to observe that S-unit attacks
for any particular field produce outputs whose length is bounded as d
increases (see Section 7 of the paper), whereas P1 claims that the
success probability for large S is "*ridiculously* small", dropping
exponentially with d. This analogy focuses on what goes wrong in the
third step; with more work, one can also see that the first and second
steps are wrong for S-unit lattices. See the paper.
> The average distance to Z^n of a random point modulo Z^n is also
> Θ(sqrt(n)).
Yes, uniform random points are equivalent to worst-case points at this
level of detail.
> In fact, the constant hidden in the Θ is even worse for
> Z^n than it is for random lattice.
Given the systematic suppression of constants earlier in P3, it's
puzzling to see P3 highlighting this particular constant. Anyway, the
errors under dispute are asymptotically much larger than any constant.
> I hope this clarifies things.
Looking forward to the errata for P1 and P3.
---Dan