Hello,
I am afraid there are a few mistakes in
http://eprint.iacr.org/2016/296.pdf , section 4.1.
The average number of tries to find an n-bit vector whose coordinates are sampled according to independent Bernoulli distributions with parameter p=Theta(n^(-0.5)) is neither approximated (!) by 2^(n(1-1/(2 ln 2))+O(n^0.5)) nor upper bounded (!) by exp(n^0.5*ln^2(n)/2+O(n^0.5 ln n)).
For p=n^(-0.5), it is rather exp((2+o(1))n^0.75).
This can be checked with the following pari-gp program, which computes a close approximation of the wanted value (the sum starts at the k=0 term (1-p)^n, and the loop multiplies in the ratio of consecutive terms binomial(n,k)^2 p^k (1-p)^(n-k)):
n=10^6; p=n^(-0.5); pr=(1-p)^n; s=pr; for(k=1,n, pr*=((n-k+1.)/k)^2*p/(1-p); s+=pr); log(s)
The maximum of pr is reached at k around n^0.75, for a clear reason: the ratio of consecutive terms ((n-k)/k)^2*p/(1-p) drops below 1 once k exceeds roughly n*sqrt(p)=n^0.75. Standard approximation formulas then quickly give the claimed result.
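As a cross-check, here is a Python port of the gp computation above (variable names are mine); it evaluates log s = log sum_k binomial(n,k)^2 p^k (1-p)^(n-k) in log space and compares it to 2*n^0.75.

```python
import math

n = 10**6
p = n ** -0.5

def log_term(k):
    # log of binomial(n,k)^2 * p^k * (1-p)^(n-k), via lgamma to avoid overflow
    lc = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return 2 * lc + k * math.log(p) + (n - k) * math.log1p(-p)

logs = [log_term(k) for k in range(n + 1)]
m = max(logs)
k_max = logs.index(m)          # peak is near n^0.75, as claimed
log_s = m + math.log(sum(math.exp(l - m) for l in logs))
print(log_s / (2 * n ** 0.75))  # ratio approaches 1 as n grows
```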
I recall that :
- tails of binomials are not tails of Gaussians (!)
- tails of binomials can be simply computed with Stirling's formula: binomial(n,nx) is around 2^(nh(x)), with h the binary entropy function h(x)=-x*log2(x)-(1-x)*log2(1-x)
- sum(i=0,n*x,binomial(n,i))<=2^(nh(x)) for any x<=1/2, which can easily be proved from (x+(1-x))^n=1
- with high probability, the number of tries is only exp(O(n^.5*log n)). (!)
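The two binomial facts above can be checked numerically; this small Python sketch (parameter choices are mine) compares binomial(n,nx) to 2^(nh(x)) and verifies the tail bound:

```python
import math

def h(x):
    # binary entropy, in bits
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

n, x = 1000, 0.2
k = int(n * x)
# Stirling estimate: log2 binomial(n,nx) is n*h(x) up to O(log n)
log2_binom = math.log2(math.comb(n, k))
# tail bound: sum_{i<=nx} binomial(n,i) <= 2^(n*h(x)) for x <= 1/2
log2_tail = math.log2(sum(math.comb(n, i) for i in range(k + 1)))
print(log2_binom, n * h(x), log2_tail)
```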
For those confused by the current situation on LPN: the fastest known algorithms run in time min((1-p)^(-n), 2^(n/log(n/log(1-2p)))) (ignoring lower-order terms), reached respectively by brute-force search and by the original BKW.
Newer algorithms only decrease lower-order terms.
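Reading log(1-2p) by its magnitude, i.e. as log(1/(1-2p)), the two regimes can be illustrated numerically; this is a rough sketch with function names and parameters of my choosing, not an exact cost model:

```python
import math

def brute_exp(n, p):
    # log2 of (1-p)^(-n): expected brute-force cost (noise-free guess)
    return n * math.log2(1.0 / (1.0 - p))

def bkw_exp(n, p):
    # log2 of 2^(n/log(n/log(1/(1-2p)))): rough original-BKW cost
    return n / math.log2(n / math.log2(1.0 / (1.0 - 2.0 * p)))

n = 10**4
print(brute_exp(n, 0.125), bkw_exp(n, 0.125))  # constant p: BKW is cheaper
print(brute_exp(n, 0.001), bkw_exp(n, 0.001))  # very small p: brute force is cheaper
```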
When the number of samples is limited to linear in n, LPN is (almost exactly) the problem of decoding a random linear code.
When there are m=n^(1+Omega(1)) samples, we can use Lyubashevsky's reduction followed by BKW to run in time 2^(n/log(log m/log(1-2p))) - and in particular, in time 2^(O(n/log log n)) for constant p and polynomial m.
(All claims that "BKW can't be run because there are too few samples" are as wrong as they are widespread (!), both for LPN and for LWE.)
Paul.