Security through slow primitives


Gian-Carlo Pascutto

4 Apr 2007 02:08:55
Hi all,

finding a pre-image or second-preimage for a SHA-256 hash
that is truncated to 64-bits should take 2^64 SHA-256
applications.

Given that a modern PC can do about 2^20 per second, this
attack is infeasible. (I'm assuming an attacker no bigger
than a single individual with limited budget here)

However, due to space constraints we only transmit the
last 32 bits of the SHA-256 hash. The attack then needs only
2^32 operations, i.e. about 2^12 seconds at the above rate,
so it's done in about 1 hour of work.

Could we increase the security of this small hash by
slowing down the operation with which it is generated?

Say we redefine the hash as being the recursive application
of SHA-256, 65535 times. So H(x) = SHA(SHA(....SHA(x)...))

Am I correct that this will slow down the attackers by the
same factor 65535 and thus make the hash appear as strong as
a 48-bit hash? (Although obviously a bit slower to calculate,
and less resistant to a collision. I want resistance to
pre-imaging.)

If there's a flaw in the reasoning, what is it?
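A minimal sketch of the proposed construction, assuming Python's hashlib; the function name and the choice of keeping the last 4 bytes are illustrative:

```python
import hashlib

def slow_hash(msg: bytes, iterations: int = 65535) -> bytes:
    """Iterate SHA-256 `iterations` times, then keep only the last 32 bits."""
    digest = msg
    for _ in range(iterations):
        digest = hashlib.sha256(digest).digest()
    return digest[-4:]  # the transmitted 32-bit tag
```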

--
GCP

Ertugrul Soeylemez

4 Apr 2007 05:01:37
Gian-Carlo Pascutto <nat...@hotmail.com> (07-04-04 06:08:55):

> Say we redefine the hash as being the recursive application of
> SHA-256, 65535 times. So H(x) = SHA(SHA(....SHA(x)...))
>
> Am I correct that this will slow down the attackers by the same factor
> 65535 and thus make the hash appear as strong as a 48-bit hash?
> (Although obviously a bit slower to calculate, and less resistant to a
> collision. I want resistance to pre-imaging.)

Yes, theoretically it does. However, security is only increased
linearly. In other words, generating the hash is made harder by the
same factor for both the attacker _and_ the legitimate user. If you
have lots of CPU power available, this shouldn't be a problem. But read
on.

Let's define this iteratively instead of recursively. Let X_0 be the
message you want to hash, then the sequence would go

X_{i + 1} = SHA(X_i)

Up to generating X_1 there is no problem here. The problem starts with
X_2 and further iterations, since you restrict the set of input values
to GF(2^256). It's not known (or I don't know) whether

SHA: GF(2^256) -> GF(2^256)

is a bijection. If it's not, then this will add less than linear
strength, because you lose entropy after each iteration. I would not
consider this a major issue, but it's certainly something valuable to
know.

The next important thing to note is that you would need to double the
number of iterations to make the attacker's work harder. So I would not
suppose 65536 iterations to be enough, even for limited resources. If
security has more value than bandwidth, then you certainly want to use
more bits of the hash value.

As a side note, do not just truncate the hash value. Rather split it
into words of equal length and XOR them together.
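Such an XOR fold could be sketched as follows (a hypothetical helper, assuming a digest folded down to 4 bytes):

```python
def xor_fold(digest: bytes, out_len: int = 4) -> bytes:
    """Fold a digest into out_len bytes by XORing its equal-size words."""
    assert len(digest) % out_len == 0
    out = bytearray(out_len)
    for i, b in enumerate(digest):
        out[i % out_len] ^= b
    return bytes(out)
```

This way every output bit depends on several digest bits rather than on one fixed 32-bit slice.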


Regards,
Ertugrul Söylemez.


--
From the fact that this CGI program has been written in Haskell, it
follows naturally that this CGI program is perfectly secure.

g...@sjeng.org

4 Apr 2007 07:13:44
On 4 apr, 11:01, Ertugrul Soeylemez <do-not-spam...@ertes.de> wrote:

> Up to generating X_1 there is no problem here. The problem starts with
> X_2 and further iterations, since you restrict the set of input values
> to GF(2^256). It's not known (or I don't know) whether
>
> SHA: GF(2^256) -> GF(2^256)
>
> is a bijection. If it's not, then this will add less than linear
> strength, because you lose entropy after each iteration. I would not
> consider this a major issue, but it's certainly something valuable to
> know.

If the SHA operation would lose entropy for data <= 2^256, wouldn't
this lead to an attack on it in less than 2^256 operations?

> The next important thing to note is that you would need to double the
> number of iterations to make the attacker's work harder. So I would not
> suppose 65536 iterations to be enough, even for limited resources.

Can you explain?

I benchmarked about 1 000 000 SHA operations per second on a modern
PC. This means 2^32 is crackable in about 1 hour. An extra workfactor
of 65535 means it takes 2730 days. If the attacker gets a quad core
and a well optimized SHA implementation, maybe it can be reduced to
half a year, and a quarter of that if he's lucky in getting the right
value fast. But this is enough.

I'm scared of 'order of magnitude' mistakes and the process taking
5 minutes due to an oversight.
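A quick sanity check of that arithmetic (the 2730-day figure comes from rounding 2^32 operations to exactly one hour; the unrounded number is the same order of magnitude):

```python
# Rough work-factor check at the benchmarked rate of ~10^6 hashes/second
ops_per_sec = 1_000_000
base_seconds = 2**32 / ops_per_sec             # ~4295 s, i.e. "about 1 hour"
stretched_days = base_seconds * 65535 / 86400  # ~3258 days with stretching
```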

> As a side note, do not just truncate the hash value. Rather split it
> into words of equal length and XOR them together.

Again, can I ask why? I thought at first this was to make an
optimization of the SHA function for smaller lengths impossible,
but that can't help. You could optimize iteration 65535, but
65534 remains just as slow.

--
GCP

Message has been deleted

Greg Rose

4 Apr 2007 08:31:16
In article <XhHQh.101927$dn5.6...@phobos.telenet-ops.be>,

It actually doesn't make the search much longer at
all.

Start with a single value x, then start
calculating SHA(x), SHA(SHA(x)), etc. You expect
to evaluate SHA about 2^32 times (call the number
of evaluations "y") before you get the desired
32-bit output. Now start at x again, and count
y-65535 evaluations. The output of that is a
preimage according to your formulation. So it
takes about twice as long. (You can make it faster
by storing some intermediate results, so it really
isn't much slower at all.)

You can fix your formulation by making the SHA
evaluations "position dependent", say
H(x) = SHA(65535 || SHA(... SHA(1 || SHA(0 || x)) ...))
where "||" is concatenation. Now the slide
trick doesn't work any more.
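Greg's position-dependent fix might be sketched like this; the 4-byte big-endian encoding of the round index is an assumption, since the post doesn't pin one down:

```python
import hashlib

def positional_hash(msg: bytes, iterations: int = 65536) -> bytes:
    """Prefix each round with its index so shifted chains no longer line up."""
    digest = msg
    for i in range(iterations):
        digest = hashlib.sha256(i.to_bytes(4, "big") + digest).digest()
    return digest
```

Because round i always hashes `i || state`, an attacker can no longer reuse a long chain of plain SHA evaluations at an arbitrary offset.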

Greg.
--
Greg Rose
232B EC8F 44C6 C853 D68F E107 E6BF CD2F 1081 A37C
Qualcomm Australia: http://www.qualcomm.com.au

Ivar Rosquist

4 Apr 2007 08:31:27
On Wed, 04 Apr 2007 04:13:44 -0700, gcp wrote:

> I benchmarked about 1 000 000 SHA operations per second on a modern PC.

What is it exactly that you benchmarked? There is an obvious
dependency on the length of the input - computing the SHA-1 digest of a
1 GB file is not going to be instantaneous. A more useful metric would be
the number of CPU cycles per byte for a not-too-small file.

So, what do you mean by 1 000 000 SHA operations per second?

Greg Rose

4 Apr 2007 08:36:20
In article <57hireF...@mid.dfncis.de>,
Sebastian Gottschalk <se...@seppig.de> wrote:
>Assuming SHA-256 is a perfect hash, then the mapping GF(2^256)->GF(2^256)
>is random and the, by statistics, about 1/e of all output values don't
>occur. This means that the hash, as good as it might be on preserving
>entropy, will lose about 1.5 bits on entropy if you don't overload it
>appropriately.

It's actually only a fraction of a bit lost, not 1.5 bits.

>Further assuming that SHA-256 is still quite perfect and this loss is
>uniformly distributed, then you'll lose about 1.5 / 256 / (256 / output
>size) bits on entropy on every iteration over truncated outputs.
>
>For 65536 iterations, you'll lose all your entropy.

No, sorry, that's not true (otherwise finding
collisions would be very easy). The problem is
that, for the second round, there are less than
2^256 possible inputs, and for the third round
still less, and so on. So for each repeated
evaluation, you lose less entropy, until (assuming
SHA-256 behaves well) at around 2^128 possible
outputs, all of them hash to unique values, and
then you stick there (which is expected to happen
after about 2^128 rounds).
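The shrinking-image effect is easy to check empirically with a toy random function on a small domain (a model of an ideal hash, not SHA-256 itself):

```python
import random

random.seed(1)
n = 1 << 16  # model: a random function on a 16-bit domain
f = [random.randrange(n) for _ in range(n)]

sizes = []
image = set(range(n))
for _ in range(3):
    image = {f[x] for x in image}
    sizes.append(len(image))
# Round 1 keeps about (1 - 1/e) ~ 63% of all values;
# later rounds lose proportionally less, as described above.
```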

Joseph Ashwood

4 Apr 2007 08:12:46
<g...@sjeng.org> wrote in message
news:1175685224.1...@p77g2000hsh.googlegroups.com...

> On 4 apr, 11:01, Ertugrul Soeylemez <do-not-spam...@ertes.de> wrote:
>
>> Up to generating X_1 there is no problem here. The problem starts with
>> X_2 and further iterations, since you restrict the set of input values
>> to GF(2^256). It's not known (or I don't know) whether
>>
>> SHA: GF(2^256) -> GF(2^256)
>>
>> is a bijection. If it's not, then this will add less than linear
>> strength, because you lose entropy after each iteration. I would not
>> consider this a major issue, but it's certainly something valuable to
>> know.
>
> If the SHA operation would lose entropy for data <= 2^256, wouldn't
> this lead to an attack on it in less than 2^256 operations?

You mean like the birthday paradox attack that happens at 2^128? Yes it
does.

>> The next important thing to note is that you would need to double the
>> number of iterations to make the attacker's work harder. So I would not
>> suppose 65536 iterations to be enough, even for limited resources.
>
> Can you explain?
>
> I benchmarked about 1 000 000 SHA operations per second on a modern
> PC. This means 2^32 is crackable in about 1 hour. An extra workfactor
> of
> 65535 means it takes 2730 days.

Ertugrul was saying that if 65536 iterations take 2730 days, then reaching
5460 days would require 131072. Of course, 131072 iterations also take
twice as long as 65536 for the legitimate user, so everyone pays the same
2x penalty. Except that, as you observe next, the attacker can scale:
since the natural progression is towards multiple cores, the attacker may
actually pay less of a penalty. The per-core penalty remains the same,
but because the legitimate user is wasting cores and the attacker isn't,
the attacker pays less overall.

>> As a side note, do not just truncate the hash value. Rather split it
>> into words of equal length and XOR them together.
>
> Again, can I ask why?

As far as is truly known it is pure superstition, but if you're adding
instructions, it's another instruction.

My recommendation would be to use the construct I gave a while back:
IV = 0x000...000
for (iterations)
    IV = HMAC_SHA-256(IV, passphrase)  // HMAC(key, value)
output IV

This generates a significant number of operations of questionable
cryptographic value, but at the same time the operations won't hurt, and it
will not lose entropy (any more than a single SHA-256 operation would).
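A runnable sketch of that construct, assuming Python's hmac module and a 32-byte all-zero starting IV:

```python
import hashlib
import hmac

def hmac_stretch(passphrase: bytes, iterations: int = 65536) -> bytes:
    """Chain HMAC-SHA-256, feeding each output back in as the next key."""
    iv = bytes(32)  # all-zero initial value, as in the pseudocode
    for _ in range(iterations):
        iv = hmac.new(iv, passphrase, hashlib.sha256).digest()
    return iv
```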
Joe


Gian-Carlo Pascutto

4 Apr 2007 13:30:00

A SHA-256 hash on a 256-bit input (i.e. its own output).

Probably one can do it faster by optimizing the function for this case.
But I hope it should not make an order of magnitude difference.

Still it might be interesting to know how fast it can get (on general
purpose hardware).

--
GCP

Message has been deleted

Gian-Carlo Pascutto

4 Apr 2007 14:02:08
Joseph Ashwood wrote:
> <g...@sjeng.org> wrote in message
> news:1175685224.1...@p77g2000hsh.googlegroups.com...
>> On 4 apr, 11:01, Ertugrul Soeylemez <do-not-spam...@ertes.de> wrote:
>>
>> If the SHA operation would lose entropy for data <= 2^256, wouldn't
>> this lead to an attack on it in less than 2^256 operations?
>
> You mean like the birthday paradox attack that happens at 2^128? Yes it
> does.

I think the birthday attack also applies to a bijection.

> Ertugrul was saying that if 65536 iterations take 2730 days, then reaching
> 5460 days would require 131072. Of course, 131072 iterations also take
> twice as long as 65536 for the legitimate user, so everyone pays the same
> 2x penalty. Except that, as you observe next, the attacker can scale:
> since the natural progression is towards multiple cores, the attacker may
> actually pay less of a penalty. The per-core penalty remains the same,
> but because the legitimate user is wasting cores and the attacker isn't,
> the attacker pays less overall.

Valid point.

The original idea was to reverse the situation like you describe.
The receiver only has to perform 1 evaluation, so it being 65535
times slower is hardly noticeable.

For an attacker that has an attack that takes a lot of work, getting
65535 times more work is a bigger problem.

--
GCP

Gian-Carlo Pascutto

4 Apr 2007 14:22:48
Sebastian Gottschalk wrote:

> Stop thinking like a fool, think like an attacker.

"This information is not worth the price of an FPGA!"

:)

While I agree that designing the security parameters
to be resilient against the NSA would be a good thing,
fitting in 32-bits is a requirement.

What I proposed is the best I know possible.

--
GCP

Message has been deleted

Gian-Carlo Pascutto

4 Apr 2007 15:19:36
Greg Rose wrote:

> It actually doesn't make the search much longer at
> all.
>
> Start with a single value x, then start
> calculating SHA(x), SHA(SHA(x)), etc. You expect
> to evaluate SHA about 2^32 times (call the number
> of evaluations "y") before you get the desired
> 32-bit output. Now start at x again, and count
> y-65535 evaluations. The output of that is a
> preimage according to your formulation. So it
> takes about twice as long. (You can make it faster
> by storing some intermediate results, so it really
> isn't much slower at all.)
>
> You can fix your formulation by making the SHA
> evaluations "position dependent", say
> H(x) = SHA(65535 || SHA(... SHA(1 || SHA(0 || x)) ...))
> where "||" is concatenation. Now the slide
> trick doesn't work any more.

Thanks Greg. I'm happy I posed the question.

I would prefer to XOR with a position-dependent
constant, since this keeps the data size equal.
As I understand it, this would also work.
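That XOR-with-a-round-constant variant might look like this sketch (the counter encoding and names are assumptions for illustration):

```python
import hashlib

def xor_constant_hash(msg: bytes, iterations: int = 65536) -> bytes:
    """Mix a round counter into the state by XOR, keeping the 32-byte size."""
    digest = hashlib.sha256(msg).digest()
    for i in range(1, iterations):
        counter = i.to_bytes(32, "big")  # counter padded to the state size
        mixed = bytes(a ^ b for a, b in zip(digest, counter))
        digest = hashlib.sha256(mixed).digest()
    return digest
```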

Are there any better methods to make pre-imaging
a 32-bit hash more difficult?

--
GCP

Ertugrul Soeylemez

5 Apr 2007 02:05:02
g...@network.ucsd.edu (Greg Rose) (07-04-04 12:36:20):

> > Further assuming that SHA-256 is still quite perfect and this loss
> > is uniformly distributed, then you'll lose about 1.5 / 256 / (256 /
> > output size) bits on entropy on every iteration over truncated
> > outputs.
> >
> > For 65536 iterations, you'll lose all your entropy.
>
> No, sorry, that's not true (otherwise finding collisions would be very
> easy). The problem is that, for the second round, there are less than
> 2^256 possible inputs,

Yes, this is what I meant. You cannot assume SHA-256 to be a
bijection GF(2^256) -> GF(2^256).


> and for the third round still less, and so on. So for each repeated
> evaluation, you lose less entropy, until (assuming SHA-256 behaves
> well) at around 2^128 possible outputs, all of them hash to unique
> values, and then you stick there (which is expected to happen after
> about 2^128 rounds).

Not necessarily. I would even say that only a fixed, small number of
hash values will never be generated. You're assuming that the output
set depends on the input value. This is not the case.

Ertugrul Soeylemez

5 Apr 2007 02:14:29
"Joseph Ashwood" <ash...@msn.com> (07-04-04 05:12:46):

> > > As a side note, do not just truncate the hash value. Rather split
> > > it into words of equal length and XOR them together.
> >
> > Again, can I ask why?
>
> As far as is truly known it is pure superstition, but if you're adding
> instructions, it's another instruction.

Yes, exactly. The problem with all current hash functions is that they
are esoteric beasts. Do we know that the entropy of the hash value is
distributed equally across all of its bits?

If we just truncate the value, then we need to care about this. That's
why I suggested combining all bits of the hash value. This won't hurt,
since the OP wants to make calculating hashes cost more CPU time anyway.

Ertugrul Soeylemez

5 Apr 2007 05:22:45
Gian-Carlo Pascutto <nat...@hotmail.com> (07-04-04 18:02:08):

> > Ertugrul was saying that if 65536 iterations take 2730 days, then
> > reaching 5460 days would require 131072. Of course, 131072 iterations
> > also take twice as long as 65536 for the legitimate user, so everyone
> > pays the same 2x penalty. Except that, as you observe next, the
> > attacker can scale: since the natural progression is towards multiple
> > cores, the attacker may actually pay less of a penalty. The per-core
> > penalty remains the same, but because the legitimate user is wasting
> > cores and the attacker isn't, the attacker pays less overall.
>
> Valid point.

The point is that you are trying to strengthen a system that is simply
not secure, whatever you do. If you want security, then you will need
to pay its price.


> The original idea was to reverse the situation like you describe. The
> receiver only has to perform 1 evaluation, so it being 65535 times
> slower is hardly noticeable.

This is similar to Merkle's Puzzles. See, you have to consider the worst
case, not the average case. And the worst case is a user who already
has an FPGA.


> For an attacker that has an attack that takes a lot of work, getting
> 65535 times more work is a bigger problem.

For the average attacker, yes. But out of 1000 users, each one is a
potential attacker. A few of them have very fast machines, and a few
more have friends with even faster machines or a cluster available.
Very few may even have specialized hardware. You just can't trust your
own system to withstand attacks. If for some reason someone really
wants to attack your system, he will succeed.
