The conventional wisdom is that hashing more entropy sources can't hurt:
if H is any modern cryptographic hash function then H(x,y,z) is at least
as good a random number as H(x,y), no matter how awful z is. So we pile
one source on top of another, hashing them all together and hoping that
at least one of them is good.
apply. For example, an attacker can't exert any serious control over the
content of my keystrokes while I'm logged in; I don't see how hashing
this particular content into my laptop's entropy pool can allow any
attacks. But I also don't see how it helps.)
What about the issues with VMs that are cloned or rolled back while they're running? It's bad enough if the random pool is reused for a short while, but still much better than if it was reused forever.
...
But what if z comes from a malicious source that can snoop on x and y?
For example, imagine a malicious "secure randomness" USB device that's
actually spying on all your other randomness sources through various
side channels, or---worse---imagine RDRAND microcode that's looking at
the randomness pool that it's about to be hashed into. I should note
that none of the attacks described below rely on tampering with x or y,
or otherwise modifying data outside the malicious entropy source; you
can't stop these attacks by double-checking the integrity of data.
...
---Dan
A fully transparent RNG/PRNG implementation requires access to the
microcode/firmware or source code, respectively.
Most of the current generation hardware-based RNGs out there are black
boxes, and additionally vulnerable to other more esoteric attacks such
as side channel power rail analysis for divining key materials. At
least with a kernel-based PRNG, you can analyze the source code to audit
entropy gathering sources and mixing algorithms; not so with discrete
hardware RNG implementations.
Most of these issues identified with RdRand and other suspect
vendor-specific RNGs could have been alleviated many moons ago with
statistical analysis and other more advanced methods of entropy
examination that have been around for decades, but apparently everyone
is too busy drinking the Torvalds et al kool aid to pay attention to
these types of security certification and accreditation methods. The
mere fact that the entire Linux community accepted as face value Herr
Torvalds' squawk of "Short answer: we actually know what we are doing.
You don't.” speaks volumes about the current state of security with the
Linux kernel.
At this point I would not trust any hardware-based RNG implementations
out there, the benefits of hardware acceleration for random number
generation does not outweigh the potential for covert backdoors that are
easily thwarted with open source entropy gathering and mixing
algorithms. In fact, I can't think of any mainstream reason for a
hardware accelerated RNG implementation unless the application is a high
bandwidth SSL webserver or a multiple gigabit VPN concentrator.
Twitter's recent forward secrecy announcement pretty much sums up that
sentiment in that there was negligent CPU computational cost associated
with their PFS implementation, and with no mention of any
hardware-accelerated crypto requirement that I can find citation for.
D. J. Bernstein has proposed an attack on RNG schemes that use a hash algorithm to combine output from several entropy sources. (2014.02.05: Entropy Attacks!, on The cr.yp.to blog) I believe his attack can be thwarted by using a key-stretching hash to combine the outputs.
Bernstein showed that, under certain circumstances, one specially crafted rouge entropy source can control, to a limited extent, the final output of the combining RNG. To successfully mount the Bernstein attack, the rogue entropy source must be the last entropy source polled and must know the system under attack’s combining scheme and all outputs the system obtained from the other entropy sources. The rogue RNG then internally repeats the combining hash function many times until it finds a random bit stream to output that, when hashed with the previous entropy outputs, produces a final RNG output value with the desired characteristics.
If the combining hash is a key stretcher with a work parameter set high enough, it will not be possible for the rouge RNG to perform the many tries needed to find an acceptable output without imposing a large, easily detected delay. For simple hashes, especial ones crafted for fast hardware performance like the SHA families, it may be possible for the rouge RNG to make the necessary trials fast enough to avoid notice. But if a key-stretching hash is employed that uses enough CPU and memory resources, such as scrypt or HEKS, it will be infeasible, both physically and economically, to hide enough computation power inside the rouge RNG to do the repeated hash trials many times faster than system under attack, even if the rouge RNG is part of the system’s CPU chip.
Assuming the system under attack has a trusted timer or real-time clock, the system software can set the work level parameter of the combining hash to be on the order of the maximum time that should be needed to interrogate the final entropy source (the potential rogue) after output has been received from all other entropy sources. If the measured time exceeds a reasonable margin above what is expected, the system should fail safe by refusing to accept the RNG outputs. Alternatively, the system can measure the time interval between obtaining the first and last RNG’s data and set its key stretching work factor based on this time interval.
In particular, Intel’s RDRAND is implemented as a CPU instruction, so almost no time should elapse between the penultimate entropy source's data being obtained and the needed RDRAND instructions being completed. Thus a fairly small key stretching work parameter, even a few milliseconds, should suffice to detect a CPU with an RDRAND that attempts to implement Bernstein's attack.
Arguably a CPU chip with a cooked RNG instruction could suspend its timers while possible RNG outputs are being tested, though it would have to know when the critical output was being requested, otherwise the timer interference might be detected. (At some point we must recognize that an all-knowing CPU with lots of covert firmware designed around our software can defeat any security scheme.) Other timing sources are possible. Common, inexpensive external real time clock ICs, such as the DS1302, provide timing to one second resolution. An available GPIO pin could be used to charge a small capacitor with a resistor in parallel and then measure its voltage decay for a rough time check.
While placing a key stretcher in the RNG chain may seem overkill to just thwart the Bernstein entropy combiner attack, an attack which demands somewhat far fetched preconditions, using a key stretching hash can also mitigate the type of attack proposed by Georg T. Becker. et. al. (Stealthy Dopant-Level Hardware Trojans http://people.umass.edu/gbecker/BeckerChes13.pdf ) An attacker using the Becker technique will reduce the effective state space of the RNG being cooked. The cooked state space must be small enough to allow a brute force search of possible RNG states, but not so small as to risk detection. Applying a high work factor key stretcher to the output of a Beckerized RNG would make the state space search far more expensive and possibly force the attacker to shrink the effective state space enough to allow detection.
The key stretching defense is particularly well suited to the difficult case of a black box device needing random data at first start up, when, typically, its secret keys are generated. A tens of seconds delay during first boot would seem a reasonable trade off for enhanced security. The Bernstein attack would then take several minutes, a delay that would be easy to notice. A similar argument applies to systems that generate one time a high quality seed for a deterministic RNG which will then be used to meet all future randomness requirements.
The conventional wisdom is that hashing more entropy sources can't hurt:
if H is any modern cryptographic hash function then H(x,y,z) is at least
as good a random number as H(x,y), no matter how awful z is. So we pile
one source on top of another, hashing them all together and hoping that
at least one of them is good.
But what if z comes from a malicious source that can snoop on x and y?
For example, imagine a malicious "secure randomness" USB device that's
actually spying on all your other randomness sources through various
side channels, or---worse---imagine RDRAND microcode that's looking at
the randomness pool that it's about to be hashed into. I should note
that none of the attacks described below rely on tampering with x or y,
or otherwise modifying data outside the malicious entropy source; you
can't stop these attacks by double-checking the integrity of data.
...
D. J. Bernstein has proposed an attack on RNG schemes that use a hash algorithm to combine output from several entropy sources. (2014.02.05: Entropy Attacks!, on The cr.yp.to blog) I believe his attack can be thwarted by using a key-stretching hash to combine the outputs.
Bernstein showed that, under certain circumstances, one specially crafted rouge entropy source can control, to a limited extent, the final output of the combining RNG. To successfully mount the Bernstein attack, the rogue entropy source must be the last entropy source polled and must know the system under attack’s combining scheme and all outputs the system obtained from the other entropy sources. The rogue RNG then internally repeats the combining hash function many times until it finds a random bit stream to output that, when hashed with the previous entropy outputs, produces a final RNG output value with the desired characteristics.
If the combining hash is a key stretcher with a work parameter set high enough, it will not be possible for the rouge RNG to perform the many tries needed to find an acceptable output without imposing a large, easily detected delay. For simple hashes, especial ones crafted for fast hardware performance like the SHA families, it may be possible for the rouge RNG to make the necessary trials fast enough to avoid notice. But if a key-stretching hash is employed that uses enough CPU and memory resources, such as scrypt or HEKS, it will be infeasible, both physically and economically, to hide enough computation power inside the rouge RNG to do the repeated hash trials many times faster than system under attack, even if the rouge RNG is part of the system’s CPU chip.
While placing a key stretcher in the RNG chain may seem overkill to just thwart the Bernstein entropy combiner attack, an attack which demands somewhat far fetched preconditions, using a key stretching hash can also mitigate the type of attack proposed by Georg T. Becker. et. al. (Stealthy Dopant-Level Hardware Trojans http://people.umass.edu/gbecker/BeckerChes13.pdf ) An attacker using the Becker technique will reduce the effective state space of the RNG being cooked. The cooked state space must be small enough to allow a brute force search of possible RNG states, but not so small as to risk detection. Applying a high work factor key stretcher to the output of a Beckerized RNG would make the state space search far more expensive and possibly force the attacker to shrink the effective state space enough to allow detection.
[I'm responding to a somewhat old message, because I think this is an interesting idea worth thinking more about]
D. J. Bernstein has proposed an attack..,
...While placing a key stretcher in the RNG chain may seem overkill to just thwart the Bernstein entropy combiner attack, an attack which demands somewhat far fetched preconditions, using a key stretching hash can also mitigate the type of attack proposed by Georg T. Becker. et. al. (Stealthy Dopant-Level Hardware Trojans http://people.umass.edu/gbecker/BeckerChes13.pdf ) An attacker using the Becker technique will reduce the effective state space of the RNG being cooked. The cooked state space must be small enough to allow a brute force search of possible RNG states, but not so small as to risk detection. Applying a high work factor key stretcher to the output of a Beckerized RNG would make the state space search far more expensive and possibly force the attacker to shrink the effective state space enough to allow detection.
Yes, this is a variant of the entropy guessing attack.The obvious problem with making initializing the RNG computationally expensive is that people really don't like waiting for their RNG to be ready to generate random bits. (Note the pressure *not* to make /dev/urandom block when it doesn't have enough input entropy. For that matter, note the use of /dev/urandom for crypto products and libraries that should know better.)I wonder if you might even make some RNGs worse, in practice, by doing this. If my system has to have a working RNG ten seconds after startup, I can either:a. Collect entropy for 10 seconds and seed my RNG.b. Collect entropy for 10-T seconds and then run my key stretching algorithm for T seconds.Suppose we are getting 8 bits of entropy per second, and I set T=5 (so I run my key stretching function for 5 seconds). I now am seeding with 40 bits of entropy, and running for five seconds maybe I'm putting another 25 bits of security into the system, but I am losing more by not incorporating the other 5 seconds' worth of entropy in.
We could incorprate late-arriving entropy into the RNG at the end of the key stretching algorithm, but that kills its value for blocking the attack you were describing above.