Info Node: (nettle.info)Randomness

nettle.info: Randomness
Reference
ASCII encoding
Public-key algorithms
Back to Software Index
7.8 Randomness
==============
A crucial ingredient in many cryptographic contexts is randomness: Let
‘p’ be a random prime, choose a random initialization vector ‘iv’, a
random key ‘k’ and a random exponent ‘e’, etc. In the theories, it is
assumed that you have plenty of randomness around. If this assumption
is not true in practice, systems that are otherwise perfectly secure,
can be broken. Randomness has often turned out to be the weakest link
in the chain.
In non-cryptographic applications, such as games as well as
scientific simulation, a good randomness generator usually means a
generator that has good statistical properties, and is seeded by some
simple function of things like the current time, process id, and host
name.
However, such a generator is inadequate for cryptography, for at
least two reasons:
• It’s too easy for an attacker to guess the initial seed. Even if
it will take some 2^32 tries before he guesses right, that’s far
too easy. For example, if the process id is 16 bits, the
resolution of “current time” is one second, and the attacker knows
what day the generator was seeded, there are only about 2^32
possibilities to try if all possible values for the process id and
time-of-day are tried.
• The generator output reveals too much. By observing only a small
segment of the generator’s output, its internal state can be
recovered, and from there, all previous output and all future
output can be computed by the attacker.
A randomness generator that is used for cryptographic purposes must
have better properties. Let’s first look at the seeding, as the issues
here are mostly independent of the rest of the generator. The initial
state of the generator (its seed) must be unguessable by the attacker.
So what’s unguessable? It depends on what the attacker already knows.
The concept used in information theory to reason about such things is
called “entropy”, or “conditional entropy” (not to be confused with the
thermodynamic concept with the same name). A reasonable requirement is
that the seed contains a conditional entropy of at least some 80-100
bits. This property can be explained as follows: Allow the attacker to
ask ‘n’ yes-no-questions, of his own choice, about the seed. If the
attacker, using this question-and-answer session, as well as any other
information he knows about the seeding process, still can’t guess the
seed correctly, then the conditional entropy is more than ‘n’ bits.
Let’s look at an example. Say information about timing of received
network packets is used in the seeding process. If there is some random
network traffic going on, this will contribute some bits of entropy or
“unguessability” to the seed. However, if the attacker can listen in to
the local network, or if all but a small number of the packets were
transmitted by machines that the attacker can monitor, this additional
information makes the seed easier for the attacker to figure out. Even
if the information is exactly the same, the conditional entropy, or
unguessability, is smaller for an attacker that knows some of it already
before the hypothetical question-and-answer session.
Seeding of good generators is usually based on several sources. The
key point here is that the amount of unguessability that each source
contributes, depends on who the attacker is. Some sources that have
been used are:
High resolution timing of i/o activities
Such as completed blocks from spinning hard disks, network packets,
etc. Getting access to such information is quite system dependent,
and not all systems include suitable hardware. If available, it’s
one of the better randomness source one can find in a digital,
mostly predictable, computer.
User activity
Timing and contents of user interaction events is another popular
source that is available for interactive programs (even if I
suspect that it is sometimes used in order to make the user feel
good, not because the quality of the input is needed or used
properly). Obviously, not available when a machine is unattended.
Also beware of networks: User interaction that happens across a
long serial cable, TELNET session, or even SSH session may be
visible to an attacker, in full or partially.
Audio input
Any room, or even a microphone input that’s left unconnected, is a
source of some random background noise, which can be fed into the
seeding process.
Specialized hardware
Hardware devices with the sole purpose of generating random data
have been designed. They range from radioactive samples with an
attached Geiger counter, to amplification of the inherent noise in
electronic components such as diodes and resistors, to
low-frequency sampling of chaotic systems. Hashing successive
images of a Lava lamp is a spectacular example of the latter type.
Secret information
Secret information, such as user passwords or keys, or private
files stored on disk, can provide some unguessability. A problem
is that if the information is revealed at a later time, the
unguessability vanishes. Another problem is that this kind of
information tends to be fairly constant, so if you rely on it and
seed your generator regularly, you risk constructing almost similar
seeds or even constructing the same seed more than once.
For all practical sources, it’s difficult but important to provide a
reliable lower bound on the amount of unguessability that it provides.
Two important points are to make sure that the attacker can’t observe
your sources (so if you like the Lava lamp idea, remember that you have
to get your own lamp, and not put it by a window or anywhere else where
strangers can see it), and that hardware failures are detected. What if
the bulb in the Lava lamp, which you keep locked into a cupboard
following the above advice, breaks after a few months?
So let’s assume that we have been able to find an unguessable seed,
which contains at least 80 bits of conditional entropy, relative to all
attackers that we care about (typically, we must at the very least
assume that no attacker has root privileges on our machine).
How do we generate output from this seed, and how much can we get?
Some generators (notably the Linux ‘/dev/random’ generator) tries to
estimate available entropy and restrict the amount of output. The goal
is that if you read 128 bits from ‘/dev/random’, you should get 128
“truly random” bits. This is a property that is useful in some
specialized circumstances, for instance when generating key material for
a one time pad, or when working with unconditional blinding, but in most
cases, it doesn’t matter much. For most application, there’s no limit
on the amount of useful “random” data that we can generate from a small
seed; what matters is that the seed is unguessable and that the
generator has good cryptographic properties.
At the heart of all generators lies its internal state. Future
output is determined by the internal state alone. Let’s call it the
generator’s key. The key is initialized from the unguessable seed.
Important properties of a generator are:
“Key-hiding”
An attacker observing the output should not be able to recover the
generator’s key.
“Independence of outputs”
Observing some of the output should not help the attacker to guess
previous or future output.
“Forward secrecy”
Even if an attacker compromises the generator’s key, he should not
be able to guess the generator output _before_ the key compromise.
“Recovery from key compromise”
If an attacker compromises the generator’s key, he can compute
_all_ future output. This is inevitable if the generator is seeded
only once, at startup. However, the generator can provide a
reseeding mechanism, to achieve recovery from key compromise. More
precisely: If the attacker compromises the key at a particular time
‘t_1’, there is another later time ‘t_2’, such that if the attacker
observes all output generated between ‘t_1’ and ‘t_2’, he still
can’t guess what output is generated after ‘t_2’.
Nettle includes one randomness generator that is believed to have all
the above properties, and two simpler ones.
ARCFOUR, like any stream cipher, can be used as a randomness
generator. Its output should be of reasonable quality, if the seed is
hashed properly before it is used with ‘arcfour_set_key’. There’s no
single natural way to reseed it, but if you need reseeding, you should
be using Yarrow instead.
The “lagged Fibonacci” generator in ‘<nettle/knuth-lfib.h>’ is a fast
generator with good statistical properties, but is *not* for
cryptographic use, and therefore not documented here. It is included
mostly because the Nettle test suite needs to generate some test data
from a small seed.
The recommended generator to use is Yarrow, described below.
7.8.1 Yarrow
------------
Yarrow is a family of pseudo-randomness generators, designed for
cryptographic use, by John Kelsey, Bruce Schneier and Niels Ferguson.
Yarrow-160 is described in a paper at
<http://www.counterpane.com/yarrow.html>, and it uses SHA1 and
triple-DES, and has a 160-bit internal state. Nettle implements
Yarrow-256, which is similar, but uses SHA256 and AES to get an internal
state of 256 bits.
Yarrow was an almost finished project, the paper mentioned above is
the closest thing to a specification for it, but some smaller details
are left out. There is no official reference implementation or test
cases. This section includes an overview of Yarrow, but for the details
of Yarrow-256, as implemented by Nettle, you have to consult the source
code. Maybe a complete specification can be written later.
Yarrow can use many sources (at least two are needed for proper
reseeding), and two randomness “pools”, referred to as the “slow pool”
and the “fast pool”. Input from the sources is fed alternatingly into
the two pools. When one of the sources has contributed 100 bits of
entropy to the fast pool, a “fast reseed” happens and the fast pool is
mixed into the internal state. When at least two of the sources have
contributed at least 160 bits each to the slow pool, a “slow reseed”
takes place. The contents of both pools are mixed into the internal
state. These procedures should ensure that the generator will
eventually recover after a key compromise.
The output is generated by using AES to encrypt a counter, using the
generator’s current key. After each request for output, another 256
bits are generated which replace the key. This ensures forward secrecy.
Yarrow can also use a “seed file” to save state across restarts.
Yarrow is seeded by either feeding it the contents of the previous seed
file, or feeding it input from its sources until a slow reseed happens.
Nettle defines Yarrow-256 in ‘<nettle/yarrow.h>’.
-- Context struct: struct yarrow256_ctx
-- Context struct: struct yarrow_source
Information about a single source.
-- Constant: YARROW256_SEED_FILE_SIZE
Recommended size of the Yarrow-256 seed file.
-- Function: void yarrow256_init (struct yarrow256_ctx *CTX, unsigned
NSOURCES, struct yarrow_source *SOURCES)
Initializes the yarrow context, and its NSOURCES sources. It’s
possible to call it with NSOURCES=0 and SOURCES=NULL, if you don’t
need the update features.
-- Function: void yarrow256_seed (struct yarrow256_ctx *CTX, size_t
LENGTH, uint8_t *SEED_FILE)
Seeds Yarrow-256 from a previous seed file. LENGTH should be at
least ‘YARROW256_SEED_FILE_SIZE’, but it can be larger.
The generator will trust you that the SEED_FILE data really is
unguessable. After calling this function, you _must_ overwrite the
old seed file with newly generated data from ‘yarrow256_random’.
If it’s possible for several processes to read the seed file at
about the same time, access must be coordinated using some locking
mechanism.
-- Function: int yarrow256_update (struct yarrow256_ctx *CTX, unsigned
SOURCE, unsigned ENTROPY, size_t LENGTH, const uint8_t *DATA)
Updates the generator with data from source SOURCE (an index that
must be smaller than the number of sources). ENTROPY is your
estimated lower bound for the entropy in the data, measured in
bits. Calling update with zero ENTROPY is always safe, no matter
if the data is random or not.
Returns 1 if a reseed happened, in which case an application using
a seed file may want to generate new seed data with
‘yarrow256_random’ and overwrite the seed file. Otherwise, the
function returns 0.
-- Function: void yarrow256_random (struct yarrow256_ctx *CTX, size_t
LENGTH, uint8_t *DST)
Generates LENGTH octets of output. The generator must be seeded
before you call this function.
If you don’t need forward secrecy, e.g. if you need non-secret
randomness for initialization vectors or padding, you can gain some
efficiency by buffering, calling this function for reasonably large
blocks of data, say 100-1000 octets at a time.
-- Function: int yarrow256_is_seeded (struct yarrow256_ctx *CTX)
Returns 1 if the generator is seeded and ready to generate output,
otherwise 0.
-- Function: unsigned yarrow256_needed_sources (struct yarrow256_ctx
*CTX)
Returns the number of sources that must reach the threshold before
a slow reseed will happen. Useful primarily when the generator is
unseeded.
-- Function: void yarrow256_fast_reseed (struct yarrow256_ctx *CTX)
-- Function: void yarrow256_slow_reseed (struct yarrow256_ctx *CTX)
Causes a fast or slow reseed to take place immediately, regardless
of the current entropy estimates of the two pools. Use with care.
Nettle includes an entropy estimator for one kind of input source:
User keyboard input.
-- Context struct: struct yarrow_key_event_ctx
Information about recent key events.
-- Function: void yarrow_key_event_init (struct yarrow_key_event_ctx
*CTX)
Initializes the context.
-- Function: unsigned yarrow_key_event_estimate (struct
yarrow_key_event_ctx *CTX, unsigned KEY, unsigned TIME)
KEY is the id of the key (ASCII value, hardware key code, X keysym,
..., it doesn’t matter), and TIME is the timestamp of the event.
The time must be given in units matching the resolution by which
you read the clock. If you read the clock with microsecond
precision, TIME should be provided in units of microseconds. But
if you use ‘gettimeofday’ on a typical Unix system where the clock
ticks 10 or so microseconds at a time, TIME should be given in
units of 10 microseconds.
Returns an entropy estimate, in bits, suitable for calling
‘yarrow256_update’. Usually, 0, 1 or 2 bits.
automatically generated by info2www version 1.2