blog posts

Random number generators, and their restrictions, in Linux

Definition of random numbers in Linux

Random numbers of Linux are important in computing. TCP/IP sequence numbers, TLS nonces, ASLR offsets, password salts, and DNS source port numbers all rely on random numbers. In cryptography, randomness is found everywhere, from the generation of keys to encryption systems, even the way in which cryptosystems are attacked. Without randomness, all crypto operations would be predictable and hence insecure.

When computer algorithms are fed with the same input they should always give the same output; they are predictable and therefore not a good source of random numbers. A good random numbers generator consists of two parts: a source of entropy and a cryptographic algorithm.

An Origin of entropy (RNG)

As you know Entropy is the measurement of uncertainty or disorder in a system. Good entropy comes from the surrounding environment which is unpredictable and chaotic. You can think of entropy as the amount of surprise found in the result of a randomized process: the higher the entropy, the less the certainty found in the result. Random number generators or RNGS are hardware devices or software programs that take non-deterministic inputs in the form of physical measurements of temperature or phase noise or clock signals etc and generate unpredictable numbers as its output.

A hardware RNG could use hard-to-predict values such as wind speed or atmospheric pressure, or exploit intrinsically random (quantum) processes such as photon transmission/reflection through a semi-transparent mirror. In Linux computers we can use the attached hardware to harvest entropy like movement on the mouse pointer, keys typed on the keyboard, and disk and/or network I/O. Such systems are a good source of entropy, however, they are slow to yield data (for example the CPU jitter generator). Also, they are dependent on external triggers in order to generate random numbers. And are often not reliable when a large number of random numbers are required.

Also, there are algorithms to produce pseudo-random values from within an ideal, deterministic computing environment. However, there is no algorithm to produce unpredictable random numbers without some sort of additional non-deterministic input.

 Algorithm Cryptographic (PRNG)

Pseudorandom number generators, or PRNGs, are systems that are efficient in reliably producing lots of artificial random bits from a few true random bits. For example, an RNG that relies on mouse movements or keyboard keypresses would stop working once the user stops interacting with the mouse or the keyboard. So a PRNG would use these random bits of initial entropy and continue producing random numbers.

PRNGs maintain a large memory buffer called the entropy pool. The bytes received from the entropy sources (RNG) are stored there. Often the PRNG mixes the entropy pool bytes in order to remove statistical biases in the entropy data. Random bits are generated by running a deterministic random bit generator (DRBG) on the entropy pool data bits. This algorithm is deterministic (it always produces the same output given the same input). The trick is to ensure that the DRBG is never fed the same value input twice!

The Function PRNG’s

Most operating systems have built-in crypto PRNGs. Most of them are based software, but some can be pure hardware as well.

In Linux, the device files /dev/random and /dev/random are the userland interfaces to the crypto PRNG which can reliably generate random bits.

The kernel maintains an entropy pool which is used to store random data generated from events like inter-keypress timings, inter-interrupt timings, etc. Randomness from these interfaces is fixed with the entropy pool using a sort of cyclic redundancy check-like function. This is not cryptographically strong but tries to ensure. That any maliciously introduced randomness is eliminated and is also fast enough. The kernel also keeps an estimate of how many bits of randomness has been stored into the random number generator’s internal state via the /proc/sys/kernel/random/entropy_avail file.

When random numbers are desired they are obtained by taking the SHA-1 hash of the contents of the entropy pool. The SHA hash is chosen because it is cryptographically strong: it does not expose the contents of Linux the entropy pool, and it is computationally infeasible to reverse the SHA output to obtain its input. Thus, the confidentiality of the entropy pool is preserved. On each generation of random numbers. The kernel decreases its estimate of true randomness which is contained in the entropy pool.

The kernel provides two character devices

/dev/random and /dev/urandom. The /dev/random the device is suitable for use when very high-quality randomness is desired (for example, for a key generation or one-time pads), as it will only return a maximum of the number of bits of randomness (as estimated by the random number generator) contained in the entropy pool.

The /dev/urandom the device does not have this limit and will return as many bytes as are requested. As more and more random bytes are requested without giving time for the entropy pool to recharge, this will result in random numbers that are “merely” cryptographically strong. For many applications, however, this is acceptable.

The biggest problem is that it is blocking. Once the kernel’s entropy pool is exhausted, reads from /dev/random will pause until sufficient entropy is replenished. Such pauses are typically unacceptable and can constitute a denial-of-service attack against the application or even the system as a whole.

How to Boot randomness

In 2012 security researchers scanned the internet and harvested public keys from TLS certificates and SSH hosts. They found a few systems had identical public keys and in some cases very similar RSA keys with shared prime factors. It was found that many of these systems generated their keys very early after boot. At this point, very little entropy is collected in the entropy pool. Therefore despite having a good PRNG, because the entropy pool is almost identical. The random numbers generated are similar on different systems. In Linux, you can carry the information in the entropy pool across shutdowns and start-ups.

Conclusion

Therefore Random numbers are the lifeline of any cryptographic operation in modern computing. Also It is important for developers to understand what interface to use. And how to handle random numbers correctly in their code. It is also important for users to understand the limitations of such code. This post provides a basic insight into how random number generators actually work in Linux and what are their limitations.