On April 19, 2017, Mozilla Foundation published the Security Advisory 2017-10 outlining several recently fixed security vulnerabilities. One of these vulnerabilities, tracked as CVE-2017-5462, affects the Pseudo-Random Number Generator (PRNG) within the Network Security Services (NSS) library prior to version 3.29.5 and Firefox prior to version 53.
This post describes the bug and how it was discovered.
Inside the NSS PRNG
NSS uses Hash_DRBG
as PRNG, which is one of several PRNG schemes defined in the NIST Special Publication 800-90.
Like most widely used PRNGs the Hash_DRBG
is a Deterministic Random Bit Generator (DRBG). (Even though this term is usually only used for NIST PRNGs.)
While the standard contains all the details, the relevant features can be summarised as follows.
The state of Hash_DRBG
is composed of three values:
- A 55-byte integer state variable
V
, which is updated with each request of new bits - A 55-byte integer constant
C
that depends on the seed and is updated when re-seeding the PRNG. - A counter
c
tracking when the next re-seeding is needed.
To generate random bits, Hash_DRBG
concatenates H(V) || H(V+1) || H(V+2) || ...
until enough bits are generated.
H
denotes a cryptographic hash function here.
NSS uses the SHA-256 hash function for H
with a digest length of 32 bytes.
After generating new bits the state variable V
is updated according to the rule V
c+1
= V + H(0x03 || V
c
) + C + c
and counter c
is incremented by one.
Addition is performed modulo 2
440
= 2
8*55
to fit it in the 55 bytes of V
.
The PRNG implementation can be found in the file drbg.c within the NSS codebase.
CVE-2017-5462
The issue identified in CVE-2017-5462 is in the code implementing the addition.
/*
* build some fast inline functions for adding.
*/
#define PRNG_ADD_CARRY_ONLY(dest, start, carry) \
{ \
int k1; \
for (k1 = start; carry && k1 >= 0; k1--) { \
carry = !(++dest[k1]); \
} \
}
/*
* NOTE: dest must be an array for the following to work.
*/
#define PRNG_ADD_BITS(dest, dest_len, add, len, carry) \
carry = 0; \
PORT_Assert((dest_len) >= (len)); \
{ \
int k1, k2; \
for (k1 = dest_len - 1, k2 = len - 1; k2 >= 0; --k1, --k2) { \
carry += dest[k1] + add[k2]; \
dest[k1] = (PRUint8)carry; \
carry >>= 8; \
} \
}
#define PRNG_ADD_BITS_AND_CARRY(dest, dest_len, add, len, carry) \
PRNG_ADD_BITS(dest, dest_len, add, len, carry) \
PRNG_ADD_CARRY_ONLY(dest, dest_len - len, carry)
When the PRNG performs addition to update V
it uses the macro PRNG_ADD_BITS_AND_CARRY
, which first delegates to the macro PRNG_ADD_BITS
to add the two summands without considering the final carry and then the macro PRNG_ADD_CARRY_ONLY
to add the carry.
In this addition code it is clear that the carry should be added at the position preceding the original most-significant-byte of the shorter of the two summands.
This fact was supposed to be represented by the index dest_len-len
supplied as parameter to PRNG_ADD_CARRY_ONLY
.
Note that numbers are represented as sequences of bytes with byte number zero being the most-significant byte.
The essence of the bug is that dest_len-len
does not point to the correct position of the carry, which should have been added at position dest_len-len-1
instead.
Example
We note that 3 * 0x40 = 0xc0
under both proper and broken addition (the latter due to absence of carrying in this example).
However, 3 * 0x95 = 0x01bf
under proper unbounded addition resp. 0xbf
under proper modulo addition.
Yet, the result under broken addition is 0x01 + 0xbf = 0xc0
.
Finding the Bug with Testing
The easiest way to find most types of bugs in PRNGs following NIST SP 800-90 is by testing the implementation with the official test vectors (seeds and outputs) provided in the standard. I found the bug after implementing these test vectors. Functional unit testing would probably have caught this particular bug as well. For PRNGs not following the NIST standard, defining corresponding test vectors is a good idea to avoid regressions during maintenance.
Statistical tests such as NIST SP 800-22 or DIEHARDEST are not very useful for testing cryptographic PRNGs. They can be used for testing the entropy pool that is fed into the PRNG. But such tests will not fail as long as the internal state does not stay constant and output passes through a cryptographic primitive (such as SHA-256) before reaching the consumer.
Formal Analysis
Independ of my investigation, Vladimir Klebanov found this bug using Entroposcope, a static analysis tool created for finding implementation bugs in pseudo-random number generators.
Entropy loss occurs when the number of possible output streams is less than the number of possible seeds. This is equivalent to the case when two different seeds produce the same output stream (also called a collision). Entroposcope is built on top of the bounded model checker CBMC, which in turn transforms the problem into a challenge for a SAT solver.
Considering the Hash_DRBG
, the question whether V
c+1
= V + H(0x03 || V
c
) + C + c
produces a collision and the DRBG loses entropy or not boils down to whether distinct 55-byte values x
1
, x
2
exist, such that x
1
+ H(x
1
) = x
2
+ H(x
2
)
.
Now, it is clear that this question cannot be answered without either knowing the output of H
for each 55-byte input (which is infeasible) or some (unknown to us and certainly also to the tool) nontrivial mathematical argument on the nature of H
in this context.
In this regard, the Hash_DRBG
differs from many other PRNGs that employ significantly simpler operations on the output of H
before it makes its way to the PRNG caller.
As a consequence of this design, Entroposcope can not be used to prove absence of entropy loss in the Hash_DRBG
.
Nonetheless, Entroposcope can check collision-freedom of the PRNG under an idealised H
and find bugs in the parts of the implementation that are not H
.
For this purpose we consider an idealised PRNG with H(b||V) = V
and C = V
0
.
This is the same kind of idealisation that helped Vladimir to uncover previously unknown bugs in OpenSSL and GnuPG.
With the idealisation, on the first iteration (c = 0
) updating the state becomes V
1
= 3 * V
0
.
Because 3
and 2
440
are co-prime, V
1
will not produce a collision if implemented properly.
Indeed, given the idealised code, Entroposcope produced a counterexample to entropy preservation with two concrete seeds leading to the same output stream. Tracing these two executions makes it easy to pinpoint the cause of the collision in the addition code.
Thanks to Vladimir for finding this bug and helping to write this post.