Ever since the original Wang attacks on MD5 in 2005 it's been clear
that certificates were the most attractive target. Today, Sotirov,
Stevens, Appelbaum, Lenstra, Molnar, Osvik, and de Weger report
(slides, writeup)
on an attack against a real CA, in this case RapidSSL.
Background
To understand what's going on, we first need to recall some basic facts
about how certificates work. A certificate is a digitally signed assertion of the
binding between a name and a public key.
The data to be signed is as follows (I'm simplifying a bit):
version | The version number (2) |
serialNumber | The unique certificate serial number |
issuer | The name of the CA issuing the certificate |
validity | The time period when the certificate is valid |
subject | The name of the entity to which the certificate was issued. |
subjectPublicKeyInfo | The entity's public key. |
extensions | Arbitrary extensions |
In order to make a certificate, this data gets serialized using an
annoying encoding and then the entire mess is hashed and then the
resulting hash is digitally signed by the CA. The problem we have here
is that the hash, in this case MD5, is weak. More precisely, it's
possible to generate a collision: two inputs that hash to the
same output. (See here
for more background on attacks on hash functions.) We've known for
years how to exploit this kind of attack. The basic idea is that the
attacker prepares two documents, one "good" and one "bad" that hash to
the same value. He then gets the signer to sign the "good" variant and
then cuts and pastes the signature onto the "bad" variant, thus
producing a valid signature on the bad document.
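To make the cut-and-paste step concrete, here is a minimal sketch in Python. It is a toy under loudly stated assumptions: `weak_hash` truncates MD5 to 2 bytes so that a collision can be found by brute force in a fraction of a second, and an HMAC under a made-up `CA_KEY` stands in for the signer's real public-key signature. None of this is the technique from the paper; it only shows why a signature computed over a hash is valid for every document with that hash.

```python
import hashlib
import hmac
import itertools

CA_KEY = b"toy-ca-signing-key"  # hypothetical; stands in for the signer's private key


def weak_hash(data: bytes) -> bytes:
    """Toy weak hash: only the first 2 bytes of MD5, so collisions are cheap."""
    return hashlib.md5(data).digest()[:2]


def ca_sign(data: bytes) -> bytes:
    """Hash-then-sign: the 'signature' covers only the hash of the data."""
    return hmac.new(CA_KEY, weak_hash(data), hashlib.sha256).digest()


def verify(data: bytes, signature: bytes) -> bool:
    """Recompute the toy signature over the data's hash and compare."""
    return hmac.compare_digest(ca_sign(data), signature)


def find_colliding_pair(good_prefix: bytes, bad_prefix: bytes):
    """Search for padded 'good' and 'bad' documents with the same weak hash."""
    seen = {}
    for i in range(256):
        good = good_prefix + b"|pad=" + str(i).encode()
        seen[weak_hash(good)] = good
    for j in itertools.count():
        bad = bad_prefix + b"|pad=" + str(j).encode()
        if weak_hash(bad) in seen:
            return seen[weak_hash(bad)], bad


good, bad = find_colliding_pair(b"subject=example.com", b"subject=bank.example")
signature = ca_sign(good)      # the signer only ever sees the "good" document
assert good != bad
assert verify(bad, signature)  # the pasted signature validates the "bad" one
```

The padding counter plays the role of the random-appearing data mentioned above: it is junk the attacker is free to vary until the two hashes line up.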
So, to use this to attack certificates, you would generate
a "good" certificate signing request that results in
a certificate with the same hash as a "bad" certificate you generated locally.
You get the CA to sign the request
and then attach the resulting signature to the bad certificate.
Until now there were two major obstacles to using this technique to attack certificates:
- It wasn't clear that the serialNumber field was predictable.
- The techniques for generating collisions weren't very good: the fast
ones weren't at all controllable, and the controllable ones were slow
and still generated a lot of random-appearing data.
The relevance of the serialNumber is this: unlike the
name and the public key, the serialNumber and validity
are generated by the CA. So, you need to know in advance what they
will be in order to generate the appropriate colliding "bad" certificate.
The validity is typically just generated as something like a year
or two from the time of issue, so it's relatively predictable.
The CA has a lot of freedom in how to generate the serial number.
If it's truly a sequence number, it's quite predictable. However,
if it's randomly generated, then it can be made arbitrarily unpredictable,
which effectively blocks this kind of collision attack.
When MD5 collisions were first discovered, the two standard
recommendations were (1) stop using MD5 and (2) generate random
serial numbers.
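A rough illustration of why recommendation (2) works, as a sketch rather than any real CA's code. `SequentialCA`, `RandomSerialCA`, and `predict_next` are hypothetical names, and the 64-bit serial length is just an illustrative choice. The attacker has to commit to a serial number before the certificate is issued, which is easy against a counter and hopeless against a CSPRNG.

```python
import secrets


class SequentialCA:
    """Toy CA that hands out serial numbers as a simple counter."""

    def __init__(self, start: int = 1000) -> None:
        self._next = start

    def issue_serial(self) -> int:
        serial = self._next
        self._next += 1
        return serial


class RandomSerialCA:
    """Toy CA that draws each serial number from a CSPRNG."""

    def issue_serial(self) -> int:
        return secrets.randbits(64)  # 64 bits is an illustrative choice


def predict_next(last_seen_serial: int) -> int:
    """The attacker's guess: assume the serial is just a sequence number."""
    return last_seen_serial + 1


seq_ca = SequentialCA()
last = seq_ca.issue_serial()
assert seq_ca.issue_serial() == predict_next(last)  # the guess is right

rnd_ca = RandomSerialCA()
last = rnd_ca.issue_serial()
# Against random serials the same guess is right with probability about 2**-64.
guess_hit = rnd_ca.issue_serial() == predict_next(last)
```

A wrong guess means the precomputed "bad" certificate no longer collides with what the CA actually signed, so the collision work is wasted.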
This Attack
Which brings us to this new work, which involves two main contributions. First,
the authors improved their collision finding techniques so they need
a lot less random-appearing data. Second, they found a CA
that still uses MD5 and doesn't randomize the serial number.
Taken together, this allowed them to convince the CA to sign a
certificate which was in itself valid but which collided with a
certificate that the CA would never have signed, in this case a
certificate for a new, subordinate CA. (It could just as well have
been a certificate for a specific target web site, but that's less
flashy than a CA certificate.) Once in possession of this new CA
certificate, it's possible for the authors to sign arbitrary new
certificates which will be trusted by anyone who trusted the
original CA [subject to some technical limitations which I won't go
into here.] Effectively, the authors have made themselves a CA.
There are some interesting technical hacks needed to make this work:
although the serial number is somewhat predictable, it's not completely
so, and in order to mount the attack they had to guess the serial
number in advance. The guess wasn't always accurate, but they
were able to submit their own CSRs to increment the serial
number to where they needed it to be.
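The serial-number nudging can be sketched like this. It is a deliberately simplified model, not the authors' tooling: `CounterCA` and `land_on_serial` are hypothetical, the CA is assumed to increment its counter by exactly one per certificate, and the sketch ignores both other customers racing for serials and the need to hit the predicted validity time.

```python
from typing import Tuple


class CounterCA:
    """Toy CA whose serial number increments by one per issued certificate."""

    def __init__(self, next_serial: int) -> None:
        self.next_serial = next_serial

    def issue(self, csr: str) -> Tuple[int, str]:
        serial = self.next_serial
        self.next_serial += 1
        return serial, csr


def land_on_serial(ca: CounterCA, observed_serial: int,
                   target_serial: int, real_csr: str) -> Tuple[int, str]:
    """Submit filler CSRs until the next serial is the one the collision was built for."""
    fillers = target_serial - (observed_serial + 1)
    if fillers < 0:
        raise ValueError("counter already passed the target; the collision is wasted")
    for n in range(fillers):
        ca.issue(f"filler-{n}")  # throwaway requests that only bump the counter
    return ca.issue(real_csr)


ca = CounterCA(next_serial=5000)
probe_serial, _ = ca.issue("probe")  # the attacker learns where the counter is
serial, csr = land_on_serial(ca, probe_serial, target_serial=5010,
                             real_csr="colliding-csr")
assert serial == 5010
```

Each filler request is a real certificate purchase in practice, so guessing a target close to the current counter keeps the cost down.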
Impact
The impact of this is that the authors could in principle mount
man-in-the-middle or other impersonation attacks on any Web server
provided that the client trusted this particular CA (most do).
The existence of this certificate doesn't allow anybody else to mount
impersonation attacks, since ordinary attackers won't have the
corresponding private key (unless they break into the authors'
machine and recover it, of course).
The authors have taken some steps to make the particular
certificate they issued less useful for this purpose. In particular,
its expiration date is way in the past, so unless your clock is way
off, you should notice this attack. That's not to say that
there's no risk here, since you might not notice the expiration
date issue.
Of course, it's possible that an attacker could independently use
the same technique to acquire their own CA certificate.
In fact, we don't know that nobody already has. The
only real obstacle is that the crypto needed here is fairly involved
and the experts on it are mostly respected academics, many of
whom are on this paper. So, the sooner that CAs adopt the
mitigations mentioned above, the better.
I should mention that this isn't the only way to get a bogus
certificate: many CAs don't do a particularly good job of
user verification in any case (I'll be posting about one
particularly exceptional case shortly). In particular,
it's common to use "email confirmation" for identity verification,
where the CA sends email to the administrator of the relevant
machine to verify the certificate request. There are probably
a number of cases in which it's easier to attack that than to
build up a whole certificate collision infrastructure.
Containment
There are really two questions about how to contain this
vulnerability:
- What should we do about this specific certificate?
- What should be done about the class of vulnerability?
The two basic options for this certificate are to ignore it
(assume we trust the researchers, especially since the certificate
is expired) or to blacklist it. The way that the blacklist would
work is that the browser manufacturers would just issue a security
update with a patch to the certificate validation code telling it
not to trust this specific certificate, just as they would
patch any other security vulnerability. For perspective, we
can think of this as a vulnerability with an exploit that is known
only to the researchers—even though we have the CA cert,
we can't use it productively, and it's not likely to be reproducible.
If I were in charge of a browser, which I'm not,
I would probably issue a patch with a blacklist for this certificate.
Others' opinions may vary; as far as I know, the browser manufacturers
didn't issue mandatory security updates blacklisting all the Debian
OpenSSL keys, so that may be a cue to their general attitude.
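For a sense of how small such a patch could be, here is a sketch of a blacklist check. It assumes the blacklist is keyed on a SHA-256 fingerprint of the whole encoded certificate, which is my choice for the example and not a description of what any browser ships. `ROGUE_CA_DER` is placeholder bytes and `chain_is_acceptable` is a hypothetical helper that would run alongside normal path validation.

```python
import hashlib
from typing import List

# Placeholder bytes standing in for the encoded rogue CA certificate.
ROGUE_CA_DER = b"placeholder for the rogue CA certificate bytes"


def fingerprint(der_bytes: bytes) -> str:
    """SHA-256 over the whole encoded certificate, as a hex string."""
    return hashlib.sha256(der_bytes).hexdigest()


BLACKLIST = {fingerprint(ROGUE_CA_DER)}


def chain_is_acceptable(chain_der: List[bytes]) -> bool:
    """Reject any chain that contains a blacklisted certificate.

    This is only the extra check; ordinary signature and validity
    checking would still have to pass separately.
    """
    return all(fingerprint(cert) not in BLACKLIST for cert in chain_der)


assert not chain_is_acceptable([b"leaf", ROGUE_CA_DER, b"root"])
assert chain_is_acceptable([b"leaf", b"legitimate intermediate", b"root"])
```

Fingerprinting with a hash other than MD5 matters here: the rogue certificate and the harmless one the CA actually signed have the same MD5 hash by construction, so an MD5-based blacklist could not tell them apart.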
The second question is what to do about this class of vulnerability.
Because this attack can only be mounted against a live CA, not against
an old certificate, it's very important that the affected CAs either stop
using MD5, use randomized serial numbers, or both. Presumably, the
news coverage will act as an inducement for them to do so.
I've also heard suggestions that the browser manufacturers should disable
MD5. There are probably still enough MD5-using servers out there that
this would be problematic, though it's something to consider for the
future.
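If a browser did go that route, the policy check itself would be simple; the hard part is the breakage. A sketch, assuming the chain has already been parsed into (subject, signature algorithm OID) pairs by some other code, and assuming the last entry is the locally trusted root, whose own signature this sketch chooses not to check. `chain_avoids_md5` is a hypothetical helper; the OID is the standard one for md5WithRSAEncryption.

```python
from typing import List, Tuple

MD5_WITH_RSA = "1.2.840.113549.1.1.4"  # OID for md5WithRSAEncryption


def chain_avoids_md5(chain: List[Tuple[str, str]]) -> bool:
    """chain is leaf-first: (subject, signature algorithm OID) pairs.

    The last entry is treated as the locally trusted root, whose own
    signature this sketch does not rely on, so it is skipped.
    """
    signed_by_someone_else = chain[:-1]
    return all(oid != MD5_WITH_RSA for _, oid in signed_by_someone_else)


SHA1_WITH_RSA = "1.2.840.113549.1.1.5"  # OID for sha1WithRSAEncryption

assert not chain_avoids_md5([
    ("www.example.com", SHA1_WITH_RSA),
    ("Some Intermediate CA", MD5_WITH_RSA),
    ("Some Root CA", SHA1_WITH_RSA),
])
assert chain_avoids_md5([
    ("www.example.com", SHA1_WITH_RSA),
    ("Some Root CA", MD5_WITH_RSA),
])
```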
Bottom Line
As usual, don't panic. In its current state, this is more of a demonstration
of a hole than a serious hole. Countermeasures are readily available to
the CAs and if the remaining CAs fix their practices
fast enough, then it's unlikely that there will be any more bad certificates
issued (it takes some time to spin up your infrastructure for this attack).
Even if one or two such certificates are issued—even to bad guys—
it's not the end of the world. Once they're detected they can be blacklisted.
This takes a long time with the current patching rate, but it's not conceptually
any worse than a remotely exploitable problem with your browser, or a bug
in certificate validation logic, both of which have been known to happen.
That said, it is very important that the CAs do fix their practices, since
this has the potential to become serious if the capability to mount the attack
becomes widespread and convenient.
UPDATE: Some minor corrections due to Hovav Shacham (only controllable
MD5 collisions were slow)