COMSEC: December 2008 Archives


December 30, 2008

As part of America's ongoing effort to ensure that sex offenders can never, ever, reintegrate with society (see also), Georgia is now requiring them to hand over their passwords.
"There's certainly a privacy concern," said Sara Totonchi of the Atlanta-based Southern Center for Human Rights. "This essentially will give law enforcement the ability to read e-mails between family members, between employers."

State Sen. Cecil Staton, who wrote the bill, said the measure is designed to keep the Internet safe for children. Authorities could use the passwords and other information to make sure offenders aren't stalking children online or chatting with them about off-limits topics.

Staton said although the measure may violate the privacy of sex offenders, the need to protect children "outweighs a lot of the rights of these individuals."

"We limit where they can live, we make their information available on the Internet. To some degree, we do invade their privacy," said Staton, a Republican from Macon. "But the feeling is, they have forfeited, to some degree, some privacy rights."

Obviously there are privacy concerns, but that's not the only issue: handing over passwords potentially exposes those subject to these rules to a whole range of financial threats involving e-commerce, banking, etc. Even if the requirement is limited to communications systems like IM, email, and Facebook, the ability to receive email is used as a generic authentication mechanism for things like password reset. For instance, e-commerce sites like Amazon or Zappos will email you a copy of your password. One wonders what, if any, controls Georgia intends to use to protect sex offenders from unscrupulous state officials who have access to this information.

Ever since the original Wang attacks on MD5 in 2005 it's been clear that certificates were the most attractive target. Today, Sotirov, Stevens, Appelbaum, Lenstra, Molnar, Osvik, and de Weger report (slides writeup) on an attack against a real CA, in this case RapidSSL.

In order to understand what's going on, we first need to recall some basic facts about how certificates work. A certificate is a digitally signed assertion of the binding between a name and a public key. The data to be signed is as follows (I'm simplifying a bit):

  • version: The version number (2)
  • serialNumber: The unique certificate serial number
  • issuer: The name of the CA issuing the certificate
  • validity: The time period when the certificate is valid
  • subject: The name of the entity to which the certificate was issued
  • subjectPublicKeyInfo: The entity's public key
  • extensions: Arbitrary extensions

In order to make a certificate, this data gets serialized using an annoying encoding and then the entire mess is hashed and then the resulting hash is digitally signed by the CA. The problem we have here is that the hash, in this case MD5, is weak. More precisely, it's possible to generate a collision: two inputs that hash to the same output. (See here for more background on attacks on hash functions.) We've known for years how to exploit this kind of attack. The basic idea is that the attacker prepares two documents, one "good" and one "bad" that hash to the same value. He then gets the signer to sign the "good" variant and then cuts and pastes the signature onto the "bad" variant, thus producing a valid signature on the bad document.
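To make the cut-and-paste trick concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: the "weak hash" is just the first two bytes of MD5 (so brute force finds collisions instantly, unlike the real attack), an HMAC under a made-up key stands in for the CA's signature, and the document strings are invented. The only point is that a signature computed over a hash is equally valid for any other input with the same hash.

```python
import hashlib
import hmac
from itertools import count

SIGNING_KEY = b"toy-ca-key"  # hypothetical; stands in for the CA's private key


def weak_hash(data: bytes) -> bytes:
    """Deliberately weak hash: first 2 bytes of MD5, so collisions are cheap."""
    return hashlib.md5(data).digest()[:2]


def sign(data: bytes) -> bytes:
    """Hash-then-sign: the 'signature' covers only the hash of the data."""
    return hmac.new(SIGNING_KEY, weak_hash(data), hashlib.sha256).digest()


def verify(data: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(data), signature)


def find_collision(good_prefix: bytes, bad_prefix: bytes):
    """Append counters until a good and a bad document share a weak hash."""
    seen = {}
    for i in count():
        suffix = str(i).encode()
        good = good_prefix + suffix
        seen.setdefault(weak_hash(good), good)
        bad = bad_prefix + suffix
        if weak_hash(bad) in seen:
            return seen[weak_hash(bad)], bad


good, bad = find_collision(b"CN=www.example.com;pad=", b"CN=Evil CA;CA=TRUE;pad=")
signature = sign(good)             # the signer only ever sees the good document
print(verify(bad, signature))      # the same signature checks out on the bad one
```

The padding suffix plays the role of the random-appearing data that a collision attack has to hide somewhere in each document.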

So, the way you would use this to attack certificates is that you would generate a "good" certificate signing request that would result in a certificate that had the same hash as a "bad" certificate you had generated locally. You get the CA to sign the request and then substitute the bad certificate. Until now there were two major obstacles to using this technique to attack certificates:

  • It wasn't clear that the serialNumber field was predictable.
  • The techniques for generating collisions weren't very good: they weren't that controllable (they generated a lot of random-appearing data) and were slow; or rather there were techniques for generating fast collisions but they weren't at all controllable.

The relevance of the serialNumber is this: unlike the name and the public key, the serialNumber and validity are generated by the CA. So, you need to know in advance what they will be in order to generate the appropriate colliding "bad" certificate. The validity is typically just generated as something like a year or two from the time of issue, so it's relatively predictable. The CA has a lot of freedom in how to generate the serial number. If it's truly a sequence number, it's quite predictable. However, if it's randomly generated, then it can be made arbitrarily unpredictable, which effectively blocks this kind of collision attack. When MD5 collisions were first discovered, the two standard recommendations were (1) stop using MD5 and (2) generate random serial numbers.
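The difference between the two serial number policies can be sketched in a few lines of Python. The class names and the 64-bit width are my own illustrative choices, not anything a particular CA is known to use.

```python
import secrets
from itertools import count


class SequentialSerials:
    """Serial numbers as a plain counter: trivially predictable."""

    def __init__(self, start: int) -> None:
        self._counter = count(start)

    def next_serial(self) -> int:
        return next(self._counter)


class RandomSerials:
    """Serial numbers drawn from a CSPRNG (64 bits is an arbitrary choice here)."""

    def next_serial(self) -> int:
        return secrets.randbits(64)


def predict_next(last_seen: int) -> int:
    """An attacker's guess against a sequential CA: last observed serial plus one."""
    return last_seen + 1


sequential_ca = SequentialSerials(start=1000)
observed = sequential_ca.next_serial()
guess_was_right = predict_next(observed) == sequential_ca.next_serial()
```

Against `RandomSerials` the same guess succeeds with probability about 2^-64 per attempt, which is what makes precomputing the colliding "bad" certificate impractical.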

This Attack
Which brings us to this new work, which involves two main contributions. First, the authors improved their collision-finding techniques so they need a lot less random-appearing data. Second, they found a CA which still used MD5 and didn't randomize the serial number. Taken together, this allowed them to convince the CA to sign a certificate which was in itself valid but which collided with a certificate that the CA would never have signed, in this case a certificate for a new, subordinate CA. (It could just as well have been a certificate for a specific target web site, but that's less flashy than a CA certificate.) Once in possession of this new CA certificate, it's possible for the authors to sign arbitrary new certificates which will be trusted by anyone who trusted the original CA [subject to some technical limitations which I won't go into here]. Effectively, the authors have made themselves a CA.

There are some interesting technical hacks needed to make this work: although the serial number is somewhat predictable, it's not completely so, and in order to mount the attack they had to guess the serial number in advance. This guess isn't totally accurate, but they were then able to issue their own CSRs to increment the serial number to where they needed it to be.
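The "nudge the counter" step can be sketched as follows. This is a toy model, assuming a CA whose serial number goes up by exactly one per issued certificate; `ToyCA` and `issue_at_target` are hypothetical names, and a real attacker also has to contend with other customers' requests arriving in between.

```python
from itertools import count


class ToyCA:
    """Hypothetical CA that hands out strictly sequential serial numbers."""

    def __init__(self, start: int) -> None:
        self._serials = count(start)
        self.last_issued = start - 1

    def issue(self, csr: str) -> int:
        self.last_issued = next(self._serials)
        return self.last_issued


def issue_at_target(ca: ToyCA, target_serial: int, real_csr: str):
    """Send filler CSRs until the next serial is the target, then send the real one."""
    if ca.last_issued >= target_serial:
        raise ValueError("target serial already passed; pick a new target")
    fillers = 0
    while ca.last_issued < target_serial - 1:
        ca.issue(f"filler-{fillers}")
        fillers += 1
    return ca.issue(real_csr), fillers


ca = ToyCA(start=640)
ca.issue("someone else's CSR")                       # serial 640
serial, fillers = issue_at_target(ca, 645, "colliding CSR")
```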

The impact of this is that the authors could in principle mount man-in-the-middle or other impersonation attacks on any Web server provided that the client trusted this particular CA (most do). The existence of this certificate doesn't allow anybody else to mount impersonation attacks, since ordinary attackers won't have the corresponding private key (unless they break into the authors' machine and recover it, of course). The authors have taken some steps to make the particular certificate they issued less useful for this purpose. In particular, it has an expiration date way in the past, so unless your clock is way off, you should notice this attack. That's not to say that there's no risk here, since you might not notice the expiration date issue.

Of course, it's possible that an attacker could independently use the same technique to acquire their own CA certificate. In fact, we don't know for a fact that nobody already has. The only real obstacle is that the crypto needed here is fairly involved and the experts on it are mostly respected academics, many of whom are on this paper. So, the sooner that CAs adopt the mitigations mentioned above, the better.

I should mention that this isn't the only way to get a bogus certificate: many CAs don't do a particularly good job of user verification in any case (I'll be posting about one particularly exceptional case shortly). In particular, it's common to use "email confirmation" for identity verification, where the CA sends email to the administrator of the relevant machine to verify the certificate request. There are probably a number of cases in which it's easier to attack that than to build up a whole certificate collision infrastructure.

There are really two questions about how to contain this vulnerability:

  • What should we do about this specific certificate?
  • What should be done about the class of vulnerability?

The two basic options for this certificate are to ignore it (assume we trust the researchers, especially since the certificate is expired) or to blacklist it. The way that the blacklist would work is that the browser manufacturers would just issue a security update with a patch to the certificate validation code telling it not to trust this specific certificate, just as they would patch any other security vulnerability. For perspective, we can think of this as a vulnerability with an exploit that is known only to the researchers: even though we have the CA cert, we can't use it productively, and it's not likely to be reproducible. If I were in charge of a browser, which I'm not, I would probably issue a patch with a blacklist for this certificate. Others' opinions may vary; as far as I know, the browser manufacturers didn't issue mandatory security updates blacklisting all the Debian OpenSSL keys, so that may be a cue to their general attitude.
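A blacklist check of this kind is simple enough to sketch. What follows is an illustration, not how any particular browser does it: the class and function names are made up, certificates are treated as opaque DER byte strings, and the fingerprint is SHA-256 over the whole encoded certificate (using MD5 for the fingerprint would rather defeat the purpose).

```python
import hashlib


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 over the full encoded certificate, as a hex string."""
    return hashlib.sha256(der_cert).hexdigest()


class CertificateBlacklist:
    """Hypothetical set of certificate fingerprints that must never be trusted."""

    def __init__(self, fingerprints=()):
        self._blocked = set(fingerprints)

    def block(self, der_cert: bytes) -> None:
        self._blocked.add(fingerprint(der_cert))

    def is_blocked(self, der_cert: bytes) -> bool:
        return fingerprint(der_cert) in self._blocked


def chain_is_acceptable(chain, blacklist: CertificateBlacklist) -> bool:
    """Reject the whole chain if any certificate in it is blacklisted."""
    return not any(blacklist.is_blocked(cert) for cert in chain)


# Placeholder bytes stand in for real DER-encoded certificates.
rogue_ca = b"rogue intermediate CA certificate"
site_cert = b"site certificate signed by the rogue CA"
root_cert = b"trusted root certificate"

blacklist = CertificateBlacklist()
blacklist.block(rogue_ca)
```

The check has to run over every certificate in the chain, since the rogue certificate would appear as an intermediate rather than as the site's own certificate.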

The second question is what to do about this class of vulnerability. Because this attack can only be mounted against a live CA, not against an old certificate, it's very important that the affected CAs either stop using MD5, use randomized serial numbers, or both. Presumably, the news coverage will act as an inducement for them to do so. I've also heard suggestions that the browser manufacturers should disable MD5. There are probably still enough MD5-using servers out there that this would be problematic, though it's something to consider for the future.

Bottom Line
As usual, don't panic. In its current state, this is more of a demonstration of a hole than a serious hole. Countermeasures are readily available to the CAs and if the remaining CAs fix their practices fast enough, then it's unlikely that there will be any more bad certificates issued (it takes some time to spin up your infrastructure for this attack). Even if one or two such certificates are issued, even to bad guys, it's not the end of the world. Once they're detected they can be blacklisted. This takes a long time with the current patching rate, but it's not conceptually any worse than a remotely exploitable problem with your browser, or a bug in certificate validation logic, both of which have been known to happen. That said, it is very important that the CAs do fix their practices, since this has the potential to become serious if the capability to mount the attack becomes widespread and convenient.

UPDATE: Some minor corrections due to Hovav Shacham (only controllable MD5 collisions were slow)


December 20, 2008

According to recent news coverage [*] [*] [*] Estonia is going to start allowing voters to use mobile phones to authenticate themselves for e-voting. It's a little hard to decipher the coverage, but this article suggests that voters aren't going to use the phone for the entire process but instead are going to use Internet-capable computer terminals for voting and the phones purely for authentication:
Estonia has been at the forefront of electronic voting for a number of years. In 2005 it started using a national ID card for authenticating voters and giving the go-ahead for using mobile phones is a continuation of that, according to Silver Meikar, a member of the Estonian Parliament and a longtime proponent of e-voting.

Voters will be authenticated using a digital certificate stored on SIM (Subscriber Identity Module) cards, which are already available to Estonians.

"You still need a computer and the Internet, but now you will have a choice of using your ID card plus card reader or a mobile ID to authenticate yourself," said Meikar.

Next on the agenda for the parliament following last Thursday's decision to allow mobile-phone authentication is to adapt the Internet voting system, which currently only supports the use of ID cards. "We are now starting to program the system, so at the moment we don't have the technical readiness," said Vinkel. Adding support for mobile authentication will take about six months, he added.

In general, I think it's fair to say that computer security researchers have a pretty negative view of Internet-based voting systems of this type, regardless of the authentication mechanism. This is a fairly complicated topic, but I wanted to try to explain some of the concerns.

First, it's important to be clear what sort of system we're talking about. There are a lot of ways to use the Internet for voting (results transmission, ballot distribution, registration, etc.) and I guess you could call any of them "Internet Voting". For the purposes of this post, however, I'm talking about a system where users vote on their own computers or mobile phones, which then transmit the results over the Internet back to a central consolidation point. One example of such a system is Everyone Counts, though I don't plan to talk about this system specifically.

There are a number of concerns with any system of this type. A nonexhaustive list would look something like this:

  • How are voters authenticated?
  • How do you prevent remote compromise of the tabulation system/EMS?
  • How do you verify that your vote was correctly tabulated?
  • How do you prevent remote compromise of the voter's terminal?

Voter Authentication
The voter authentication problem is probably the easiest to solve from a technical perspective. First, we understand how to do remote user authentication pretty well (though user interface and user compliance remain serious problems). It's certainly a lot easier if you can force all users to take some sort of authentication token, which seems to be the situation in Estonia. Moreover, the standards for voter authentication seem to be pretty low in any case. When I worked the polls in Santa Clara County, for instance, we were told we couldn't ask for identification unless the voter roll specifically told us to, which was more or less for first-time voters. Given this, it seems like you could use SSL with client certificates based on the smartcard. It's a little hard to tell how the Estonian system works, but it's probably something vaguely like this; given that it's based on cell phones, it might be AKA or some other 3G-type authentication system.
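For a sense of what "SSL with client certificates" means on the server side, here is a minimal configuration sketch using Python's standard `ssl` module. It says nothing about how the Estonian system is actually built; the function name and the file arguments are hypothetical, with the CA file standing in for whatever authority issues the voters' smartcard certificates.

```python
import ssl


def make_voting_server_context(ca_file=None, cert_file=None, key_file=None):
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Demand a client certificate and fail the handshake if none is presented.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Hypothetical path: the CA that issues voter (ID card) certificates.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # Hypothetical paths: the server's own certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


ctx = make_voting_server_context()
```

After a successful handshake, the server could read the voter's identity out of the certificate returned by `SSLSocket.getpeercert()` and check it against the voter roll.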

Remote Compromise of the EMS
Remote compromise of the EMS/tabulation system seems a lot more problematic. Pretty much by definition, there needs to be some Internet accessible server to receive your votes—otherwise it's not Internet voting. This means you need to worry about compromise of that server. How serious such compromise is depends on the way you've constructed your voting system. The naive way to build the system is as a sort of virtual DRE: users send their votes to the server which records them in memory, increments counters, etc. At the end of the election, you just spit out the votes and/or counter values. In such a system, compromise of the central server is extremely serious: an attacker can simply have the system output any election results of his choice. However, there are a variety of cryptographic mechanisms for building systems that are much more resistant to such attack, and in the limit don't require trusting the central server to deliver correct results at all. I'll talk about this very briefly under the next hed.

However, cryptographic voting systems don't provide a complete solution to the problem of server compromise. In particular, while they guarantee correct tabulation (for some value of guarantee), they don't guarantee availability. Consider what happens if the central server goes down on election night and nobody can record their vote. More creatively, an attacker could selectively block voting from specific individuals based on (for instance) their voter registration. Even if an anonymous authentication mechanism were used [technical note: for instance, certificates signed with blind signatures], an attacker could use IP identification and geolocation technology to get a pretty good idea of who voters were or at least where they were and thus selectively disenfranchise certain voters. Sure, in principle the voters could protest and maybe somehow get their votes to count (though this is much more complicated than it looks, since you have to worry about people who didn't vote on election day deciding retrospectively that they should have and then claiming they were denied service), but in practice how many would do so? So, denial of service is a real concern here.

Verifying Correct Tabulation
As I said above, it's possible to produce cryptographic systems which allow the demonstration of correct tabulation without requiring you to trust the tabulator. The details are complicated, but it's easy to see how to do it if you don't mind people's votes being published. You simply submit a digitally signed copy of your vote to the server. The server publishes all the signed votes. Once the election is over, you can verify that your vote was posted and that all the votes add up. Note that this mechanism is deeply flawed: for starters, it's generally not considered OK to post every vote. However, building a system with appropriate privacy guarantees is much harder and requires a fair bit more crypto.
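The publish-everything scheme is easy to sketch. The code below is a toy under loud assumptions: it uses Lamport one-time signatures built from SHA-256 (chosen only because they fit in a few lines of standard-library Python, and each voter signs exactly once), the voter and candidate names are invented, and there is no ballot privacy whatsoever.

```python
import hashlib
import secrets
from collections import Counter


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _bits(message: bytes):
    """The 256 bits of SHA-256(message), one per Lamport key pair."""
    return [(byte >> i) & 1 for byte in _h(message) for i in range(8)]


def keygen():
    """Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def sign(private, message: bytes):
    """Reveal one secret from each pair, selected by the message hash bits."""
    return [pair[bit] for pair, bit in zip(private, _bits(message))]


def verify(public, message: bytes, signature) -> bool:
    return len(signature) == 256 and all(
        _h(sig) == pair[bit]
        for sig, pair, bit in zip(signature, public, _bits(message))
    )


def audit(board, registered_keys):
    """Anyone can run this: check every posted vote's signature, then tally."""
    tally, seen = Counter(), set()
    for voter, vote, signature in board:
        if voter in seen:
            raise ValueError(f"duplicate vote from {voter}")
        if not verify(registered_keys[voter], vote.encode(), signature):
            raise ValueError(f"bad signature on vote from {voter}")
        seen.add(voter)
        tally[vote] += 1
    return tally


keys = {name: keygen() for name in ("alice", "bob", "carol")}
registered = {name: public for name, (_, public) in keys.items()}
choices = {"alice": "Jefferson", "bob": "Burr", "carol": "Jefferson"}
board = [(name, vote, sign(keys[name][0], vote.encode()))
         for name, vote in choices.items()]
tally = audit(board, registered)
```

A server that alters a posted vote would make `audit` fail for that entry, and a voter whose vote is missing from the board can see that directly. What the sketch cannot do is hide who voted for whom, which is exactly the flaw described above.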

I'm not an expert in cryptographic voting, but as far as I can tell, all the known systems have two major drawbacks. First, they require at least some fraction of voters to check that their votes are correctly recorded. It's not clear that voters will do this in practice. Note that the system I described above doesn't have that problem, but only because we've obliterated all the privacy guarantees. The second, more serious, problem is that they're complicated and convincing the average voter that they really prove what they are supposed to prove is extremely difficult. There's a fair amount of skepticism outside the crypto community about the degree to which the public at large is willing to trust systems that they don't really understand. [Note that one could argue that that's true of current computerized systems, but they are more familiar in operation and of course there is widespread distrust of such systems.]

Compromise of the Voter Terminal
Finally, we have to consider remote compromise of the voter's computer. Again, more or less by definition it's on the Internet, and personal computers are notoriously poorly maintained and vulnerable to attack (hence botnets). This threat is the hardest to secure against. A compromised terminal can present any information to the user it pleases. For instance, it could claim you're voting for Jefferson when actually you're voting for Burr. Even if afterwards you check your vote on some other computer and discover the fraud, there's no way for the electoral system to distinguish this from user error or buyer's remorse. As long as consumer operating systems remain as insecure as they currently are, it's pretty hard to see how to deal with this problem adequately.