A practical use of collisions

| Comments (2) | COMSEC
So, I'd been going around saying that the collision attacks against MD5 and SHA-1 were pretty useless against real protocols. At ECRYPT, Dan Bernstein pointed out to me that someone has actually used a collision attack successfully against APOP, an old challenge-response style protocol. The paper, by Leurent, is here (and rediscovered by Sasaki here).

The way that APOP works (like pretty much all challenge-response protocols) is that the server sends the client some challenge (a fresh value) M and the client sends back F(K,M) where K is the shared key and F is some function. In modern systems, people tend to like HMAC for F but APOP was designed before HMAC and in an era where people were pretty loose about hash functions. APOP computes the response as: MD5(M || K). This turns out to enable an attack, provided that the attacker can control the challenge.

The basic attack allows the attacker to determine one character of the password. Say he thinks the last character is C The attacker generates two colliding messages M1 and M2 with a special structure.

  • They are one hash block long.
  • The last byte of M1 is C.
  • The last byte of M2 is also C [fixed -- EKR]

So, we have:

  • M1 = xxxxxxxxxxC
  • M2 = yyyyyyyyyyC

Where X and Y are arbitrary and come out of the collision finding algorithm (actually, these are longer, but nobody wants to see 63 xs.).

The attack then requires the user to authenticate twice. The first time the attacker gives the challenge xxxxxxxxxx. This causes the user to return H(xxxxxxxxxx || P1 || P2 || P3 || ... Pn) where P1 is the first character of his password, P2 is the second, etc. The second time the attacker sends yyyyyyyyyy and gets back H(yyyyyyyyyy || P1 || P2 || P3 || ...). Now, here's the key point: these challenges have been specifically arranged so that the first byte (P1) of the password lines up with the last byte of the first hash block. If P1 == C then the two first hash blocks will collide. And since P2 ... Pn are the same, the entire hash output will collide. In other words, if the attacker has guessed C correctly, then the responses to these two different challenges will be the same. Otherwise they will not be (with high probability).

We've now extracted the first character of the password. That's not bad, but what about the rest? Well, it's straightforward to extend this once we know the first character. We build two new messages:

  • M1 = xxxxxxxxxP1C
  • M2 = yyyyyyyyyP1C
And we can repeat the same procedure extracting the password one character at a time.

This attack strategy has been known for quite some time and is originally due to Preneel and van Oorschot. However, the bottleneck was always that finding collisions was too expensive. The discovery of efficient paths for finding collisions changes that. If it's easy to find collisions, then this method becomes very practical.

Of course, "easy" is where things get complicated. There are two factors to consider here. The first is that it can be difficult to control the collision values, so you don't always get to choose that the last n characters of the colliding blocks are equal. Indeed, Leurent reports that he can only recover the first three bytes of the password. The second complication (and this applies only to APOP) is that in APOP challenges are RFC-compliant message ids, and the colliding blocks above most likely contain non-ASCII characters and so don't fit. Implementations which check for compliance are probably not vulnerable. However, Leurent reports that he's successfully mounted this attack on a variety of clients and it works, which isn't too surprising. Note that improved collision-finding techniques could lead to relaxation of both of these constraints.

The bottom line here is that if you're using APOP without TLS you should probably stop. On the other hand, if you're using APOP without some kind of encryption you should have stopped long ago...

2 Comments

I think you mean "the last byte of M2 is also C".

Is this really easier than a dictionary attack after one unsuccessful attempt? I guess that this attack works when the APOP password is not in any attack dictionary or algorithm, but I would still like to see a comparison between the work effort for the attack and a very deep dictionary run. Note that the dictionary attack is *much* less likely to raise suspicions since there is only one failure, not many.

Leave a comment