So, I'd been going around saying that the collision attacks against
MD5 and SHA-1 were pretty useless against real protocols. At ECRYPT,
Dan Bernstein pointed out to me that someone has actually used a collision
attack successfully against
APOP, an old challenge-response style protocol. The paper, by Leurent, is
here
(and rediscovered by Sasaki
here).
The way that APOP works (like pretty much all challenge-response protocols)
is that the server sends the client some challenge (a fresh value) M
and the client sends back F(K,M) where K is the shared
key and F is some function. In modern systems, people tend to
like HMAC for F but APOP was designed before HMAC and in an
era where people were pretty loose about hash functions. APOP computes
the response as: MD5(M || K). This turns out to enable an attack,
provided that the attacker can control the challenge.
The basic attack allows the attacker to determine one character of the
password. Say he thinks the last character is C
The attacker generates two colliding messages M1 and M2
with a special structure.
- They are one hash block long.
- The last byte of M1 is C.
- The last byte of M2 is also C [fixed -- EKR]
So, we have:
- M1 = xxxxxxxxxxC
- M2 = yyyyyyyyyyC
Where X and Y are arbitrary and come out of the collision
finding algorithm (actually, these are longer, but nobody wants
to see 63 xs.).
The attack then requires the user to authenticate twice. The first
time the attacker gives the challenge xxxxxxxxxx. This
causes the user to return H(xxxxxxxxxx || P1 || P2 || P3 || ... Pn)
where P1 is the first character of his password, P2
is the second, etc. The second time the attacker sends yyyyyyyyyy
and gets back H(yyyyyyyyyy || P1 || P2 || P3 || ...).
Now, here's the key point: these challenges have been specifically
arranged so that the first byte (P1) of the password lines up with
the last byte of the first hash block. If P1 == C then the two first
hash blocks will collide. And since P2 ... Pn are the same,
the entire hash output will collide. In other words, if the attacker
has guessed C correctly, then the responses to these two different
challenges will be the same. Otherwise they will not be (with high probability).
We've now extracted the first character of the password. That's not bad,
but what about the rest? Well, it's straightforward to extend this
once we know the first character. We build two new messages:
- M1 = xxxxxxxxxP1C
- M2 = yyyyyyyyyP1C
And we can repeat the same procedure extracting the password one character at a
time.
This attack strategy has been known for quite some time and is originally
due to Preneel and van Oorschot. However, the bottleneck
was always that finding collisions was too expensive. The discovery of
efficient paths for finding collisions changes that. If it's easy
to find collisions, then this method becomes very practical.
Of course, "easy" is where things get complicated. There are two factors
to consider here. The first is that it can be difficult to control
the collision values, so you don't always get to choose
that the last n characters of the colliding blocks are equal.
Indeed, Leurent reports that he can only recover the first three
bytes of the password. The second complication (and this applies only
to APOP) is that in APOP challenges are RFC-compliant message ids,
and the colliding blocks above most likely contain non-ASCII characters
and so don't fit. Implementations which check for compliance are
probably not vulnerable. However, Leurent
reports that he's successfully mounted this attack on a variety of
clients and it works, which isn't too surprising.
Note that improved collision-finding techniques could lead to relaxation
of both of these constraints.
The bottom line here is that if you're using APOP without TLS you
should probably stop. On the other hand, if you're using APOP without
some kind of encryption you should have stopped long ago...