Daum and Lucks argue that this shows that the current attacks on MD5 are serious:
Recently, the world of cryptographic hash functions has turned into a mess. A lot of researchers announced algorithms ("attacks") to find collisions for common hash functions such as MD5 and SHA-1 (see [B+, WFLY, WY, WYY-a, WYY-b]). For cryptographers, these results are exciting - but many so-called "practitioners" turned them down as "practically irrelevant". The point is that while it is possible to find colliding messages M and M', these messages appear to be more or less random - or rather, contain a random string of some fixed length (e.g., 1024 bit in the case of MD5). If you cannot exercise control over colliding messages, these collisions are theoretically interesting but harmless, right? In the past few weeks, we have met quite a few people who thought so. With this page, we want to demonstrate how badly wrong this kind of reasoning is! We hope to provide convincing evidence even for people without much technical or cryptographical background.
Superficially, this is a convincing argument, but I don't think it holds up under examination. First, consider the scenario Daum and Lucks envision:
- Alice prepares the pair of colliding files.
- The signing party views the "innocuous" version in a PostScript viewer. This is a key point because if you look at the source of the PostScript file you can see both alternative documents (though of course one could obfuscate this...)
- The signing party signs the innocuous document.
- Alice transfers the signature to the "bad" version of the file and presents it to the relying party.
- The relying party then views the bad version (again in a PostScript viewer) and is fooled.
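The signature transfer in the steps above can be sketched with a toy hash. This is not the Daum and Lucks construction and it does not attack MD5: it assumes a deliberately weak 16-bit digest so that a collision can be found by brute force, and it uses an HMAC under a made-up key as a stand-in for a real signature. All names and message texts are hypothetical.

```python
import hashlib
import hmac
from itertools import count

SIGNER_KEY = b"hypothetical signer key"  # stands in for a private key


def toy_digest(data: bytes) -> bytes:
    # Deliberately weak: keep only 16 bits of SHA-256 so a collision is cheap.
    return hashlib.sha256(data).digest()[:2]


def sign(document: bytes) -> bytes:
    # As with real signature schemes, the digest is signed, not the document.
    return hmac.new(SIGNER_KEY, toy_digest(document), hashlib.sha256).digest()


def verify(document: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(document), signature)


def colliding_pair():
    # Alice varies a filler field in each version until two digests match.
    innocuous_by_digest = {}
    for i in range(1024):
        doc = b"I agree to pay Alice $10. [filler %d]" % i
        innocuous_by_digest[toy_digest(doc)] = doc
    for j in count():
        bad_doc = b"I agree to pay Alice $10,000. [filler %d]" % j
        match = innocuous_by_digest.get(toy_digest(bad_doc))
        if match is not None:
            return match, bad_doc


innocuous, bad = colliding_pair()
signature = sign(innocuous)         # the signing party signs what it viewed
transfers = verify(bad, signature)  # Alice presents the other file
print(transfers)
```

Because only the digest is signed, any two files with the same digest share every signature made over either of them.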
What makes this all work is that what's being signed is a program and that the victim only sees the program's output and is willing to sign based on that. But if you're willing to do that, you've already got a problem, even without compromise of digest functions. Consider the following document:
This file contains a simple JavaScript function that displays
one document fragment if the current month is June and the
other fragment if it isn't. The links below let you force the
switch:
Click here to change to Not June mode
Click here to change to June mode
This technique lets us mount a simple attack: prepare a document like the one above. Set it to display the innocuous message from days 1-5 and then a less innocuous message after day 5. Get the signing party to sign sometime on day 1. Then on day 6 present it to the relying party. The signing party and the relying party see different things, just as in the Daum and Lucks case.
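A minimal sketch of why no hash collision is needed here, assuming a made-up document format in which the viewer picks the text from the day of the month. The field names, message texts, and dates are all hypothetical; the point is only that the signed bytes never change while the rendered text does.

```python
import hashlib
import json
from datetime import date

# Hypothetical "document as program": the viewer chooses the text at view time.
SIGNED_BYTES = json.dumps(
    {
        "cutoff_day": 5,
        "through_cutoff": "I agree to pay Alice $10.",
        "after_cutoff": "I agree to pay Alice $10,000.",
    },
    sort_keys=True,
).encode()


def render(document: bytes, today: date) -> str:
    # Stand-in for a viewing application that executes the document.
    program = json.loads(document)
    if today.day <= program["cutoff_day"]:
        return program["through_cutoff"]
    return program["after_cutoff"]


def digest(document: bytes) -> str:
    return hashlib.sha256(document).hexdigest()


digest_when_signed = digest(SIGNED_BYTES)
seen_by_signer = render(SIGNED_BYTES, date(2005, 6, 1))

digest_when_presented = digest(SIGNED_BYTES)
seen_by_relying_party = render(SIGNED_BYTES, date(2005, 6, 6))

print(digest_when_signed == digest_when_presented)
print(seen_by_signer)
print(seen_by_relying_party)
```

A signature over these bytes verifies on both days, even with a collision-free digest, because the signature covers the program and not what the program displays.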
There are a few obvious objections here. The first is that this is an HTML file, not a PostScript file. PostScript does have conditionals, but it doesn't seem to have a Date operator. There is probably some other conditional you could use, but I haven't looked too hard. PDF, however, has support for JavaScript, so you may be able to make it work with PDF. In any case, it's not clear why one would think that people are more willing to sign PostScript than HTML.
Second, this attack isn't quite as elegant as the Daum/Lucks attack. The signing party might decide to look at the file later and notice what had happened. However, a Date is just the simplest kind of conditional. JavaScript is quite powerful, and you should be able to use more sophisticated mechanisms to figure out what to display, e.g., by checking some remote web page. Actually, if you have a network connection, you can mount this kind of attack without having any kind of program on the client: just have the "document" be an inline image linked to in the HTML file the victim signs. You can then make it appear any way you want whenever you want, and even condition the behavior on which computer is doing the asking.
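The inline-image variant can be sketched the same way. This example makes no network requests: `alice_server` is a hypothetical function standing in for a web server Alice controls, and the host names and image descriptions are invented. The signed HTML contains only a link, so the signature says nothing about what the link returns.

```python
import hashlib
import re

# The only bytes the victim signs: an HTML file that links to an image.
SIGNED_HTML = (
    b'<html><body><img src="http://alice.example/contract.png"></body></html>'
)


def alice_server(url: str, client: str) -> str:
    # Hypothetical server logic: answer differently depending on who asks.
    if client == "signer.example":
        return "image of: I agree to pay Alice $10."
    return "image of: I agree to pay Alice $10,000."


def view(html: bytes, client: str) -> str:
    # Stand-in for a browser: find the image link and fetch it.
    src = re.search(rb'<img src="([^"]+)"', html).group(1).decode()
    return alice_server(src, client)


html_digest = hashlib.sha256(SIGNED_HTML).hexdigest()
seen_by_signer = view(SIGNED_HTML, "signer.example")
seen_by_relying_party = view(SIGNED_HTML, "relying-party.example")

print(html_digest)
print(seen_by_signer)
print(seen_by_relying_party)
```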
The bottom line here is that you can't safely sign content that you didn't create based purely on the way it appears in some viewing application (this is one of the concerns with XML signatures as well [*]). Daum and Lucks have just found another way to demonstrate this.
In a sense, this seems like a social attack. Very few of the folks
that used them saw postscript files as programs; they saw them
as device-independent documents and ignored the fact that
it was a complex program that allowed them to be rendered
on multiple platforms. The programming language has primitives
that have nothing obvious to do with display, and that increases
the risk.
It's also terribly easy to obfuscate, since you can
include characters by drawing bitmaps of them; you
could design your postscript program so that viewing
the source would tell you nothing. For a long time,
the only way to get CJK characters in postscript was
by bitmapping them, so there are quite a few programs
out there that will help you create bitmaps (even of
non-CJK characters).
So the next thing to do is to think of other programming
languages that the public thinks of as device-independent
viewers. The public treats "the web" as a device-independent
viewing medium (at least largely), so pretty much any
of the client side technologies fit the bill.
Maybe we need an MD5 of intent: does this MD5
map to what I intend it to? (or, as relying party) can I check
the intent using this MD5?
There must be a URI scheme for this somewhere....
This sounds like another reason why document-as-program is harmful from a security point of view, and unlike most such problems it can't be fixed by better sandboxing.
All this trouble because of the word "signature." Digital signatures are nothing like real world document signatures, and it seems this "attack" goes away if the distinction is made.
As I recall the XML-Signature guys had the same problems with specifying how to sign XML docs because they could render in any number of different ways depending on context. Haven't dug into the specs in a while, but there was a serious movement for a while to include and sign a bitmap version of the rendered document.
This is more of the same confusion. Signature in a legal sense should arise from a much more complex program context that specifies how to render, display signatures, etc.
Signature algs and hash functions have as much to do with signing as the properties of ink have to do with real world signatures.
Your [*] is broken; has an extraneous backslash.
I don't quite agree with Eric on this basic issue, but I'll point out that this attack scenario isn't really applicable to the non-repudiation world, where Alice fools Julius, and then tries to get a court to make him do something he didn't agree to. A serious examination of the source will reveal the attack when we're doing something as simple as conditional execution.
Where this is interesting is when someone views/reviews the data, approves and hashes it, and then some person or program makes a decision based on that hash or signature.
--John
This reminds me of a related issue in source code management: do you sign a (hash of a) patch or a (hash of the) state of the tree after the patch has been applied?
If you trust your software then both seem to be equivalent, but if you don't trust your software I think you are just screwed.