August 4, 2008

Authentication with low bandwidth trusted channels

Peter Saint-Andre recently suggested that we add a "Short Authentication String" (SAS) mode to TLS. SAS is only one solution to a not too uncommon problem: preventing man-in-the-middle attacks on public key protocols without the use of a third-party authentication system such as certificates or Kerberos. The general assumption is that you have some low-bandwidth, non-machine readable, trusted side channel (e.g., telephone) 1 that isn't good enough to do a real key exchange but that you want to use to bootstrap your way to a secure channel. You only need to do this once: after the first time you can have your software memorize the peer's public key or some other keying material and use that to provide continuity.

I'm aware of three major techniques here: fingerprints, password authenticated key agreement/exchange, and short authentication strings.

Fingerprints are probably the best known technique; they're what's used by SSH. You compute a message digest of your public key (or your self-signed certificate in DTLS-SRTP) and then communicate it to the other party over the trusted channel. Then when you do the key exchange over the untrusted channel, each side compares the other side's fingerprint to the key they presented. If they match, you're golden. If not, you may have been subject to a man-in-the-middle attack (or something else has gone wrong). The advantage of this technique is that you can compute a single static fingerprint and use it all the time, and the fingerprint can be public (this is what makes certificate systems work after all). Another advantage is that it's already compatible with TLS without any changes to the protocol. The disadvantage is that the fingerprint has to be comparatively long in order to prevent exhaustive search attacks on your public key where the attacker generates candidate private keys until they find one that has the right fingerprint. The complexity of this attack is dictated by the size of the fingerprint, so if you want 64 bits of security (probably the practical minimum), you need a 64 bit fingerprint, which means you're reading 16 hex digits over the phone, which starts to get into the inconvenient range.
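To make the length tradeoff concrete, here is a minimal sketch of computing and checking a truncated fingerprint. The choice of SHA-256, the 64-bit truncation, and the function names `fingerprint` and `matches` are my assumptions for illustration; this is not the exact format SSH or DTLS-SRTP uses.

```python
import hashlib
import hmac


def fingerprint(public_key_bytes: bytes, bits: int = 64) -> str:
    """Return the first `bits` bits of SHA-256(key) as hex, grouped in fours."""
    if bits % 4 != 0 or not 0 < bits <= 256:
        raise ValueError("bits must be a multiple of 4 between 4 and 256")
    hex_digits = hashlib.sha256(public_key_bytes).hexdigest()[: bits // 4]
    # Grouping makes the string easier to read aloud over the phone.
    return " ".join(hex_digits[i:i + 4] for i in range(0, len(hex_digits), 4))


def matches(presented_key: bytes, expected_fingerprint: str, bits: int = 64) -> bool:
    """Compare the key seen on the untrusted channel to the trusted fingerprint."""
    return hmac.compare_digest(fingerprint(presented_key, bits), expected_fingerprint)


# Hypothetical key bytes; a real key would be the encoded public key or certificate.
trusted = fingerprint(b"example public key bytes")
print(trusted)  # 16 hex digits in four groups
print(matches(b"example public key bytes", trusted))
```

Raising `bits` makes the search attack harder at the cost of more digits to read out, which is exactly the inconvenience described above.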

Another approach is a password-authenticated key exchange (PAKE) system like EKE or SRP. These systems let you securely bootstrap a low-entropy secret password up to a secure channel [I'm not going to explain the crypto rocket science here] in a way that isn't subject to offline dictionary attacks; the attacker needs to form a new connection to one of the sides for each password guess it wants to verify. The big advantage of a scheme like this is that the password can be short—32-bits is probably plenty. The disadvantage is that you can't use a single password, you need a different one for each person you are trying to authenticate with. Otherwise one counterparty can impersonate the other. Again, this is compatible with TLS as-is, since TLS provides an SRP cipher suite.

Finally, there are SAS schemes such as that described by Vaudenay. The idea here is that the key agreement protocol lets the parties jointly compute some value which they can then read over the trusted channel. You need to take some care doing this because the protocol needs to stop an attacker from forcing the SAS to specific values, but there are well-known techniques for that (again, see the Vaudenay paper). One problem with schemes like this is that you can't exchange the SAS over the trusted channel until after you've done the key exchange, whereas with the other two schemes you can exchange the authenticator (fingerprint or password) in advance, though you don't have to with the fingerprint scheme and even with the SRP scheme there are ways to do it afterwards [technical note: do your standard key exchange with self-signed certs, then rehandshake with SRP over the other channel when you want to verify the connection and then pass a fingerprint over the SRP-established channel].
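One of the well-known techniques for stopping an attacker from steering the SAS is to have one side commit to its random contribution before seeing the other's. The sketch below illustrates that general commit-then-reveal idea; it is not Vaudenay's exact protocol, and the message flow, the hash-based commitment, the 6-digit SAS, and all function names are assumptions of mine.

```python
import hashlib
import secrets


def hash_parts(label: bytes, *parts: bytes) -> bytes:
    """SHA-256 over length-prefixed parts so field boundaries are unambiguous."""
    h = hashlib.sha256(label)
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)
    return h.digest()


def commit(public_key: bytes, nonce: bytes) -> bytes:
    """Commitment that binds a party to its key and nonce without revealing the nonce."""
    return hash_parts(b"commit", public_key, nonce)


def short_auth_string(key_a: bytes, key_b: bytes,
                      nonce_a: bytes, nonce_b: bytes, digits: int = 6) -> str:
    """Derive a short decimal string from everything both sides saw."""
    digest = hash_parts(b"sas", key_a, key_b, nonce_a, nonce_b)
    return f"{int.from_bytes(digest, 'big') % 10**digits:0{digits}d}"


def run_exchange(key_a: bytes, key_b: bytes) -> tuple[str, str]:
    """Simulate an honest run; returns the SAS each side would read aloud."""
    nonce_a = secrets.token_bytes(16)
    commitment = commit(key_a, nonce_a)       # 1. Alice -> Bob: key_a, commitment
    nonce_b = secrets.token_bytes(16)         # 2. Bob -> Alice: key_b, nonce_b
    # 3. Alice -> Bob: nonce_a. Bob checks it against the earlier commitment.
    if commit(key_a, nonce_a) != commitment:
        raise ValueError("commitment does not open correctly")
    sas_alice = short_auth_string(key_a, key_b, nonce_a, nonce_b)
    sas_bob = short_auth_string(key_a, key_b, nonce_a, nonce_b)
    return sas_alice, sas_bob


alice_sas, bob_sas = run_exchange(b"alice-public-key", b"bob-public-key")
print(alice_sas, bob_sas)  # the two parties compare these over the phone
```

Because Alice is bound to her nonce before Bob picks his, a man in the middle gets only one guess per run at making the two strings collide, which is why a short string can be enough.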

None of these schemes is perfect. Optimally you'd be able to have a short, public, static authenticator, but I'm not aware of any such system, so you need to figure out which compromise is best for your environment.

1. ZRTP carries the SAS in the established audio channel trusting voice recognition to provide a secure channel. There are some reasons why I don't think this will work well, but they're orthogonal to what I'm talking about here.

Posted by ekr at 10:16 PM | Comments (0)

July 14, 2008

More on IPETEE

In the comments section, Olle (the proposal author) responds to my comments on IPETEE:
"Like IPsec, IPETEE lives at the IP layer" No, IPSec is an IP protocol, IPETEE is an application layer wrapper totally independent of IP-transport. It could just as well be used over any other network transport.

"one could easily adapt IPsec so that the KMP ran over the application channel" Yes, but it still wouldn't be equivalent to IPETEE because the transport is different (IP protocol 50, etc.). IPETEE doesn't mess with or even care about the underlying transport.

"one could easily deploy something like this with either SSL/TLS or IPsec" Not with IPSec, since it isn't transparent to the underlying network (see above). You could certainly do it with a modified TLS implementation, but why carry all that extra baggage when a slim implementation of the bare essentials will do?

This proposal is, as you point out, still a sketch. Actually it is just a brain-dump of a drinking session. It was found and "leaked" by some blog and revealed to the world before it was ready for prime-time (it hasn't even been proof-read). Fine. I'll deal with that. What it seems to be lacking most is the rationale behind the design choices, so I'll try to add that during this week.

/olle (the proposal author...)

...

One more thing:

"they pick an odd set of algorithms, in this case Salsa-20 and AES-CBC"

What's so odd about these? They were only chosen because they are the currently most widely recommended stream and block ciphers. If you have alternatives you prefer, please explain why (as I said the proposal has yet to see any technical review).

The "odd" AES mode with implicit IVs and ciphertext-stealing was chosen to avoid changing the size of datagrams when encrypting, btw.

Cheers,

/olle

I'm still not sure what layer IPETEE runs at. If you're running HTTP does IPETEE run above or below TCP? This does matter, since if it's the former you can simply use TLS/DTLS, whereas if it's the latter, you need to do something new, though it may be quite modest, like a framing layer for DTLS/TLS. With that in mind, it's not clear to me how one makes a system that does per-flow keying but lives below TCP/UDP, since the concept of flows (in IPv4) is primarily one that exists at the TCP/UDP level.

With respect to the point about "why carry all that extra baggage when a slim implementation of the bare essentials will do?", there are a number of kinds of overhead here. At minimum, there's the cost of design and implementation, code size, CPU, and data size on the wire. It's true that you can reduce to some extent the code size and on-the-wire data size by doing a special purpose design (though this is less than you'd think in terms of code size, since people tend to use OpenSSL as their crypto implementation, which means that unless you're pretty careful you end up eating the code size anyway), but (1) code size isn't that important in most settings and (2) this comes at a really high cost in terms of design and implementation. Designing and implementing a good cryptographic protocol is hard, even for experts, and so doing one that isn't flawed requires some serious thinking. And of course you can do a far more efficient implementation of SSL/TLS than OpenSSL in terms of code size if that's what you're optimizing for. It's not clear that you can do that much better with a custom protocol.

As far as on-the-wire data size and CPU cost, TLS/DTLS isn't optimal, but there's not that much room for improvement. There are five contributors to TLS/DTLS overhead:

You can reduce this somewhat without compromising security. I'm not going to work through the details here, but there's a certain minimum amount of overhead you need: a length field, a MAC to provide integrity (encryption without integrity is dangerous business), and an IV and probably a sequence number if you're using datagram transport. The IPETEE proposal claims to use a fixed IV and doesn't mention anything about a MAC or sequence number. This probably isn't safe except under fairly restricted attack models. You need to worry about both integrity attacks and pattern attacks from the fixed IV. (Incidentally, if you're going to use a stream cipher like Salsa20 for datagram transport, you need some method for using different keystream sections for each datagram or there are really serious integrity problems.) The CPU requirements are similarly fairly constant.
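To illustrate why each datagram needs its own keystream section, here is a small sketch using a toy stream cipher. The HMAC-SHA256 keystream generator is a stand-in I made up so the example runs with the standard library; it is not Salsa20 and not anything from the IPETEE proposal, and the sketch deliberately has no MAC.

```python
import hashlib
import hmac


def keystream(key: bytes, nonce: int, length: int) -> bytes:
    """Toy keystream: HMAC-SHA256(key, nonce || counter) blocks, concatenated."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = nonce.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_datagram(key: bytes, seq: int, plaintext: bytes) -> bytes:
    """XOR with the keystream section selected by the datagram sequence number."""
    return xor_bytes(plaintext, keystream(key, seq, len(plaintext)))


key = b"k" * 32
p1, p2 = b"attack at dawn", b"retreat at six"

# Same keystream section for both datagrams: the XOR of the ciphertexts
# equals the XOR of the plaintexts, so the key drops out entirely.
c1 = encrypt_datagram(key, 0, p1)
c2_reused = encrypt_datagram(key, 0, p2)
print(xor_bytes(c1, c2_reused) == xor_bytes(p1, p2))  # True

# A distinct sequence number per datagram avoids that relation.
c2_fresh = encrypt_datagram(key, 1, p2)
print(xor_bytes(c1, c2_fresh) == xor_bytes(p1, p2))   # False
```

The sequence number has to travel with the datagram (or be otherwise recoverable), which is part of the minimum overhead mentioned above.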

WRT the question of ciphers: if you want zero data expansion (and, as noted above, you generally do need *some* overhead) the standard procedure with AES is to use counter mode, not AES-CBC with ciphertext stealing. Salsa-20 isn't commonly used in any protocol I'm familiar with. That's not to say there's necessarily anything wrong with it, but it's not clear to me what the advantage is; standard procedure would be to stick with AES-CTR.

So far, I haven't heard any really compelling arguments why something entirely new is needed.

Posted by ekr at 9:16 PM | Comments (2)

July 13, 2008

IPETEE?

The Pirate Bay guys are floating a proposal for "Transparent end-to-end encryption for the internets". The basic idea seems to be IP-level encryption with an opportunistic, unauthenticated, inband key exchange:
The goal is to implement IP-transport encryption in a way that is transparent both to the IP-layer (including nodes in the network path) and to the applications that benefit from the encryption.

The solution inserts a crypto layer between the IP-stack and application. This could be implemented as a filter hook for an operating systems BSD-socket layer or as a network stack filter (Windows TDI, etc.).

Before establishing a "flow", defined as a new stream for stream oriented communications (i.e. TCP) and a new IP/port tuple for datagram oriented communications (i.e. UDP), key negotiation takes place over the data channel to establish a session key. If the key negotiation fails we fall back to unencrypted mode and just pass the application data untouched, otherwise the established session key is used to encrypt traffic before passing it down the stack and decrypt traffic before sending it up to the application.

This description is extremely sketchy, but it's still possible to get some initial impressions. As usual with amateur designs, this has some odd aspects. First, it looks like they're reinventing everything: both key management and packet formats. Second, they pick an odd set of algorithms, in this case Salsa-20 and AES-CBC with a fixed IV and ciphertext stealing.

But ignoring the details, it's interesting to look at the architecture. IPETEE isn't really isomorphic to any existing design.

Like IPsec, IPETEE lives at the IP layer, but unlike IPsec, where the key management protocol is out-of-band on a specific UDP port the IPETEE key management is in-band, apparently mixed with the application layer protocol. This has the advantage that there's much less of a NAT/Firewall traversal problem, since you don't need to worry about punching a hole through the firewall for the key management protocol. However, if I'm understanding correctly, because the key management data is on the same channel as the data, this means that if you try to connect to a node which isn't IPETEE-aware, you'll most likely cause a protocol error, which doesn't happen with IPsec, where your KMP just times out if the other side doesn't recognize it. Note that one could easily adapt IPsec so that the KMP ran over the application channel instead of on a separate channel.

SSL/TLS, of course, does all its key management (and everything else) in the data channel. So, as with IPETEE, there's no NAT traversal problem. The advantage of IPETEE over SSL/TLS (whether in application or SSL-VPN style applications) is that it will support any protocol that runs over IP, regardless of the transport protocol. It's not clear how big an advantage that is, since pretty much all major applications run over TCP or UDP, and so you can use TLS or DTLS.

The other advantages of IPETEE aren't really architectural, but rather implementation issues. First, unlike SSL/TLS/DTLS-style applications, IPETEE is clearly designed to be transparent and automatic, as opposed to being under application control. However, that's not a protocol issue, but just an implementation issue. It's quite possible to do a kernel/driver version of SSL/TLS. This sort of thing was contemplated when SSL/TLS was first designed, but it didn't take off, to a great extent because one of the major advantages of SSL/TLS was that it could be implemented at the application layer and didn't require any messing around in the kernel or driver layers. There's a tradeoff here between universality and ease of deployment.

Second, because IPETEE doesn't bother to authenticate either side of the connection, there's not really any endpoint configuration required beyond installing the software. Again, though, this isn't really an architectural advantage. As many have noted, one could easily deploy something like this with either SSL/TLS or IPsec, and IETF even has a working group (BTNS) doing something very similar for IPsec. Moreover, any opportunistic system (one where you don't know whether the other side will do security) has downgrade attack issues, where the attacker forces you down to cleartext. This system is actually worse, since an attacker can also man-in-the-middle you undetectably if there's no credential checking. Also, as Hovav Shacham pointed out to me, if you try to renegotiate with each connection, there are more downgrade opportunities. This can be dealt with to some extent by caching the other side's capabilities, but this interacts unpleasantly with NATs. Again, these are all architectural issues, not implementation ones; you just need to decide what tradeoffs you want.

None of this is to say that this system won't take off, of course, but from a technical perspective it doesn't seem like IPETEE has any major technical advantages that couldn't be easily gained by adapting well-understood existing protocols.

Posted by ekr at 10:17 PM | Comments (3)

June 11, 2008

Negative externalities of key cracking

Aside from being kind of pointless, Terence Spies pointed out to me today that there's a real negative externality to an attempt to crack Gpcode's RSA key. Once you've bothered to build a big distributed RSA key cracking system (this assumes of course that this is practical, which isn't clear), there's a temptation to use it, and there are lots of 1024-bit and smaller RSA keys floating around in the world. It's not at all clear that the benefit from cracking the public key used for a single piece of ransomware exceeds the cost of a crack of long-term keys used for legitimate purposes.

Posted by ekr at 6:21 PM | Comments (0)

June 10, 2008

On cracking ransomware

Gpcode is a "ransomware" virus that infects your machine, encrypts your data under some RSA public key, and asks you to pay money to get the decryption key. Kaspersky Labs is trying to start a project to crack the public key, which would allow them to recover the data. According to Kaspersky, they broke an earlier key because it wasn't generated securely, but it sounds like they're trying to attack this one directly. This seems pretty unscalable. Even if they do manage to factor the RSA modulus—which seems unlikely unless they gather a pretty surprising amount of computing power— whoever is releasing the virus can just create a new, longer, public key. The whole point of cryptography is to give an insurmountable advantage to the defender. That's not going to change just this time because the people using cryptography are mean.

Posted by ekr at 9:20 PM | Comments (4)

May 22, 2008

The Debian/SSL incident and Open Source software

As you may have heard by now, Debian introduced a distribution level patch to OpenSSL that pretty much completely wiped out the PRNG, with the result that it generated predictable keys. Plenty has been written about this, but it's worth noting that this bug has been hanging around for two years and was far from hidden. On the contrary, there was an outstanding bug documenting the "problem" that resulted in the patch and it wasn't hard to find the corresponding fix in Debian SVN. So, here we have a fairly obvious (to a security expert) error in a section of code that is well known to be security critical, specifically called out in the bug database and yet it took two years for someone to notice. What does that say about how difficult it would be to insert and hide a backdoor in a piece of software?

Posted by ekr at 6:24 PM | Comments (0)

May 13, 2008

Tunnel-only ISPs?

Lauren Weinstein is rightly concerned about Charter Communications' plans to "enhance" your browsing experience by injecting banner ads into your Web pages based on analysis of your browsing habits.

If this is something you're not that thrilled about (which I can easily understand), then you might get to thinking about what your options are. Charter offers an opt-out, but as far as I know there's nothing forcing them to do so, and their opt-out appears to be pretty inconvenient:

Yes. As our valued customer, we want you to be in complete control of your online experience. If you wish to opt out of the enhanced service we are offering, you may do so at any time by visiting www.charter.com/onlineprivacy and following our easy to use opt-out feature. To opt out, it is necessary to install a standard opt-out cookie on your computer. If you delete the opt-out cookie, or if you change computers or web browsers, you will need to opt out again.

You could just change ISPs, of course, if you're lucky enough to live in a non-monopoly area and your other choices don't offer this enhanced feature set.

As Weinstein observes, one possible defense is to do HTTPS connections to every server, but that requires cooperation from all the server operators, which has the usual network effect/collective action problems. But there's at least one obvious way to protect yourself unilaterally: set up a VPN to some provider who promises not to mess with your packets. You'd still be getting packet carriage from Charter, but they wouldn't be able to mess with your packets much, other than to drop or delay them. Certainly, they would not be able to inject their own traffic. This technique would probably introduce some latency, but the provider could locate their VPN concentrator near a major exchange point, which would reduce the latency quite a bit. The major obstacle would be finding someone to provide this service; I know there are providers which do IPv6 tunnels, but I don't know if they do v4 tunnels.

The effect of all this is to reduce your local ISP to raw packet carriage. Effectively, you're treating them like a long wire between you and your real ISP, the tunnel provider. Obviously, local ISPs could stop you from doing this, but it's hard to see on what grounds they would do so if they don't block enterprise VPNs.

Posted by ekr at 12:57 PM | Comments (5)

May 9, 2008

Notes on P2P Blocking and Evasion

In preparation for the IETF P2P Infrastructure Workshop, I've revised and expanded this post into a position paper submission.

Introduction

In mid-2007 it was revealed [4] that Comcast was blocking peer-to-peer traffic (most famously BitTorrent) on their network by injecting RST packets to terminate TCP [7] connections. The BitTorrent community almost immediately discovered carrying BitTorrent over an encrypted tunnel (VPN or SSH) was not subject to blocking, thus completing another cycle of the ongoing arms race between peer-to-peer implementors and network operators. This paper explores some predictable next moves in the game and their consequences for the network.

This isn't intended to be comprehensive, because the request was for short papers, but I think it hits the high points. You can find the full note here.

Posted by ekr at 10:19 AM | Comments (0)

April 20, 2008

What the heck is Format Preserving Encryption?

Voltage (full disclosure: I have a number of friends there and I'm on their TAB) have released a technology they call Format-Preserving Encryption (FPE). The basic technology here is fairly old and is described in a paper by Black and Rogaway, but as far as I know, this is the first attempt to try to put it together in a single commercial package. Below I attempt to describe some of the relevant technical issues, which are sort of interesting.

Why FPE?
The use case for FPE is simple: say you have a database that contains information with multiple levels of sensitivity. So, for instance, if you're Amazon you might have a customer database that any employee can access but you'd like the credit card numbers to be accessible only to employees that really need it.

The classic approach here would be to use database access controls. This works well as long as you trust the DB server, but if, for instance, you want to send a copy of the DB to someone else, then you may not be able to trust their server, so you need to redact the database, which can be a pain. Another problem here is that sometimes sensitive information like CCNs is used for customer identification, which means you can't just redact the CCN. Rather, you need to replace it with something that's unique but doesn't leak the CCN itself. And of course, if someone compromises your database server, then all bets are off.

The problem with simple encryption
The natural alternative is to use encryption. Encrypting the whole database doesn't help, because you want users to have access to most of the database, just not to the sensitive fields. So, what you need to do is encrypt just the sensitive fields. This turns out to be trickier than it looks.

For example let's say we want to encrypt the social security number 123 45 6789 using AES-ECB. So, we might do:

This kind of sucks. Not only have we managed to start with a 9 digit string and end up with a 128-bit random-appearing value, none of the bytes of the output are ASCII digits. So, if our database or database software is expecting to have values for this field that look like SSNs, we've just broken that invariant.

The source of the problem, of course, is that we're using a block cipher in ECB mode, and most block ciphers come in a small number of sizes (64, 128, and 256 bits are the standard ones). A block cipher just randomly maps the input space onto the output space, so ECB mode encryption effectively selects a random b-bit value (where b is the block size). The smaller the fraction of the possible values that are valid, the higher the probability that the output will be invalid. To take the specific case of SSNs, there are approximately 2^{30} valid values (if we think of the trailing zeros as not counting), so the chance of producing a valid value by random chance is vanishingly small (order 2^{-98}).
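The 2^{-98} figure is just the ratio of valid values to possible block values. A quick check of the arithmetic, using 10^9 as the count of 9-digit values (the function name is mine):

```python
import math


def log2_valid_fraction(valid_values: int, block_bits: int) -> float:
    """log2 of the chance that a uniformly random block is a valid value."""
    return math.log2(valid_values) - block_bits


ssn_log2 = log2_valid_fraction(10**9, 128)
print(f"chance of a valid SSN: about 2^{ssn_log2:.1f}")  # about 2^-98.1
```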

One thing you might think would make sense would be to use a different mode than ECB, say counter. The problem with counter mode in this case is that you need to use a different section of keystream (or a different key) to encrypt each value to avoid easy cryptanalytic attacks. So, you need some per-value distinguisher that gets carried along with the ciphertext, which expands the amount of storage you need for the encrypted values, even as it keeps the ciphertext small.

Luby-Rackoff
As noted above, our big problem is that our block size is too large. Even though SSNs are 9 digits long, they are sparsely packed (for instance letters aren't allowed), so there are approximately 2^{30} valid SSNs, as long as we use a better mapping than straight 1-1 digit correspondence. For instance, think of the 9 digit SSN as a value from 1 to 999,999,999 (not all 9-digit numbers are valid SSNs, but for simplicity, let's pretend they are). We can represent that in binary as a 30 bit quantity. If we had a 32 bit block cipher, we could encrypt this value with less than 10% expansion, which might be OK under some circumstances (we'll describe how to do better below).

Ordinary block ciphers have blocks much larger than this, of course, but it turns out that there's a generic technique for making block ciphers of arbitrary size (actually, even bit lengths only), called Luby-Rackoff (L-R). The nice thing about L-R is that it's a general construction based on a pseudorandom function (PRF), which we know how to build with standard cryptographic techniques.
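Here is a minimal sketch of the construction as a balanced Feistel network over a 30-bit block. Using HMAC-SHA256 as the PRF, the 8-round count, and the function names are all my assumptions; this is not Voltage's implementation, and I'm not making any claim about how many rounds are needed for security at such small block sizes.

```python
import hashlib
import hmac


def round_function(key: bytes, round_index: int, half: int, half_bits: int) -> int:
    """PRF for one round: HMAC-SHA256 truncated to half_bits bits."""
    message = bytes([round_index]) + half.to_bytes(4, "big")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") & ((1 << half_bits) - 1)


def _check(value: int, block_bits: int) -> None:
    if block_bits % 2 != 0 or not 2 <= block_bits <= 64:
        raise ValueError("block_bits must be even and between 2 and 64")
    if not 0 <= value < (1 << block_bits):
        raise ValueError("value does not fit in the block")


def feistel_encrypt(key: bytes, value: int, block_bits: int = 30, rounds: int = 8) -> int:
    _check(value, block_bits)
    half_bits = block_bits // 2
    left, right = value >> half_bits, value & ((1 << half_bits) - 1)
    for r in range(rounds):
        # Each round: (L, R) -> (R, L xor F(R)).
        left, right = right, left ^ round_function(key, r, right, half_bits)
    return (left << half_bits) | right


def feistel_decrypt(key: bytes, value: int, block_bits: int = 30, rounds: int = 8) -> int:
    _check(value, block_bits)
    half_bits = block_bits // 2
    left, right = value >> half_bits, value & ((1 << half_bits) - 1)
    for r in reversed(range(rounds)):
        # Undo one round: (L', R') -> (R' xor F(L'), L').
        left, right = right ^ round_function(key, r, left, half_bits), left
    return (left << half_bits) | right


key = b"demo key, not for real use"
ciphertext = feistel_encrypt(key, 123456789)
print(ciphertext < 2**30, feistel_decrypt(key, ciphertext))  # True 123456789
```

The structure is invertible no matter what the round function is, which is why any PRF will do and why the block size can be picked to fit the data.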

Cycle Walking
We can use L-R to build a block cipher with a block size of any number of bits we want, but such a function produces 2^b possible values, where b is the block size, and this generally won't line up perfectly with the set of values we want to encipher. To return to our SSN example, we have 10^9 possible values, which means we need a block size of 30 bits, which implies a set size of 2^{30} = 1073741824. So, for any given input value, there's about a 7% chance that it will encrypt to an invalid SSN (greater than 10^9). If the database (or software) is really aggressive about validity checking, then you'll have an unacceptable rejection rate.

To deal with this issue, Black and Rogaway describe a technique they call "cycle-walking". The idea is that we start with an initially valid value (1-999,999,999) and then encrypt it. If the ciphertext is also valid, we stop and emit it. If it's invalid (greater than 999,999,999), we encrypt again, and repeat until we have a valid value. This gives us an encryption procedure that is guaranteed to produce an in-range output. Decryption is done in the same way.
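The loop itself is tiny. In this sketch the underlying 30-bit permutation is a toy affine map that I'm using only so the example is self-contained; it is a bijection but offers no security, and in practice you'd plug in a cipher built with L-R. For simplicity the valid set is taken to be 0 to 999,999,999 rather than starting at 1, and all names and constants are my own.

```python
BLOCK_BITS = 30
MODULUS = 1 << BLOCK_BITS
DOMAIN_SIZE = 10**9            # valid values are 0 .. 999,999,999

# Toy permutation parameters (assumptions). An odd multiplier is invertible mod 2^30.
MULTIPLIER = 0x2545F491
OFFSET = 0x1234567
MULTIPLIER_INVERSE = pow(MULTIPLIER, -1, MODULUS)


def toy_permute(x: int) -> int:
    """Insecure stand-in for a 30-bit block cipher's encryption function."""
    return (MULTIPLIER * x + OFFSET) % MODULUS


def toy_unpermute(y: int) -> int:
    """Inverse of toy_permute."""
    return ((y - OFFSET) * MULTIPLIER_INVERSE) % MODULUS


def cycle_walk(value: int, permutation, domain_size: int) -> int:
    """Apply the permutation until the result lands back inside the valid set."""
    if not 0 <= value < domain_size:
        raise ValueError("input must be a valid value")
    result = permutation(value)
    while result >= domain_size:
        result = permutation(result)
    return result


def fpe_encrypt(value: int) -> int:
    return cycle_walk(value, toy_permute, DOMAIN_SIZE)


def fpe_decrypt(value: int) -> int:
    # Same walk, but stepping backwards with the inverse permutation.
    return cycle_walk(value, toy_unpermute, DOMAIN_SIZE)


ciphertext = fpe_encrypt(123456789)
print(f"{ciphertext:09d}", fpe_decrypt(ciphertext))
```

The walk always terminates because repeatedly applying a permutation eventually cycles back to the starting value, which is itself valid.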

Bottom Line
So, why can't we just use cycle-walking? Because it only works well if the block size is approximately right. If the size of the valid set is a lot smaller than the block size of the cipher, then you have to do a lot of iterations in order to get an in-range result. So, you can't use a 64-bit block cipher in order to encrypt an SSN because you end up having to do a prohibitive number of iterations; you need to use L-R to construct a block cipher of approximately the right size and then use cycle-walking to shave off the last few values.

UPDATE: Paul Hoffman pointed out to me privately that it's not clear how this all relates to FPE. Basically, FPE means the combination of L-R plus cycle walking. This lets you do one-to-one and onto encryption for most set sizes. If the set size is really small, there's another technique (also due to Black and Rogaway): you encrypt all possible input values and then sort the ciphertexts. You then use the index of the ciphertext in the sorted list as the encrypted value. This is obviously prohibitively expensive unless the number of possible values is small because it requires encrypting all possible values and then keeping a very large mapping table.
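The small-set technique can be sketched as follows. I'm using HMAC-SHA256 of each value in place of a block cipher encryption so that the example needs only the standard library; that substitution, the table layout, and the names are my assumptions, not Black and Rogaway's exact construction.

```python
import hashlib
import hmac


def build_tables(key: bytes, domain_size: int) -> tuple[list[int], list[int]]:
    """Return (encrypt_table, decrypt_table) for values 0 .. domain_size - 1."""
    # "Encrypt" every possible value (here: a keyed PRF output per value).
    tags = [
        hmac.new(key, value.to_bytes(4, "big"), hashlib.sha256).digest()
        for value in range(domain_size)
    ]
    # Sort the values by their tag; a value's rank in this order is its ciphertext.
    decrypt_table = sorted(range(domain_size), key=lambda value: tags[value])
    encrypt_table = [0] * domain_size
    for rank, value in enumerate(decrypt_table):
        encrypt_table[value] = rank
    return encrypt_table, decrypt_table


encrypt_table, decrypt_table = build_tables(b"demo key", 100)
ciphertext = encrypt_table[42]
print(ciphertext, decrypt_table[ciphertext])  # some value in 0..99, then 42
```

Both tables hold one entry per possible value, which is the memory cost that makes this impractical for anything but small sets.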

Posted by ekr at 9:43 AM | Comments (0)

April 12, 2008

TPMS and privacy

Schneier expresses concern about the tire pressure monitoring system (TPMS). The way TPMS works is that each wheel contains a pressure sensor and a radio transmitter which transmits pressure data to a receiver in the car, which somehow alerts you if the pressure is too low. The alleged problem is that in order to allow distinguishing wheels from each other (and from those in adjoining cars), each wheel has a unique identifier, raising the possibility that one could build a radio receiver which would listen for these transmissions and track your car.

Obviously, this isn't that attractive a feature, as Hexview observes:

What problems exactly does the TPMS introduce? If you live in the United States, chances are, you have heard about the "traffic-improving" ideas where transportation authorities looked for the possibility to track all vehicles in nearly real time in order to issue speeding tickets or impose mileage-adjusted taxes. Those ideas caused a flood of privacy debates, but fortunately, it turned out that it was not technically of financially feasible to implement such a system within the next 5-10 years, so the hype quickly died out.

Guess what? With minor limitations, TPMS can be used for the very purpose of tracking your vehicle in real time with no substantial investments! TPMS can also be used to measure the speed of your vehicle. Similarly to highway/freeway speed sensors that measure traffic speed, TPMS readers can be installed in pairs to measure how quick your vehicle goes over a predefined distance. Technically, it is even plausible to use existing speed sensors to read TPMS data!

...

As every other tracking technology, the TPMS was introduced as a safety feature "for your protection". One might wonder why NTHSA (a government agency) would care so much about a small number of accidents related to under-pressurized tires. And why would it choose to mandate TPMS and not run-flat technology? Are we being tracked already? I hope not.

It's absolutely true that NHTSA required TPMS. It doesn't look to me, however, like NHTSA required this particular implementation, or any particular implementation. They just required that the car be able to detect loss of pressure by more than 25%. As Hexview observes, there is a simple implementation that dramatically reduces the privacy problem: encrypt the sensor readings, and as far as I can tell this would be quite compatible with the NHTSA requirements (this doesn't totally eliminate the problem because of radio fingerprinting, but this is harder than just reading the ID out of the air). The good news is that since there's no need for my car to be able to read your car's tire pressure, it's quite possible for manufacturers to do the right thing without any kind of new standard.

Hexview implies that NHTSA may have required TPMS in order to enable them to monitor your whereabouts, but I find that somewhat unlikely. Certainly, when I was involved with DSRC/WAVE, privacy was foremost on everyone's minds, so it would be strange if NHTSA were to deliberately attempt to violate driver privacy. That said, the manufacturers were also pretty concerned about privacy, so if they have rolled out a system that enables tracking, that's a little surprising.

Posted by ekr at 9:26 PM | Comments (3)

March 26, 2008

WoW, glider, and the difficulty of attestation

I'm not a WoW player but a bunch of my friends are, and they seem to put in a really enormous number of hours just acquiring experience and loot. I guess this is pretty boring even by WoW standards, so it's not at all surprising that people have developed automatic WoW players. Now Blizzard is suing MDY, the creator of one such bot, MMO Glider. Unsurprisingly, Blizzard doesn't like bots, since they provide a very substantial advantage to bot users over everyone else (again, I'm not a WoW player, but I think one has to concede they have a point here), and goes to a lot of effort to block them.

Unfortunately for Blizzard, determining what software is running on a remote computer controlled by your adversary is known to be an incredibly difficult problem. As far as I know there is no general solution that doesn't involve some sort of trusted computing base on the remote computer (cf. TCG), 1 which of course most people don't have. That hasn't stopped Blizzard from trying, of course. They install a program called Warden on your computer which tries to detect whether you're running cheat programs in parallel with WoW itself. Unsurprisingly, MDY has circumvention technology which evades Warden. So, from a technical perspective, this is a losing game for Blizzard. However, that doesn't mean that they can't win their lawsuit.

As I understand the situation, Glider isn't a WoW reimplementation, it's just a control program for WoW. So you start up WoW (or rather Glider does) and then Glider runs the various WoW operations for you. Blizzard argues that running WoW this way exceeds the EULA and so by building a tool designed to be used this way, MDY is engaged in contributory copyright infringement.

I'm not a lawyer, so I'm not going to offer an opinion on the value of this argument, but say that this holds up in court, does MDY have a technical recourse? That's a difficult problem. Since Glider depends on WoW, if they're enjoined from doing that, then life gets a lot harder. They obviously could do a WoW client implementation from scratch, but aside from that being a lot of work, it is actually incredibly easy for Blizzard to detect; they simply can have the server ask the client for a randomly chosen (by the server) section of its code. In order to emulate a real client, Glider would need to have a copy of the WoW client floating around. Would sending the requested copy to Blizzard then constitute copyright infringement as well?
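The random-section check described above can be sketched roughly as follows. This is only an illustration of the idea, not anything Blizzard is known to do: the function names, the 64-byte span, and the in-memory "images" are all assumptions made up for the example.

```python
import hmac
import secrets

def make_challenge(image_len, span=64):
    """Server side: pick a random section of the client code to ask for."""
    offset = secrets.randbelow(image_len - span + 1)
    return offset, span

def respond(client_image, challenge):
    """Client side: return the requested bytes of whatever code it holds."""
    offset, span = challenge
    return client_image[offset:offset + span]

def verify(reference_image, challenge, response):
    """Server side: compare the response against its own copy of the client."""
    offset, span = challenge
    expected = reference_image[offset:offset + span]
    return hmac.compare_digest(expected, response)

# Hypothetical stand-ins for the real client binary and a from-scratch clone.
genuine = bytes(range(256)) * 8
clone = bytes(len(genuine))

challenge = make_challenge(len(genuine))
print(verify(genuine, challenge, respond(genuine, challenge)))  # genuine client
print(verify(genuine, challenge, respond(clone, challenge)))    # reimplementation
```

Note that anything holding a full copy of the genuine client passes this check, which is exactly why a reimplementation would need that copy floating around.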

1. The two contexts in which this problem is most relevant are DRM (where the content provider wants to be able to determine that the playing application will enforce its content controls) and network access control/network endpoint assessment, where the network wants to determine that an endpoint is uninfected. In neither case are there adequate solutions against an adversarial endpoint.

Posted by ekr at 10:27 PM | Comments (1)

March 13, 2008

IETF Report: routing security

The topic of routing security has started to heat up quite a bit in IETF. Historically, there have been two general types of routing security measures:

The second class of mechanisms (e.g., S-BGP) hasn't really seen any significant deployment, despite the fact that there is a real threat from incorrect advertisements. (See this post about the Pakistan/YouTube outage for an example.)

The first class of mechanisms has seen modest deployment, but the protocols are fairly primitive, with insecure (or at least pre-modern) MAC functions and minimal support for key management. Basically, you use a shared key between the communicating routers (a pair in the case of unicast protocols like BGP or LDP, or a group in the case of multicast/broadcast protocols like IS-IS or OSPF). All was well (or at least quiet) until 2005, when Bonica et al. published a draft which was intended to make key rollover easier for integrity-protected TCP and also to update the MAC algorithms. This, coupled with some concerns about the lack of automated keying mechanisms, caused an avalanche effect of interest in revising all the routing adjacency security mechanisms.

IETF 71 had two meetings addressing this topic:

For some reason that's not entirely clear to me, I got sucked into this stuff. My materials are below:

Posted by ekr at 7:35 AM | Comments (0)

March 6, 2008

What happened in Pakistan?

There's more than one way to censor information you don't like on the Internet. At the end of February, Pakistan's Telecommunication authority decided they didn't like a specific YouTube video and issued an order requiring ISPs to block access to YouTube. The ISPs responded by advertising BGP routes to blackhole YouTube's traffic. Unfortunately, they screwed up and the routes leaked, bringing down YouTube for everyone. Danny McPherson at Arbor Networks has the story.
Either way, the net-net is that you're announcing reachability to your upstream for 208.65.153.0/24, and your upstream provider, who is obviously not validating your prefix announcements based on Regional Internet Registry (RIR) allocations or even Internet Routing Registry (IRR) objects, is conveying to the rest of the world, via the Border Gateway Protocol (BGP), that you, AS 17557 (PKTELECOM-AS-AP Pakistan Telecom), provide reachability for the Internet address space (prefix) that actually belongs to YouTube, AS 36561.

To put icing on the cake, assume that YouTube, who owns 208.65.153.0/24, as well as 208.65.152.0/24 and 208.65.154.0/23, announces a single aggregated BGP route for the four /24 prefixes, announced as 208.65.152.0/22. Now recall that routing on the Internet always prefers the most specific route, and that global BGP routing currently knows this:

  • 208.65.152.0/22 via AS 36561 (YouTube)
  • 208.65.153.0/24 via AS 17557 (Pakistan Telecom)

And you want to go to one of the YouTube IPs within the 208.65.153.0/24. Well, bad news.. YouTube is currently unavailable because all the BGP speaking routers on the Internet believe Pakistan Telecom provides the best connectivity to YouTube. The result is that you've not only taken YouTube offline within your little piece of the Internet, you've single-handedly taken YouTube completely off the Internet.
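The most-specific-route rule in the passage above is easy to see in a few lines of code. This is a minimal sketch using only the two announcements listed; the `best_route` helper and the sample destination addresses are made up for illustration, and real BGP route selection involves far more than prefix length.

```python
import ipaddress

# The two announcements from the list above.
ROUTES = [
    (ipaddress.ip_network("208.65.152.0/22"), "AS 36561 (YouTube)"),
    (ipaddress.ip_network("208.65.153.0/24"), "AS 17557 (Pakistan Telecom)"),
]

def best_route(address, routes=ROUTES):
    """Return the (prefix, origin) pair with the longest matching prefix."""
    addr = ipaddress.ip_address(address)
    matches = [(net, origin) for net, origin in routes if addr in net]
    if not matches:
        return None
    # Longest prefix wins: the /24 beats the covering /22.
    return max(matches, key=lambda match: match[0].prefixlen)

# Hypothetical destinations inside and outside the hijacked /24.
print(best_route("208.65.153.238"))
print(best_route("208.65.152.10"))
```

Addresses inside the /24 go to the more specific announcement even though the /22 also covers them; the rest of the /22 is unaffected.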

The problem here is that BGP security is a complete mess. To a first order anyone can advertise any route and they'll be believed. In other words, the Internet is horribly vulnerable to routing attacks. There's been some work on trying to prevent this sort of thing from happening (whether via accidental misconfiguration or, worse yet, maliciously) but none of the solutions (S-BGP, SoBGP, etc.) has gone very far, in part because many of the proposed designs are really heavyweight and in part because (or so I'm told) the database of who actually owns what prefix is in such bad shape that you can't use it as a basis for cryptographic assertions about who can advertise what.

Posted by ekr at 10:02 PM | Comments (1)

February 27, 2008

IETF 71 Reading List

C. Jennings, B. Lowekamp, E. Rescorla, J. Rosenberg, S. Baset, H. Schulzrinne, REsource LOcation And Discovery (RELOAD), draft-bryan-p2psip-reload-03.txt.

D. McGrew, E. Rescorla, Datagram Transport Layer Security (DTLS) Extension to Establish Keys for Secure Real-time Transport Protocol (SRTP), draft-ietf-avt-dtls-srtp-02.txt.

J. Fischl, H. Tschofenig, E. Rescorla, Framework for Establishing an SRTP Security Context using DTLS, draft-ietf-sip-dtls-srtp-framework-01.txt.

E. Rescorla, Keying Material Extractors for Transport Layer Security (TLS), draft-ietf-tls-extractor-01.txt.

T. Dierks, E. Rescorla, The Transport Layer Security (TLS) Protocol Version 1.2, draft-ietf-tls-rfc4346-bis-09.txt.

Posted by ekr at 9:33 PM | Comments (1)

Why would you want an identity-based signature?

One of the first rules of crypto is that if there's a crypto primitive that's possible to build, no matter how stupid, someone will eventually build it. Nothing wrong with that—that's what cryptographers are supposed to do. But just because something is possible doesn't mean it's useful. Case in point, identity-based signatures. You may have heard of Identity-Based Encryption, in which the public key and private key are derived from your identity (e.g., your email address). Anyone can compute the public key, but you need to get the private key from a key generating authority (KGA) which serves a similar role to the CA in a PKI system. The value proposition here is that you don't need a copy of someone's certificate in order to encrypt a message to them—you can compute their public key knowing only their identity (and which KGA they use). More on this here. This means that there's no need for a certificate directory, which has historically been one of the inconvenient parts of PKI.

Unsurprisingly, IBE has a signature variant, known as Identity-Based Signatures. The basic concept here is the same: the public key is derived from your identity and you get your private key from the KGA. The value proposition is the same too: anyone can verify your signature without having your certificate. The problem is that it doesn't really add much value. In a PKI system, when you send a signed message you send (Message, Signature, Certificate). In an IBS system, you send (Message, Signature, Identity). Otherwise, the data flow is the same. Basically, IBS is just a fancy (OK, really fancy) way of compressing the signer's certificate. 1

So, why am I going on about this? Someone just suggested using IBA in the IETF SIP WG (draft here, mailing list discussion here, starting with my review).

1. Indeed, as Hovav Shacham pointed out to me, the difference between an ordinary PKI system and an IBS system is to some extent a matter of semantics. Think of the certificate as part of the signature and certificate verification as part of the signature verification. It's true that the signature isn't deterministic, but then plenty of signature schemes (e.g., DSA), aren't.

Posted by ekr at 9:19 PM | Comments (3)

February 20, 2008

Wikileaks shut down

Cayman bank Julius Baer Bank and Trust has convinced a federal judge to shut down DNS service for wikileaks.org.
On Friday, Judge Jeffrey S. White of Federal District Court in San Francisco granted a permanent injunction ordering Dynadot, the site's domain name registrar, to disable the Wikileaks.org domain name. The order had the effect of locking the front door to the site -- a largely ineffectual action that kept back doors to the site, and several copies of it, available to sophisticated Web users who knew where to look.

Domain registrars like Dynadot, Register.com and GoDaddy.com provide domain names -- the Web addresses users type into browsers -- to Web site operators for a monthly fee. Judge White ordered Dynadot to disable the Wikileaks.org address and "lock" it to prevent the organization from transferring the name to another registrar.

The feebleness of the action suggests that the bank, and the judge, did not understand how the domain system works, or how quickly Web communities will move to counter actions they see as hostile to free speech online.

The site itself could still be accessed at its Internet Protocol address (http://88.80.13.160/) -- the unique number that specifies a Web site's location on the Internet. Wikileaks also maintained "mirror sites," or copies usually produced to ensure against failures and this kind of legal action. Some sites were registered in Belgium (http://wikileaks.be/), Germany (http://wikileaks.de) and the Christmas Islands (http://wikileaks.cx) through domain registrars other than Dynadot, and so were not affected by the injunction.

There's also a mirror at cryptome.

For those of you who don't know how this all works, there are registries, who actually run the top-level domain (.org in this case), and then there are registrars, who actually deal with the customers. Any given top-level domain typically has multiple registrars that service it, all of whom populate the same database, operated by the registry. So, the locking thing stops Wikileaks from transferring their domain to another registrar who would then reactivate it.

OK, so this order controls the registrar. But can Wikileaks just go to the registry and get them to move it to some other registrar, locking notwithstanding? In this case, Wikileaks is under .org, which is run by the Public Interest Registry. Operationally, the PIR is run by Afilias. Both of these are based in the US, so presumably the injunction could be expanded to include them as well. On the other hand, as the article notes, there are plenty of registries with no US connection, and the only way for a US judge to take down the domains under them would be to go after ICANN, which, despite complaints about the US running the DNS, seems pretty unlikely.

As you may be gathering at this point, this is all pretty pointless. It's basically impossible to censor stuff like this once it gets out. We're seeing the first level of countermeasure here, but even if by some miracle the judge managed to shut down every domain name serving the contraband material (unlikely, since the decision loop for spreading those domain names is a lot faster than your average judge's decision-making process), people can just move to IP addresses published by some other means (like other people's web sites). And there are about three levels of escalation up from there, all of which are progressively harder to censor.

It will be interesting to see if JBBT goes after cryptome.org, though.

Posted by ekr at 9:42 PM | Comments (0)

February 16, 2008

Overproduction

The EFF has obtained a document under FOIA describing an incident in which an email provider was served with an NSL for some email communications and accidentally sent far too much information to the FBI:
In late February 2006, a surge in data being collected by the FBI's Engineering Research Facility (ERF) was identified by ERF personnel. As a result ERF investigated the issue and recognized that the collection tools used to collect email communication from the subject of the investigation were improperly set and appeared to be collecting data from the entire email domain. due to an apparent miscommunication, the private internet provider accidentally collected mail from the entire domain and subsequently conveyed the email to ERF.
(NYT story here).

I'm sort of curious what kind of tools the ISPs are using here. You certainly can reconfigure your mailer to forward copies of emails to certain addresses to somewhere else, though mail going out is a little trickier. In any case, I'd be a little surprised if the FBI expected something quite so DIY. Maybe when they send you an NSL it comes with a pamphlet telling you how to reconfigure Outlook.

Apparently, this happens reasonably often. The FBI calls it "overproduction":

A report in 2006 by the Justice Department inspector general found more than 100 violations of federal wiretap law in the two prior years by the Federal Bureau of Investigation, many of them considered technical and inadvertent.

...

In the warrantless wiretapping program approved by President Bush after the Sept. 11 terrorist attacks, technical errors led officials at the National Security Agency on some occasions to monitor communications entirely within the United States -- in apparent violation of the program's protocols -- because communications problems made it difficult to tell initially whether the targets were in the country or not.

Past violations by the government have also included continuing a wiretap for days or weeks beyond what was authorized by a court, or seeking records beyond what were authorized. The 2006 case appears to be a particularly egregious example of what intelligence officials refer to as "overproduction" -- in which a telecommunications provider gives the government more data than it was ordered to provide.

The problem of overproduction is particularly common, F.B.I. officials said. In testimony before Congress in March 2007 regarding abuses of national security letters, Valerie E. Caproni, the bureau's general counsel, said that in one small sample, 10 out of 20 violations were a result of "third-party error," in which a private company "provided the F.B.I. information we did not seek."

To quote Broken Arrow, "I don't know what's scarier, losing a nuclear weapon or that it happens so often there's actually a term for it." Outstanding!

Posted by ekr at 9:19 PM | Comments (0)

February 15, 2008

I could tell you but then I'd have to kill you

Now that the House has at least temporarily refused to pass the extension of the administration's warrantless wiretapping power, there's a lot of talk about how it's destroying the security of America. For instance, here's President Bush:
"Our intelligence professionals are working day and night to keep us safe," Mr. Bush said, "and they're waiting to see whether Congress will give them the tools they need to succeed or tie their hands by failing to act."

Obviously this could be true, but we have no way to tell whether it is or not because from the beginning the Bush Administration has kept pretty much all the details about the program, including whether it's done anything useful, secret, even from Congress:

(CBS/AP) With legislation that would legalize President Bush's eavesdropping program entangled in a battle over the side issue of corporate immunity, the White House sought to move the process forward by acceding to requests from the Senate Judiciary Committee to view classified documents its members have long demanded.

However, the White House continued to draw a line between Senators and House members.

Senate Judiciary Committee Chairman Patrick Leahy, D-Vt., had demanded that other members of the panel have the same access to the same documents before he considers giving immunity to telecommunications companies that may have tapped Americans' telephones and computers without court approval. The measure is an amendment in the Senate's version of the bill rewriting the Foreign Intelligence Surveillance Act (or FISA).

White House Counsel Fred Fielding had offered to let Chairman Patrick Leahy and ranking Republican Arlen Specter see documents that might persuade them to include liability protection for telephone companies, but initially only to them.

Later Thursday, the White House agreed to expand the documents' distribution.

I'm not saying that this program isn't essential. The problem is that we have no way of knowing because the administration has deliberately denied the public the information it would need to assess what the program is and how and whether it works. We're told that that information is classified, and it's strongly implied that if we did have the information we would agree that it was important.

Again that could be true, but I remember back in the 90s when the debates over cryptography export controls were going on and we were told almost exactly the same thing, namely that wiretapping was really important and that if we just could see the classified information we would be in favor of keeping the controls. There was widespread skepticism about these claims on the not entirely implausible theory that the NSA might not be entirely objective about the tradeoffs between their desire to listen in on everyone's communications and people's desire to keep them private, and that just maybe it was a lot easier for them to make their case if, you know, the public didn't know anything. Anyway, when the NRC committee studying crypto policy investigated these claims, they concluded that:

This unclassified report does not have a classified annex, nor is there a classified version of it. After receiving a number of classified briefings on material relevant to the subject of this study, the fully cleared members of the committee (13 out of the total of 16) agree that these details, while necessarily important to policy makers who need to decide tomorrow what to do in a specific case, are not particularly relevant to the larger issues of why policy has the shape and texture that it does today nor to the general outline of how technology will and policy should evolve in the future.

The basic problem here, as with the cryptography issue, is that there's a conflict of interest when the people who favor some particular policy also control the supply of information about the merits of that policy—they have a natural incentive to characterize the evidence in the way most favorable to their position. This is of course natural, but it should make you pretty suspicious when you're told that you can't have the information you would need to make an informed decision.

Posted by ekr at 10:18 PM

January 26, 2008

Skype Lawful Intercept

News is circulating of a German plan to build a "Skype-Capture-Unit", software which would live on your computer (be surreptitiously installed by the government) and capture the media for analysis. This is necessary because Skype is encrypted so ordinary capture mechanisms just get ciphertext. It's a little hard to read what's being proposed, but it sounds like the software would actually divert a copy of the plaintext to the monitoring station.

If this is indeed what the German government is planning on doing, it's actually kind of lame. First, it's inefficient since you need twice as much bandwidth, for the original media stream and the copy to the monitoring station. Second, it's easy to detect, because you're using a lot more bandwidth. An approach which would be much harder to detect would be to arrange to leak the encryption key and then capture the ciphertext using standard monitoring techniques. The key leakage can be done in such a way that it's very hard to detect.

The document also describes an SSL interception system. I'm finding it a little hard to decode, but it talks about a man-in-the-middle attack, which is also easier to detect than necessary. Again, this doesn't seem like the most efficient technique; it would be easier to just leak the keys.

As I've mentioned before, since Skype controls the software, they could assist the government with LI if they chose. This document is at least suggestive that they're not doing that.

Posted by ekr at 8:41 PM | Comments (0)

January 24, 2008

Telecom immunity is back

As you may have heard, the FISA telecom immunity bill is back. If you haven't heard, the administration is pushing a bill that would, among other things, provide retroactive immunity for telecoms who participated in the warrantless wiretapping program. A few months ago when this was last up for debate, I wrote to Dianne Feinstein about this. Probably not coincidentally, I got an email from her office about this the other day:
The Intelligence Committee's report on the bill includes declassified text stating that the Executive branch provided letters to electronic communication service providers at regular intervals. These letters all directed or requested assistance and noted that the assistance was authorized by the President and was legal. The Committee's report can be found at http://intelligence.senate.gov/071025/report.pdf.

I introduced an amendment on the Senate floor that would limit this grant of immunity. Under my amendment, cases against the telecommunications companies would go to the FISA Court for judicial review. The Court would only provide immunity if it finds that the alleged assistance was not provided, that assistance met legal requirements, or that a company had a good faith, reasonable belief that assistance was legal.

This seems like a pretty low bar. There are actually three cases:

  1. The telcos thought that they were legally required to enable wiretapping.
  2. The telcos thought that they didn't have to enable wiretapping but that it was legally permitted.
  3. The telcos thought that they were legally forbidden from enabling wiretapping.

The basic rationale for immunity seems to be that the telcos thought they were doing their civic duty and shouldn't be punished if it turns out that it was actually illegal (note that this stance is a bit belied by the much-publicized revelation that the telcos stopped the wiretaps when the government didn't pay). This isn't crazy: certainly, if the telcos were in receipt of a court order directing them to wiretap some set of communications I would expect them to comply (though a telco which was known to have actively resisted the order would certainly be one I'd want to give my business to) and a grant of immunity seems reasonable in such a case—though I'm not sure that one was required. So, if the telcos can demonstrate that they actually had a good faith belief that they were legally required to comply then immunity seems appropriate.

Similarly, if it turned out that the telcos thought they were actually violating the law then immunity seems totally unreasonable. On the other hand, it would be fairly surprising if they were stupid enough to leave records lying around that said "let's do this totally illegal thing." Is there anyone who thinks that they should have immunity in this case? (This isn't to say that the law as currently proposed doesn't grant immunity here; I haven't checked. If it does, Feinstein's amendment would be an improvement.)

So, case (2) is the interesting case: the telcos thought they had some discretion and decided to exercise it in the government's favor and not that of their customers. That's certainly a reasonable business judgement and of course there are powerful reasons for getting on the government's good side, but getting sued and losing a lot of money in case what they've decided to do is actually illegal is the business risk you take in such cases. If you want (and I do) the telcos to take any interest at all in your privacy, then they actually have to bear some risk in cases when they decide not to do so.

That said, whether the telcos get punished is actually not the most important piece here. As I understand it, one effect of the immunity grant is to effectively foreclose a lot of the lawsuits currently filed against the telcos. Since those suits were a major avenue for public discovery of what really happened in this program, the immunity grant would also act to keep the details of the program secret, which is bad if you think that this is the kind of thing that ought to be publicly discussed rather than just done in secret. I'd be much more receptive to a bill which granted immunity in return for full disclosure, but of course that's not what Feinstein's amendment does, since the immunity determination is made by the FISA court.

Posted by ekr at 9:26 PM | Comments (2)

January 9, 2008

A cheap anti-frontrunning countermeasure: digested WHOIS

The reason that front running works at all is that WHOIS leaks the name of the domain you're looking up to the WHOIS service operator, and in this case the operator is the adversary, thus giving them an opportunity to get ahead of you. The usual answer to this problem is to create a set of policies that treat WHOIS queries as sensitive information (see ICANN's study on front running). One could, for instance, require the WHOIS operator to treat WHOIS queries as private.

However, there's a cheap, compatible, technical hack that substantially increases the difficulty of front running attacks without any new policies: allow WHOIS searches on hashes of domain names. The way this would work is that the WHOIS operator would create a parallel tree of phony domain name registrations in WHOIS. For instance, if I registered example.org, then they would also create an entry for SHA-1(example.org)=20116dfd6774a9e7b32eddfea3f6cb094e38fc3f.org (we might need to register a new TLD to make this work and guarantee no collisions) and populate it with the record for example.org. Then, I could locally compute the hash for each domain name I wanted to check on and easily verify its existence or nonexistence.
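Here's a rough sketch of the client side of this scheme. The exact input encoding behind the hash shown above isn't specified, so the normalization here (lowercase, no trailing dot, ASCII) is an assumption, as are the function names and the in-memory set standing in for the operator's parallel tree.

```python
import hashlib

def digested_name(domain):
    """Map a domain to its hashed stand-in: SHA-1(name) as a label under the same TLD."""
    name = domain.strip().lower().rstrip(".")
    tld = name.rsplit(".", 1)[-1]
    digest = hashlib.sha1(name.encode("ascii")).hexdigest()
    return f"{digest}.{tld}"

def is_registered(domain, whois_index):
    """Check existence using only the hashed name, never the cleartext one."""
    return digested_name(domain) in whois_index

# Hypothetical stand-in for the operator's parallel tree of hashed entries.
whois_index = {digested_name("example.org")}

print(digested_name("example.org"))
print(is_registered("example.org", whois_index))
print(is_registered("some-unregistered-name.org", whois_index))
```

The operator only ever sees a 40-hex-digit label, so for an unregistered name it learns nothing directly useful; it would have to guess the name and hash it to find out what you were looking for.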

Other properties:

You still might need ICANN or someone like them to force the operators to do this, since it's not clearly in their interest. On the other hand, direct cost to them is so low that it's hard to really object to on difficulty grounds.

Technical Note 1: This problem is related to private information retrieval but is dramatically easier because we don't care about the server knowing what record we fetched if it exists, only if it doesn't exist. Actually, we only care if the record exists. We don't need the record itself, which makes the problem yet easier.

Technical Note 2: It would be nice to have a solution that didn't allow dictionary attack. The best solutions I know use Bloom filters.

Note that the hash solution isn't technically constant size in the number of registered names either—the required hash size for any given false negative probability depends on the number of registered names. However, since 160 bits is so small people just think of this as constant size.
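For readers who haven't met Bloom filters, here is a toy one. How exactly a WHOIS operator would use it isn't spelled out above; one plausible reading, and it is only an assumption, is that the operator publishes a filter of registered names which clients download and query locally. The class, its parameters, and the SHA-256-based bit positions are all choices made for this illustration.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        data = item.lower().encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

registered = BloomFilter()
registered.add("example.org")
print("example.org" in registered)
print("some-unregistered-name.org" in registered)
```

As with the hash approach, the size you need for a given error rate grows with the number of registered names, so neither structure is really constant size.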

Posted by ekr at 8:34 AM | Comments (5)

January 8, 2008

What the fxck?

So, the other day I needed to register an account with Chase. Part of the registration procedure is the now ubiquitous out-of-band delivery of some token/code to your cell phone, email address, etc. (I've heard this called "loopback" or "answerback").

Here's what they offer me (somewhat reformatted and trimmed, and with my phone number redacted even further):

We'll show you all contact information we have on file for you. Some of it may be outdated, but we'll only send you an Activation Code using the method you select. Note: For security reasons, we have hidden parts of your contact information below with "xxx." Learn more about why we do this.

...

Phone Call to :
 xxx-xxx-YYYY
 xxx-xxx-WWWW
 xxx-xxx-YYYY

OR E-mail Message to: exr@rtfm.com

First, note the duplication of the phone numbers: YYYY appears twice. A minor issue, though. Note also the substitution of the numbers with xxx.

Now check out the email address: exr@rtfm.com. Note that the middle letter of my username is k, not x, and this isn't xxx either. Anyway, so I'm registering here for the first time and I look at this clearly wrong address, and I figure it's a typographical or transcription error (x looks like k, after all) and I should call to get it fixed. But no.... this is just their masking technique. 1 OK, so maybe I'm a moron, but seriously, how hard would it be to use * instead of x, like everyone else?

1. Mrs. Guesswork hypothesizes that their algorithm is to take the middle part of my username and so if my username were longer I would get the full xxx treatment.
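For what it's worth, here's a sketch of the hypothesized algorithm with the mask character made configurable. This is a guess at the behavior, not Chase's actual code: the function name, the "keep the first and last character" rule, and the three-character cap are all assumptions.

```python
def mask_local_part(address, mask_char="*", max_masked=3):
    """Hide up to max_masked characters in the middle of the username."""
    local, at, domain = address.partition("@")
    if len(local) <= 2:
        return address  # nothing in the middle to hide
    count = min(max_masked, len(local) - 2)
    start = (len(local) - count) // 2
    masked = local[:start] + mask_char * count + local[start + count:]
    return masked + at + domain

print(mask_local_part("ekr@rtfm.com", mask_char="x"))  # exr@rtfm.com
print(mask_local_part("ekr@rtfm.com"))                 # e*r@rtfm.com
print(mask_local_part("rescorla@example.com"))         # re***rla@example.com
```

With x as the mask character a three-letter username produces something that looks like a plausible typo; with * nobody would mistake it for one.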

Posted by ekr at 8:21 PM | Comments (4)

January 1, 2008

Yeah, you need to watch that hard drive

Dave Winer is unhappy that he took his Mac to the Apple store with a broken hard drive. Apple replaced the drive but then wouldn't give it back. Winer claims they're going to refurbish it and give it to someone else and is concerned about data leakage.

I share this concern. I generally don't let others have access to my hard drive even if I expect them to give it back—for instance if they're repairing some other part of the computer. In theory, you can clean off the hard drive if it's functioning properly, so you can take a backup, wipe the drive, and then restore it when the computer comes back. But of course once the hard drive itself starts to fail, then disk wiping tools present an obvious problem, so you either need to keep possession of the hard drive, or use encryption. Encryption has the obvious advantage that you don't need to replace your own hardware, but of course it's more of a pain to use upfront and you need to worry about losing your data if you lose the encryption key (that's kind of the point, after all).

That said, I do kind of wonder whether the drive is actually going to be refurbished. Hard drive technology changes pretty fast and I wonder if it's really worth refurbishing old drives.

See also FSJ on Winer.

Posted by ekr at 8:55 PM | Comments (9)

December 23, 2007

Transitioning to universal HTTPS (2)

In my original post on Lauren Weinstein's suggested adoption of universal HTTPS, I said that MITM attacks were an issue I would address in a separate post. This is that post. As cryptographers and COMSEC engineers never tire of pointing out, if your channel isn't authenticated then you're very vulnerable to active attackers. The classic attack is what's called a man-in-the-middle 1 (MITM) attack, but in general the problem is that you can end up talking to the attacker when you think you're talking to the right person. There are a lot of proposed solutions for this, but the only one that really works when you're trying to talk to someone you don't know is to have someone you do know (or at least trust) vouch for them. In TLS this is done with certificates, and the third party you trust is the certificate authority.

Whenever this topic comes up, you hear a lot of complaining about the difficulty and expense of obtaining certificates. For instance, here's Weinstein:

Certificates are required to enable TLS encryption in these environments, of course. And while the marketplace for commercial certs is far more competitive now than it was just a few years ago, the cost and hassle factors associated with their purchase and renewal are very relevant, especially for larger sites with many operational server names and systems.

It's certainly true that certs are an obstacle, though not as big an obstacle as people think. You can get a certificate for as little as $9/year. It's a little inconvenient, but it wouldn't be hard for Web hosting providers (who typically charge rather more than that) to simply issue you a certificate (or work with a CA to do so) as part of your Web site setup. But still, this is obviously more inconvenient than not doing anything. So, do you need a certificate at all? Here's Weinstein again:

However, in a vast number of applications where absolute identity confirmation is not required (particularly when commerce is not involved), self-signed certificates are quite adequate. Yes, as I alluded to in my previous blog posting, there are man-in-the-middle attack issues associated with this approach, but in the context of many routine communications I don't feel that this is as high a priority concern as is getting some level of crypto going as soon as possible.

Given their significant capabilities, why then are self-signed certs primarily employed within organizations, but comparatively rarely for servers used by the public at large, even where identity confirmation is not a major issue?

A primary reason is that most Web browsers will present a rather alarming and somewhat confusing (for the typical user) alert as part of a self-signed certificate acceptance query dialogue. This tends to scare off many people unnecessarily, and makes self-signed certificate use in public contexts significantly problematic.

Security purists may bristle at what I'm going to say next, but so be it. I believe that we should strongly consider something of a paradigm shift in the manner of browsers' handling of self-signed certificates, at the user's option.

When a browser user reaches a site with a self-signed certificate, they would be presented with a dialogue similar to that now displayed, but with additional, clear, explanatory text regarding self-signed certificates and their capabilities/limitations. The user would also be offered the opportunity to not only accept this particular cert, but also to optionally accept future self-signed certs without additional dialogues (this option could also be enabled or disabled via browser preference settings).

This topic has been debated endlessly on mailing lists, so I have no intention of detailing all the arguments. Instead, here's the bullet point version.

For

Against

This doesn't really cut in either direction, but another possibility is to reserve the https: URL scheme for real certs but to have clients auto-negotiate SSL/TLS silently where possible (like RFC 2817 but done right). This at least gives you channel confidentiality, and if you cache the fact that you negotiated SSL/TLS, then some active attack resistance.

Note that an active attacker can of course downgrade you to straight HTTP (who knows how people respond to whatever warning accompanies "hey, I just negotiated HTTP even though before I was doing HTTPS?") but, then, they could MITM self-signed certs and Weinstein's argument that they won't:

Any ISP that was caught playing MITM certificate substitution games on encrypted data streams without explicit authorization would certainly be thoroughly pilloried and, to use the vernacular, utterly screwed in the court of public opinion -- and quite possibly be guilty of a criminal offense as well. I doubt that even the potentially lucrative revenue streams that could be generated by imposing themselves into users' Web communications would be enough to entice even the most aggressive ISPs into taking such a risk. But if they did anyway, the negative impacts on their businesses, and perhaps on their officials personally as well, would be, as Darth Vader would say, "Impressive. Most impressive."

seems kinda shaky to me.

1. Amusingly, the Wikipedia entry on Interlock, a protocol designed to stop MITM, reads in part:

Most cryptographic protocols rely on the prior establishment of secret or public keys or passwords. However, the Diffie-Hellman key exchange protocol introduced the concept of two parties establishing a secure channel (that is, with at least some desirable security properties) without any such prior agreement. Unauthenticated Diffie-Hellman, as an anonymous key agreement protocol, has long been known to be subject to man in the middle attack. However, the dream of a "zipless" mutually authenticated secure channel remained.

As far as I know, this use of the term "zipless" comes from Erica Jong's novel Fear of Flying, where it refers to a somewhat different type of interaction.

Posted by ekr at 5:46 AM | Comments (3)

December 19, 2007

Hey, mind if we tap that phone?

Ryan Singel reports that despite the rather lax standards required for wiretaps, some FBI agents seem to have decided that they could skip procedure:
The revelation is the second this year showing that FBI employees bypassed court order requirements for phone records. In July, the FBI and the Justice Department Inspector General revealed the existence of a joint investigation into an FBI counter-terrorism office, after an audit found that the Communications Analysis Unit sent more than 700 fake emergency letters to phone companies seeking call records. An Inspector General spokeswoman declined to provide the status of that investigation, citing agency policy.

The June 2006 e-mail (.pdf) was buried in more than 600-pages of FBI documents obtained by the Electronic Frontier Foundation, in a Freedom of Information Act lawsuit.

The message was sent to an employee in the FBI's Operational Technology Division by a technical surveillance specialist at the FBI's Minneapolis field office -- both names were redacted from the documents. The e-mail describes widespread attempts to bypass court order requirements for cellphone data in the Minneapolis office.

Remarkably, when the technical agent began refusing to cooperate, other agents began calling telephone carriers directly, posing as the technical agent to get customer cellphone records.

Federal law prohibits phone companies from revealing customer information unless given a court order, or in the case of an emergency involving physical danger.

The actual document is here.

Posted by ekr at 8:38 PM | Comments (1)

December 14, 2007

Password disclosure and the 5th Amendment

Orin Kerr points to the decision in in re Boucher where a magistrate ruled that forcing someone to disclose their PGP password violates the Fifth Amendment. This question has been the topic of an unbelievable amount of amateur lawyering on cypherpunks and associated mailing lists and a lot of that gets repeated in the Volokh Conspiracy comments. The key question seems to be whether disclosing their password is a testimonial or non-testimonial act. I'm no expert on this topic, but as I recall, in past discussions, people have suggested having a password which is inherently self-incriminating (e.g., "I murdered John Doe") in an attempt to create a Fifth Amendment situation, which always seemed to me to be too clever by half.

Posted by ekr at 9:03 PM | Comments (4)

December 12, 2007

Transitioning to universal HTTPS

Lauren Weinstein points out that the assclowns at Rogers are prototyping a system for splicing their own messages into other people's Web pages, like this:

Lauren argues that it's time to abandon unprotected web surfing:

That first, key action is to begin phasing out, as rapidly as possible and in as many application contexts as practicable, the use of unencrypted http: Web communications, and move rapidly to the routine use of TLS/https: whenever possible.

This is of course but an initial step in a rather long path toward pervasive Internet encryption, but it would be an immensely important one.

TLS is not a total panacea by any means. In the absence of prearranged user security certificates, TLS is still vulnerable to man-in-the-middle attacks, but any entity attempting to exploit that approach would likely find themselves in significant legal difficulty in short order.

Also, while TLS/https: would normally deprive ISPs -- or other intermediaries along the communications path -- of the ability to observe or modify data traffic contents, various transactional information, such as which Web sites subscribers were visiting (or at least which IP addresses), would still be available to ISPs (in the absence of encrypted proxy systems).

Another potential issue is the additional computational cost associated with setting up and maintaining TLS communication paths, which could become significant for busy server sites. However, thanks to system speed improvements and a choice of encryption algorithms, the additional overhead, while not trivial, is likely to at least be manageable.

Weinstein raises a number of issues here, namely:

In this post, I want to address the second and third issues. MITM attacks deserve their own post. First, we need to be clear on what we're trying to do. The property the communicating parties (the client and server) want to ensure isn't that third parties can't read (the technical term here is confidentiality) the traffic going by but rather they can't modify it (the technical terms here are data origin authentication (knowing who sent the message) and message integrity (knowing that it hasn't been modified)). Obviously, there's no way to stop your ISP from sending you any data of his choice, but you can arrange to detect that and reject the data.

The general way that this is done is to have the server compute what's called a message integrity check (MIC) value over the data. The server sends the MIC, along with the data to the client. The client checks the MIC (I'm being deliberately vague about how this works) and if it isn't correct it knows that the data has been tampered with and the client discards the data. The way this works in TLS is that the client and the server do an initial handshake to exchange a symmetric key. This key is then used to key a message authentication code (MAC)1 function which is used to protect individual data records (up to 16K each).
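
To make the check a little less vague, here is a minimal sketch using Python's standard hmac module. It is not how TLS actually protects records (TLS derives its keys from the handshake and also covers sequence numbers and headers); the random key, the function names protect and verify, and the choice of HMAC-SHA256 are all assumptions for illustration.

```python
import hashlib
import hmac
import os

def protect(key: bytes, data: bytes) -> bytes:
    """Return the MAC tag the sender attaches to data."""
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key: bytes, data: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(protect(key, data), tag)

# Stand-in for the symmetric key the handshake would establish.
key = os.urandom(32)
record = b"<html>hello</html>"
tag = protect(key, record)

ok = verify(key, record, tag)
# An intermediary splices extra content into the record but cannot fix the tag.
tampered = verify(key, record + b"<script>ad()</script>", tag)
print(ok, tampered)
```

The receiver simply discards any record for which verify returns False; it has no way to repair the data, only to detect that it was changed.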

So, going back to Issue 2, TLS actually provides confidentiality and message integrity/data origin authentication separately. In particular, there are modes which provide integrity but not confidentiality (confidentiality without integrity is only safe in some special cases so these modes aren't provided)—the so-called NULL modes. So, it's quite possible to arrange matters in such a way that intermediaries can inspect the traffic but not modify it. Of course, whether this is desirable is a separate issue, but I think it's pretty clear that many enterprises, at least, want to run various kinds of DPI engines on the traffic going by. Indeed, they want to so much that they deploy solutions to intercept encrypted traffic, so presumably they would be pretty unhappy if they couldn't see any Web traffic.

There are at least two major difficulties with providing a widely used integrity-only version of HTTPS. The first is that clients don't generally offer to negotiate it, at least in part because it's easier to just have users expect that HTTPS = the lock icon = security than to try to explain the whole thing about integrity vs. confidentiality. This brings us to the second issue, which is how we provide a UI which gives users the right understanding of what's going on. More on the UI issue in a subsequent post, but it should be clear that from a protocol perspective this can be made to work.

Moving on to the performance issue: HTTP over TLS is a lot more expensive than raw HTTP [CPD02]. So, TLS-izing everything involves taking a pretty serious performance hit. The basic issue is that each connection between the client and the server requires establishing a new cryptographic key to use with the MAC. This setup is expensive, but it's a more or less fundamental requirement of using a MAC because the same key is used to verify the MAC as to create it. So, in order to stop Alice from forging traffic to Bob from the server, Alice and Bob need to share different keys with the server. The situation can be improved to some extent by aggressive session reuse, thus amortizing the cost of the really expensive public key operations. Client-side session caching/TLS tickets can help here to some extent as well, but the bottom line is that (1) there's some per-connection cost and (2) it breaks proxy caches, which obviously puts even more load on the server.
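
The point about Alice and Bob needing different keys can be sketched in a few lines. This is a toy model, assuming HMAC-SHA256 as the MAC and randomly generated keys; the names tag and accepts are hypothetical and nothing here is TLS-specific.

```python
import hashlib
import hmac
import os

def tag(key: bytes, data: bytes) -> bytes:
    """Compute a MAC tag over data."""
    return hmac.new(key, data, hashlib.sha256).digest()

def accepts(key: bytes, data: bytes, t: bytes) -> bool:
    """Return True if the tag verifies under key."""
    return hmac.compare_digest(tag(key, data), t)

forged = b"message Alice made up"

# Case 1: the server uses one MAC key for every client.
shared = os.urandom(32)
# Alice holds the shared key, so she can tag anything and Bob accepts it.
bob_fooled = accepts(shared, forged, tag(shared, forged))

# Case 2: each client shares its own key with the server.
alice_key, bob_key = os.urandom(32), os.urandom(32)
# Alice can only tag with her own key; Bob checks with his and rejects.
bob_fooled_per_client = accepts(bob_key, forged, tag(alice_key, forged))
print(bob_fooled, bob_fooled_per_client)
```

Since anyone who can verify a MAC can also create one, the per-client key (and hence the per-connection setup cost) is what keeps one client from impersonating the server to another.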

One approach that doesn't have this performance drawback is to have the server authenticate with a digital signature. Because different keys are used to sign and verify, a single signed message can be replayed to multiple recipients. This reduces the load on the server, as well as (if the protocols are constructed correctly) working correctly with proxy caches. Obviously, this only works well when the pages the server is serving are exactly identical. If each page you're generating is different, this technique doesn't buy you much (though note that even dynamic pages tend to incorporate static components such as inline images.) Static signatures of this type were present in some of the early Web security protocols (e.g., S-HTTP) but SSL/TLS is a totally different kind of design and this sort of functionality would be complicated to retrofit into it at this point.


1. Yes, this whole MIC/MAC thing is incredibly confusing. It's even better when you're doing layer 2 communication security and MAC means the Ethernet MAC.

Posted by ekr at 7:45 PM | Comments (6)

December 3, 2007

FIPS certification pays off

OpenSSL has a FIPS-140 validated module. One of the requirements is self-testing of the PRNG. Unfortunately, it somehow doesn't quite work:
A significant flaw in the PRNG implementation for the OpenSSL FIPS Object Module v1.1.1 (http://openssl.org/source/openssl-fips-1.1.1.tar.gz, FIPS 140-2 validation certificate #733, http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140val-all.htm#733) has been reported by Geoff Lowe of Secure Computing Corporation. Due to a coding error in the FIPS self-test the auto-seeding never takes place. That means that the PRNG key and seed used correspond to the last self-test. The FIPS PRNG gets additional seed data only from date-time information, so the generated random data is far more predictable than it should be, especially for the first few calls.

This vulnerability is tracked as CVE-2007-5502.

There's no real deep lesson here. This is the kind of mistake anyone can accidentally make. It's true that the more options you have in a piece of code, the higher the chance that there will be some code path that doesn't work right, and in this case it's particularly striking because (1) there's no need to self-test a software PRNG 1 and (2) it's the addition of the self-test that broke it, but it could easily have been something else.

1. In general, self-testing any cryptographic PRNG is difficult. The standard way to build a CSPRNG is to take whatever your entropy source is and run it through a bunch of hash functions. The result of this is that the output looks random under standard entropy tests. This is true even if the seed is very low entropy. All a self-test really means is that the hashing part of the PRNG is working correctly, but usually it's the seeding part that goes wrong (as seen here).
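
A small sketch of why output tests don't catch bad seeding. This is a toy hash-counter generator (SHA-256 over seed and counter), not the OpenSSL FIPS PRNG; the 16-bit seed, the function names, and the simple bit-frequency check are assumptions chosen to keep the example short.

```python
import hashlib

def toy_prng(seed: int, nbytes: int) -> bytes:
    """Hash-counter generator: SHA-256(seed || counter), concatenated."""
    out = b""
    counter = 0
    while len(out) < nbytes:
        block = seed.to_bytes(8, "big") + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:nbytes]

def ones_fraction(data: bytes) -> float:
    """Fraction of 1 bits; close to 0.5 for random-looking data."""
    ones = sum(bin(b).count("1") for b in data)
    return ones / (8 * len(data))

# A seed with only 16 bits of entropy (think: a coarse timestamp).
secret_seed = 31337
output = toy_prng(secret_seed, 4096)
frac = ones_fraction(output)

# The attacker sees the first 16 output bytes and tries every possible seed.
recovered = next(s for s in range(2**16) if toy_prng(s, 16) == output[:16])
print(round(frac, 3), recovered)
```

The bit frequency comes out very close to one half, so a statistical check on the output is satisfied, yet the exhaustive search over seeds recovers the generator state almost immediately.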

Posted by ekr at 12:46 PM

November 24, 2007

OMG, you mean VoIP is tappable?

For some reason, Peter Cox's SIPtap program is getting press. First, it's immediately obvious to anyone with even minimal knowledge of networking that if you have access to the packets of a VoIP flow (or for that matter any other unencrypted network flow), you can reconstruct the data. That's why people use encryption. So, this is hardly news. That's why the IETF and others have spent a lot of time building security protocols for VoIP. Many current VoIP phones come with some encryption now and the newer stuff will be more secure and easier to deploy.

OK, so it's common knowledge. On the other hand, Cox doesn't say he discovered it, just that this is a "proof of concept". Given that it's droolingly easy to write an RTP decoder and that VoIPong and Vomit and Wireshark already existed, it's hard to see exactly what concept is being proved, other than that with enough hype you can get your name in the paper.

UPDATE: Fixed typos

Posted by ekr at 8:27 PM | Comments (1)

November 23, 2007

Combined software and services and wiretapping

For obvious reasons, law enforcement and investigative agencies aren't incredibly fond of encrypted communications. The most popular responses to this difficulty have generally been one or more of:

None of these have been particularly successful: the strong crypto cat is out of the bag, users have overwhelmingly rejected key escrow, and although people do sometimes have their keys subpoenaed (the UK has a law requiring compliance), there are standard cryptographic techniques that provide "perfect forward secrecy" so that even if your keys are disclosed after the fact your communications aren't readable. The government in the US has had some success with keyloggers, spyware, etc., but they either require physical access or compromise of the system in question.

The popularity of combined software/service operations like Hushmail and Skype opens up a new avenue, however. It's recently come out that Hushmail has in the past handed over keys to the government for users who used their online encryption system. This was made easier by Hushmail's "software as a service" type architecture, where they do the encryption and decryption on their site. Hushmail also provides an option where you can download a Java applet, but it should be clear that under the right legal constraints, they could theoretically put a backdoor in the applet you downloaded, too.

Similarly, the German police have recently complained that they can't monitor Skype calls. They say they're not asking for the encryption keys, but because of Skype's architecture and the fact that Skype is involved in authenticating each call, it should be clear that Skype could mount a man-in-the-middle attack on your phone call and hand over the keys. They could also just give you an "upgraded" software version with a back door.

Combined software/service systems like Skype and Hushmail are uniquely susceptible to this kind of lawful intercept attack (or for that matter to cheating by the vendor of any kind.) If you use third party software then you don't have to worry about your ISP cheating you because they can't—they don't have the keys. And while your software vendor could potentially cheat you, they don't have the kind of constant contact with you that Skype or Hushmail does, so they would generally need to put a back door in every copy of the software, which carries a much higher risk of discovery and of users switching software. Who wants to run software with a deliberate back door?

Posted by ekr at 5:33 PM | Comments (6)

November 12, 2007

My best line of the day

Steve Burnett is giving an intro to crypto talk in which he explains that "cryptography is about turning sensitive data into gibberish in such a way that you can get the sensitive data back from the gibberish".

My observation: "This differs from standardization, where you can't get the sensitive data back from the gibberish."

Posted by ekr at 10:16 AM

November 5, 2007

Voting talk at Stanford Security Seminar

I'll be speaking tomorrow at the Stanford Security Seminar:
Some Results From the California Top To Bottom Review

Eric Rescorla

In Spring of 2007, the California Secretary of State convened a team of security researchers to review the electronic voting systems certified for use in California. We were provided with the source code for the systems as well as with access to the hardware. Serious and exploitable vulnerabilities were found in all the systems analyzed: Diebold, Hart, and Sequoia. We'll be discussing the effort as a whole, providing an overview of the issues that all the teams found, and then discussing in detail the system we studied, Hart InterCivic.

Joint work with Srinivas Inguva, Hovav Shacham, and Dan Wallach

If you want to listen, heckle, whatever, it's at 4:30

Posted by ekr at 11:36 AM | Comments (1)

October 29, 2007

Traffic blocking evasion and counter-evasion

When faced with a traffic blocking (or, as Comcast calls it, delaying) scheme like Comcast is using for BitTorrent, it's natural to ask how to evade the blocking. At a high level, there are two possible strategies:

Resilient traffic
The blocking strategy Comcast is using is to forge TCP RST packets to kill the connection. The advantage of this strategy is that it's cheap; you don't need to touch any of the routers at all. Whatever packet inspection box you're using just sources a single packet to each sender, and the ordinary TCP mechanisms shut down the connection. By contrast, if you actually wanted to stop the packets from flowing you'd need to insert transient filters into some router, which isn't necessarily that convenient, especially if we're talking about a high-speed core router.

The good news from the perspective of the communicating parties is that this leaves open a window to evade the blocking. Richard Clayton observes in the context of the Great Firewall of China that if the peer implementations simply ignore TCP RSTs then they can't be blocked via this mechanism. Unfortunately, this interferes with TCP's normal operation, since RSTs do something useful. Worse yet, there are lots of other ways to interfere with a TCP connection. For instance, the attacker could forge FIN packets, simulating a normal close. Alternately, he could send fake data segments, breaking the protocol parsing at the next level up. The bottom line here is that TCP was not designed to be DoS-resistant, especially from an on-path attacker. That's not straightforward to fix, especially by this kind of crude modification to the TCP stack.

That doesn't mean that it's not possible to make the traffic resilient to blocking. The standard approach here is to use cryptographic message integrity/data origin authentication to prevent the attacker (Comcast) from inserting their own traffic into the connection. Unfortunately, this can't be done at the application layer above TCP (e.g., SSL/TLS)—the attacker is attacking the TCP layer and security protocols like SSL/TLS depend on correct functioning of the lower layer protocol to function correctly. In fact, SSL/TLS makes the problem worse, since any interference with the ciphertext causes integrity failures and connection failure. (This is the same reason why SSL/TLS can't be used to fix the BGP TCP connection reset issue.)

In order to be resistant to this kind of injection, you need to have message integrity at a layer below TCP. The standard solution here is to use IPsec, but you could also use a datagram transport protocol layered over DTLS. The important thing is that the traffic has to be authenticated below the connection management state machine.

Of course, all of these schemes can be blocked if you're just willing to inject filters into the router. The good news from the attacker's perspective is that these connections are long-lived and so you don't have to inject the filters that quickly. You also don't need to get every packet—TCP uses packet loss as a congestion signal and backs off, so if you can achieve an even modest packet loss rate, you can have a dramatic impact on the performance of the connection.
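
To get a rough feel for how much a modest loss rate hurts, one commonly cited approximation (due to Mathis et al.) puts steady-state TCP throughput at about (MSS/RTT) * sqrt(3/2) / sqrt(p), where p is the packet loss rate. The sketch below just evaluates that formula; it assumes Reno-style congestion control and random loss, and the segment size and round-trip time are illustrative values, not measurements.

```python
import math

def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Approximate steady-state TCP throughput in bits per second."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss)

# Illustrative parameters: 1460-byte segments, 50 ms round-trip time.
for loss in (0.0001, 0.01, 0.05):
    mbps = mathis_throughput_bps(1460, 0.05, loss) / 1e6
    print(f"loss={loss:.2%}  ~{mbps:.1f} Mbit/s")
```

Because throughput scales with one over the square root of the loss rate, raising loss from 0.01% to 1% cuts the estimate by a factor of ten under this model.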

Hiding Traffic
The other major strategy is to defeat whatever deep packet inspection (DPI) engine the ISP is using to detect filesharing traffic. The idea here is that the ISP only wants to block some of your traffic, since they want you to be able to use your Internet connection for other applications. So, you just need to stop selective blocking.

The natural way to do this is to use encryption. Even an encryption protocol like SSL/TLS that is above TCP does a reasonable job here, since it hides the application traffic. Interestingly, BitTorrent encryption doesn't seem to help here. I don't really know any details of BitTorrent's encryption, but presumably the issue is that there are unencrypted protocol elements that are specific to BitTorrent and so the DPI box can still do some traffic analysis.

Even with a generic protocol like TLS, the attacker can still do a fair amount of traffic analysis based on timing, packet sizes, etc. You also get the TCP port. BitTorrent doesn't use a single fixed port for data connections, so the attacker can't just block that port. However, the port range is somewhat predictable and doesn't overlap with ports for other popular protocols, so if you see a lot of data flowing on one of the potential BitTorrent ports, it's a good guess that it's BitTorrent. Note that if you use IPsec, then you can hide the ports from the attacker, but the packet size, timing information, etc., is still available.

The counter-countermeasure to this kind of traffic analysis is to send deliberate "cover traffic". When you want to send real traffic you just substitute it for some of the cover traffic. Of course, to do this well you need to chew up a lot of bandwidth on the cover traffic, which is unfriendly and hard on the rest of your performance.

Summary
The bottom line here is that an attacker who controls your Internet connection can always guarantee that you can't use it. The best you can do is make it hard for the attacker to selectively block some of your traffic while leaving the rest of it alone.

Posted by ekr at 10:53 PM | Comments (8)

October 9, 2007

So much for hiding your face

Redacting digital information turns out to be a tricky proposition, at least if you go by how often people screw it up. The usual situation is some declassified document where the government has just put easily removed black boxes over the relevant text, but in this case it's an individual who made the mistake, a certain alleged pedophile who posted incriminating pictures of himself with his face obscured by the Photoshop twirl filter. Unfortunately for him, it turns out that this effect is reversible:
Apparently, the suspect, or whoever handled the pictures, did not think it was possible to reverse the twirling, a capability that at least one Interpol official was intent on keeping confidential.

Now the cat is out of the bag. Officials are declining to say just how they did it, leaving Interpol in the strange position of urging the public to help find one pedophile suspect while refusing to divulge a tool that might identify others before they hear today's news and rush to delete potentially incriminating twirled images of themselves.

By publishing the untwirled photos of their suspect today, the international police organization also decided to risk the possibility that the man -- or men who happen to look like him -- may face violence from vigilantes.

Apparently, this effect is really trivial to reverse. According to this BoingBoing post you can just set the twirl filter to negative and you get the original picture back. Obviously, there are transformations which would be more complicated to reverse; for instance you could encrypt the relevant pixels or randomly permute them, though any fixed transformation which is one-to-one and onto should be reversible with enough effort. In addition, there are transformations which destroy information and are partly or wholly irreversible. The obvious case is replacing the relevant pixels with pixels all of the same color. This is of course simple, but apparently not as obvious as you might think.
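
The difference between the two kinds of transformation can be sketched with a toy "image" that is just a list of pixel values. This is not Photoshop's twirl filter; the keyed shuffle stands in for any fixed one-to-one transformation, and the names permute, unpermute, and black_out are hypothetical. Note that random.Random is not a cryptographic generator, so this is only an illustration of invertibility.

```python
import random

def permute(pixels, key):
    """Shuffle pixel positions with a keyed, fixed permutation."""
    order = list(range(len(pixels)))
    random.Random(key).shuffle(order)
    return [pixels[i] for i in order]

def unpermute(scrambled, key):
    """Invert permute() by regenerating the same permutation."""
    order = list(range(len(scrambled)))
    random.Random(key).shuffle(order)
    original = [None] * len(scrambled)
    for dst, src in enumerate(order):
        original[src] = scrambled[dst]
    return original

def black_out(pixels):
    """Replace every pixel with the same value; nothing is left to invert."""
    return [0] * len(pixels)

face = [17, 203, 99, 45, 180, 66, 240, 12]  # stand-in grayscale values
restored = unpermute(permute(face, key=42), key=42)
redacted = black_out(face)
print(restored == face, redacted)
```

The permutation only rearranges information, so anyone who learns or guesses the transformation gets the picture back, whereas the blacked-out region is the same no matter what the original pixels were.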

Now where things go wrong with a lot of redaction operations—especially with formats like PDF—is that the basic formats are more complicated than bitmaps. The redactors just create a new black object that is in front of the text to be redacted. The underlying information is still there, so it's just a matter of removing/ignoring the black object and you have the original text. It's much safer to work with a bitmap format where you know that you're changing the relevant pixels rather than just masking them. Of course, you also need to use a transformation that actually can't be easily reversed.

Posted by ekr at 10:47 PM | Comments (2)

September 23, 2007

How do you lock down an iPod?

Apparently the iPod SHA-1 thingamajig has been reverse engineered. As I said earlier, I'm not convinced that this actually was intended to lock down the iPod. However, it's interesting to ask how one would actually do that in a way that was harder to reverse engineer.

Two goals were ascribed to the alleged SHA-1 in the database:

If all you have is a hammer, everything looks like a nail, and if you're a COMSEC guy, problems like this bring crypto to mind. At a high level, there are two cryptographic strategies for this kind of job: encrypt the database which is then decrypted by the iPod/iTunes or apply an integrity check which is checked by the iPod/iTunes. Each of these has advantages in some contexts, but we can treat them mostly the same for the purposes of our discussion, so without loss of generality, let's talk about an integrity check.

The difficulty, as with most cryptographic contexts, is key management. We want to make sure that only legitimate copies of iTunes can produce databases that the iPod can verify, which means that iTunes has to contain a key that isn't known to third party developers. There are two options here: all copies of iTunes have the same key—this is basically the same as a fixed, secret, integrity check function or one over unknown data, i.e., the situation we have now. Any system of this type is very vulnerable to key extraction via reverse engineering. Once you have the key (or the function) you can write your own program.

The other approach is to use a separate key for each copy of iTunes. When a new iPod is attached to iTunes, it gets a copy of the key (imprinted). The most attractive mechanism here is probably to use public key cryptography and put the public key on the iPod. The key can even be signed by Apple to avoid false imprinting. Then all database updates are signed and the iPod verifies them. Of course, you can still mount a reverse engineering attack and extract the key from a single copy of iTunes, but then we're in an arms race where Apple can program new iPods to ignore that key, thus forcing the third-party software authors to constantly change keys.1

Another strategy for the attacker is not to extract a single key but rather to have the third-party software extract keys from a valid copy of iTunes, though this is obviously a bit inconvenient if you don't want to be involved with Apple's software at all.

If this sounds like the kind of issues you have with DRM, it is. And like DRM, the attacker has an enormous advantage as long as your system is software only and he's prepared to reverse engineer it. The situation changes a lot if you are willing to have trusted hardware (in this case on the host computer) but that would be a big change for Apple.

1. If Apple is willing to force people to register online, you can make detection and revocation of extracted keys much more efficient.

Posted by ekr at 9:17 PM

August 22, 2007

SHA-1 rumors

I'm not at CRYPTO but my sources tell me that there may have been some more progress on SHA-1 and that the latest estimates are on the order of 2^60.x. Anyone with more details please post them in the comments.

Posted by ekr at 6:36 AM | Comments (2)

August 6, 2007

Fail-what?

In my previous post about SWORDS robots, I referred to "fail-safe" and "fail-unsafe" strategies. Now, clearly, if you're a civilian in the line of fire of a killer robot, you'd think a strategy in which the robot shut itself down when it couldn't communicate with base to be "safe", but you might feel a little differently if you were a soldier who had to go out into enemy fire because a minor communication glitch caused your robot to shut down.

As another example, take a system like Wireless Access in Vehicular Networks (WAVE), which provides for communications between vehicles and between vehicles and road-side units. WAVE can be used for safety messages, such as the Curve Speed Warning message, which allows a station at the side of the road to broadcast the maximum safe speed for a given curve. Obviously, you'd like there to be some message integrity here to prevent an attacker from broadcasting a fake speed. Now, what happens when the integrity check fails; do you ignore the message?

A decent argument could be made that either ignoring or trusting such messages was "fail-safe". Obviously, ignoring them appears safe in the sense that your vehicle reverts to what it was without the WAVE functionality, so you haven't been damaged. On the other hand, the curve speed warning is designed to help safety (that's why it's being broadcast) so ignoring it is arguably failing unsafe! I don't really have a position on what's right or wrong here, but it should be clear that the terminology is confusing.

I've heard people substitute the terms "fail-open" or "fail-closed", but those are even worse. If you're an electrical engineer, a closed circuit means current flows and an open circuit means current doesn't. On the other hand, an open firewall means that data flows but a closed one means it doesn't.

I don't know of any really good terms, unfortunately.

Posted by ekr at 8:45 PM | Comments (5)

August 5, 2007

Oh good, a kill switch

Wired reports that the DoD has taken delivery of three "special weapons observation remote reconnaissance direct action system" (SWORDS) robots. (Pretty tricky with those acronyms, guys!). Anyway, these are remote-controlled robots armed with M-249 machine guns.

Apparently these robots were uh, a bit flakey, but the manufacturers say they've got all the bugs worked out now:

The SWORDS -- modified versions of bomb-disposal robots used throughout Iraq -- were first declared ready for duty back in 2004. But concerns about safety kept the robots from being sent over the the battlefield. The machines had a tendency to spin out of control from time to time. That was an annoyance during ordnance-handling missions; no one wanted to contemplate the consequences during a firefight.

So the radio-controlled robots were retooled, for greater safety. In the past, weak signals would keep the robots from getting orders for as much as eight seconds -- a significant lag during combat. Now, the SWORDS won't act on a command, unless it's received right away. A three-part arming process -- with both physical and electronic safeties -- is required before firing. Most importantly, the machines now come with kill switches, in case there's any odd behavior. "So now we can kill the unit if it goes crazy," Zecca says.

OK, so ignoring the wisdom of starting from a platform which used to "spin out of control", I'm sort of interested in how the "kill switch" works. As far as I know, there are two basic ways to build a system like this:

It should be pretty clear that if you think there's a high likelihood that the robot's going to go nuts and you want to minimize the chance that it kills your own people, random civilians, their pets, etc., you probably want something that fails safe. This is especially true in view of the implication in this article that signal strength isn't always what you might like. You really don't want to have a situation where the robot is busy slaughtering innocent bystanders and you can't shut it down because your control unit is showing zero bars.

On the other hand, a fail-safe system is also much easier to DoS—it's probably more important when the system being DoSed is shooting your enemies than when it's serving up copies of Girls Gone Wild. All the attacker has to do is somehow jam your signal (and remember that since you probably want to have a cryptographically secured control channel, they only need to introduce enough errors to make the integrity checks fail). This makes the problem of designing the control channel a lot more difficult. I'd definitely be interested in hearing more about the design of the protocol for these gizmos.
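To make the tradeoff concrete, here's a minimal sketch of a heartbeat-style kill switch. To be clear, I have no idea how the real SWORDS control channel works; the heartbeat format, the timeout, and every name below are assumptions made up for illustration. The point is just that in the fail-safe variant a packet that fails its integrity check is indistinguishable from silence, so a jammer only has to corrupt packets to shut the unit down.

```python
import hashlib
import hmac
import struct

# Hypothetical parameter; the article does not describe the real protocol.
HEARTBEAT_TIMEOUT = 2.0  # seconds without a valid heartbeat before a fail-safe unit halts


def make_heartbeat(key: bytes, counter: int) -> bytes:
    """Controller side: an 8-byte counter followed by an HMAC-SHA256 tag over it."""
    body = struct.pack(">Q", counter)
    return body + hmac.new(key, body, hashlib.sha256).digest()


class Robot:
    def __init__(self, key: bytes, fail_safe: bool, now: float = 0.0):
        self.key = key
        self.fail_safe = fail_safe
        self.last_valid = now      # time of the last heartbeat that verified
        self.last_counter = -1     # highest counter accepted so far
        self.armed = True

    def receive(self, packet: bytes, now: float) -> None:
        """Accept a heartbeat only if its tag verifies and its counter is fresh."""
        body, tag = packet[:8], packet[8:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if len(body) != 8 or not hmac.compare_digest(tag, expected):
            return  # corrupted or forged: treated exactly like silence
        (counter,) = struct.unpack(">Q", body)
        if counter <= self.last_counter:
            return  # replayed heartbeat
        self.last_counter = counter
        self.last_valid = now

    def tick(self, now: float) -> None:
        """A fail-safe unit disarms after too long without a valid heartbeat."""
        if self.fail_safe and now - self.last_valid > HEARTBEAT_TIMEOUT:
            self.armed = False
```

A fail-unsafe unit (`fail_safe=False`) simply never disarms on timeout, which is exactly why it keeps going when your control unit is showing zero bars.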

Posted by ekr at 10:32 AM | Comments (4)

August 3, 2007

What I did on my summer vacation

For the past couple months I've been spending most of my time working on California's Top-to-Bottom Review of electronic voting systems certified for use in California.

The overall project was performed under the auspices of UC and led by Matt Bishop (UC Davis) and David Wagner (UC Berkeley), who did a great job of negotiating a wide variety of organizational obstacles to get the project going and keep it on track.

This project reviewed the systems of three manufacturers:

Each company makes both an optical scanner for paper ballots and a computerized direct recording electronic (DRE) machine (these are often called touchscreens, but the Hart system actually uses a clickwheel), as well as a back-end election management system.

Each system was assigned to three teams:

There was also an accessibility team for all the systems.

I led the Hart source code team, consisting of me, Srinivas Inguva, Hovav Shacham, and Dan Wallach, and sited at an undisclosed location which can now be disclosed as SRI International in Menlo Park. Our report was just published yesterday, just ahead of the statutory deadline for the State to decide on whether these systems will continue to be certified (more detail here). You can get it here and all the reports here.

I wasn't planning on saying much about this on EG. Most of what I have to say is already said better in our report. I did want to say a word about my team, who put in extraordinary amounts of effort under an extremely tight timeline; just over a month from the time we got the Hart source to the delivery of the final report. Thanks, guys, and I look forward to working with you again, hopefully next time in a room with 24x7 air conditioning.

Posted by ekr at 9:00 AM | Comments (1)

June 13, 2007

More on collisions and APOP

Paul Hoffman asks (in comments):
Is this really easier than a dictionary attack after one unsuccessful attempt? I guess that this attack works when the APOP password is not in any attack dictionary or algorithm, but I would still like to see a comparison between the work effort for the attack and a very deep dictionary run. Note that the dictionary attack is *much* less likely to raise suspicions since there is only one failure, not many.

So, the first thing you need to realize is that this is a byte-by-byte attack. For simplicity, ignore dictionary attacks and assume that you have a 64-bit password. Searching that entire space takes, you got it, 2^64 operations (all offline). Now, say you mount the Leurent-Sasaki attack on only the first byte. This requires intercepting on average 256 (worst case 512) connections and finding a collision for each of those connections. It seems to be a little hard to map the cost of finding a collision directly onto hash operations, but Leurent reports about 5 seconds per collision, so we're probably looking at order 100,000 (2^16) operations per collision, so this is something like 2^24 operations. But look what's happened here: we now know the first byte of the password, so we can mount a search on the remaining bytes. Searching the remaining bytes requires only 2^56 operations, compared to which 2^24 is negligible. So, we've reduced our computational complexity by around a factor of 256, admittedly at the cost of intercepting a lot more connections.

If we could extend this technique to the whole password we'd need to intercept about 8*256 connections and do about 8*2^24 == 2^27 hash computations, so we'd have reduced the work factor by about a factor of two. However, as I mentioned in the original post, this technique can only be used to extract the first three bytes of the password. To make a long story short, Leurent estimates that with 8 character passwords with 6 bits of entropy per character this brings the work factor down to 2^30, a reduction of 2^18. This is obviously a big improvement.
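For anyone who wants to check the arithmetic, here's a quick sketch that just recomputes the estimates above. The constants are the rough figures from the text (about 2^16 operations per collision, 256 connections per recovered byte on average), not measurements, and the variable names are mine.

```python
from math import log2

PASSWORD_BITS = 64           # the simplifying assumption in the text
COLLISION_COST = 2 ** 16     # rough hash operations per collision (from ~5 s each)
CONNECTIONS_PER_BYTE = 256   # average connections intercepted per recovered byte

# Offline exhaustive search over the whole password.
brute_force = 2 ** PASSWORD_BITS

# One collision per intercepted connection to recover a single byte: 2^24.
per_byte_attack = CONNECTIONS_PER_BYTE * COLLISION_COST

# Recover one byte with the attack, then search the remaining 56 bits offline.
after_one_byte = per_byte_attack + 2 ** (PASSWORD_BITS - 8)

# Hypothetically recovering all 8 bytes this way: 8 * 2^24 = 2^27.
whole_password = (PASSWORD_BITS // 8) * per_byte_attack

# Leurent's scenario: 8 characters, 6 bits each, 3 characters recovered.
full_bits = 8 * 6
remaining_bits = (8 - 3) * 6

print(f"brute force:         2^{log2(brute_force):.1f}")
print(f"one byte via attack: 2^{log2(per_byte_attack):.1f}")
print(f"attack + search:     2^{log2(after_one_byte):.1f}")
print(f"whole password:      2^{log2(whole_password):.1f}")
print(f"Leurent: 2^{full_bits} -> 2^{remaining_bits}, a reduction of 2^{full_bits - remaining_bits}")
```

Note that none of this counts the cost of intercepting the connections themselves, which is the part most likely to get you noticed.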

Of course, this improvement depends on a fairly unrealistic assumption about the entropy of the password. In general, the lower the entropy, the more attractive dictionary search looks, and with typical passwords, it probably is better, especially when you factor in the negative effects of interfering with the user's connections.

UPDATE: Roy Arends points out that I apparently can't multiply and that 8*2^24 == 2^27. So, no breakeven point, like I'd previously suggested. I guess 8:00 PM must be past my bedtime. Fixed.

Posted by ekr at 8:08 PM | Comments (1)

June 11, 2007

A practical use of collisions

So, I'd been going around saying that the collision attacks against MD5 and SHA-1 were pretty useless against real protocols. At ECRYPT, Dan Bernstein pointed out to me that someone has actually used a collision attack successfully against APOP, an old challenge-response style protocol. The paper, by Leurent, is here (and rediscovered by Sasaki here).

The way that APOP works (like pretty much all challenge-response protocols) is that the server sends the client some challenge (a fresh value) M and the client sends back F(K,M) where K is the shared key and F is some function. In modern systems, people tend to like HMAC for F but APOP was designed before HMAC and in an era where people were pretty loose about hash functions. APOP computes the response as: MD5(M || K). This turns out to enable an attack, provided that the attacker can control the challenge.
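As a minimal sketch of the difference, here is the APOP-style response next to an HMAC-based one. The function names are mine; the only thing taken from the description above is that APOP hashes the challenge followed by the shared secret, and that modern designs would use HMAC for F instead.

```python
import hashlib
import hmac


def apop_response(challenge: bytes, password: bytes) -> str:
    """APOP-style response: MD5 over the challenge followed by the shared secret."""
    return hashlib.md5(challenge + password).hexdigest()


def hmac_response(challenge: bytes, password: bytes) -> str:
    """What a more modern design would do: HMAC keyed with the secret."""
    return hmac.new(password, challenge, hashlib.md5).hexdigest()


if __name__ == "__main__":
    # Illustrative values only; a real challenge is a server-chosen message id.
    challenge = b"<1234.5678@mail.example.com>"
    password = b"hunter2"
    print(apop_response(challenge, password))
    print(hmac_response(challenge, password))
```

Because the challenge comes first in the plain MD5 construction, whoever picks the challenge controls the start of the hash input, which is what the attack below takes advantage of.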

The basic attack allows the attacker to determine one character of the password. Say he thinks the first character is C. The attacker generates two colliding messages M1 and M2 with a special structure.

So, we have:

Where X and Y are arbitrary and come out of the collision finding algorithm (actually, these are longer, but nobody wants to see 63 xs.).

The attack then requires the user to authenticate twice. The first time the attacker gives the challenge xxxxxxxxxx. This causes the user to return H(xxxxxxxxxx || P1 || P2 || P3 || ... Pn) where P1 is the first character of his password, P2 is the second, etc. The second time the attacker sends yyyyyyyyyy and gets back H(yyyyyyyyyy || P1 || P2 || P3 || ...). Now, here's the key point: these challenges have been specifically arranged so that the first byte (P1) of the password lines up with the last byte of the first hash block. If P1 == C then the two first hash blocks will collide. And since P2 ... Pn are the same, the entire hash output will collide. In other words, if the attacker has guessed C correctly, then the responses to these two different challenges will be the same. Otherwise they will not be (with high probability).

We've now extracted the first character of the password. That's not bad, but what about the rest? Well, it's straightforward to extend this once we know the first character. We build two new messages:

And we can repeat the same procedure extracting the password one character at a time.

This attack strategy has been known for quite some time and is originally due to Preneel and van Oorschot. However, the bottleneck was always that finding collisions was too expensive. The discovery of efficient paths for finding collisions changes that. If it's easy to find collisions, then this method becomes very practical.

Of course, "easy" is where things get complicated. There are two factors to consider here. The first is that it can be difficult to control the collision values, so you don't always get to choose that the last n characters of the colliding blocks are equal. Indeed, Leurent reports that he can only recover the first three bytes of the password. The second complication (and this applies only to APOP) is that in APOP challenges are RFC-compliant message ids, and the colliding blocks above most likely contain non-ASCII characters and so don't fit. Implementations which check for compliance are probably not vulnerable. However, Leurent reports that he's successfully mounted this attack on a variety of clients and it works, which isn't too surprising. Note that improved collision-finding techniques could lead to relaxation of both of these constraints.

The bottom line here is that if you're using APOP without TLS you should probably stop. On the other hand, if you're using APOP without some kind of encryption you should have stopped long ago...

Posted by ekr at 7:20 AM | Comments (2)

May 28, 2007

Notes on the ECRYPT Hash Function Workshop

A few impressions from the EFHW, but first some tutorial material.

For those of you who don't know how hash functions are constructed, the basic idea is iteration. The current hash functions all use a construction called Merkle-Damgard (MD). You start with some compression function F which takes a block M of size m and turns it into a block S of size s where m > s. For SHA-1, these are 512 bits and 160 bits respectively. F also takes as input a state variable of length s. This allows you to chain by taking the output value and feeding it into F as the next state.

So, the idea here is that we start with state IS and then compute S0 = F(IS,M[0]). That's the first block. We compute the second block as S1 = F(S0, M[1]) and so on. Once we've processed every block, the final S value is our hash output. [Yes, I know this only works on even block-length messages. There's a padding technique to fix this.] There are other ways to connect up compression functions but given that all the compression functions anyone is seriously proposing can only process block sizes of limited length, pretty much all the hash functions need some chaining mode, whether it's M-D or something else.
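Here's a toy version of that chaining, just to show the shape of it. The compression function is a stand-in (SHA-1 over state plus block, which is obviously circular and not a real design), the all-zero initial state is arbitrary, and the padding is the usual length-strengthening trick; all the names are mine.

```python
import hashlib

BLOCK_SIZE = 64   # m: bytes of message consumed per compression call (512 bits)
STATE_SIZE = 20   # s: bytes of chaining state (160 bits, as in SHA-1)
INITIAL_STATE = bytes(STATE_SIZE)  # arbitrary fixed IS for this toy


def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in for F: maps (state, block) to a new state. Not a real design."""
    return hashlib.sha1(state + block).digest()


def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes, then the 8-byte big-endian bit length."""
    bit_length = (8 * len(message)).to_bytes(8, "big")
    zeros = (-(len(message) + 1 + 8)) % BLOCK_SIZE
    return message + b"\x80" + bytes(zeros) + bit_length


def md_hash(message: bytes) -> bytes:
    """Iterate F over the padded message; the final state is the hash output."""
    state = INITIAL_STATE
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])  # S_i = F(S_{i-1}, M[i])
    return state


if __name__ == "__main__":
    print(md_hash(b"hello world").hex())
```

Swapping in a different `compress` changes the hash function without touching the chaining loop, which is the sense in which the compression function and the chaining mode are separate design choices.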

So, in order to define a new hash function you need to specify:

As I said earlier, all the major hash functions (MD5, SHA-1, SHA-256, ...) use M-D, but they differ in the compression function. Why use M-D? It's simple and you can prove that if the compression function is collision resistant then so is the final hash construction. Of course, as we've seen with MD5 and SHA-1, if the compression function isn't collision resistant, then the game changes.

So, with that in mind, the workshop:


Posted by ekr at 9:45 PM | Comments (2)

May 24, 2007

ECRYPT slides

If you're not already in Barcelona, it's probably too late to catch my ECRYPT IT, "Indigestion: Assessing the impact of known and future hash function attacks". The slides, however, are here.

Posted by ekr at 3:28 AM

May 20, 2007

Steelcape and Firewall Traversing VPNs

A while back, I received a spam with the following contents:
Send data without opening your ports.

Steelcape has taken an innovative approach to security, instead of trying to repair TCP/IP, we have built a solution inside the TCP/IP protocol. This method allows Steelcape to secure environments without opening ports on the firewall.

The packets are encrypted at 256 bits and signed with a 48 bit digital signature. Also works with IPv6. Please take a look at our website www.steelcape.com

This rang in at about 10W40 on my snake-oil-ometer, but seeing as they'd been kind enough to give me their information, I figured I'd check it out and if it was snake oil, make fun of them publicly.

First, you can check out their Web site, which reiterates their basic claims:

This doesn't really tell you how the thing works, though. Points 2-4 are easily understood, but what does point 1 mean? Not opening any ports? It's not TCP/IP? How does the data get through? Figuring that out required reading their somewhat confusing WP (link omitted because you don't want to give them your name and email address any more than I do), an exchange of emails, and finally a con call with one of the sales/marketing guys and some technical guys.

Making sense of this requires some background on how Enterprise networks, firewalls, and VPNs work. Figure 1 shows the world's simplest firewalled Enterprise network.


Figure 1: A basic firewall configuration

You've got a bunch of hosts on an internal network separated from the Internet via a firewall, which also doubles as the Internet access router. The firewall prevents all connections to the internal hosts from the Internet. Defining what an internal connection is turns out to be a bit tricky, so, let's just talk about TCP here. Your classic firewall forbids incoming TCP connections but allows outgoing ones, at least to some ports. This works fine if you only want to have client machines, but if you want to have servers (e.g., a Web server), you need to do something else. To a first order, you can either punch a hole in your firewall or put the server outside your firewall. Actually, people often do both; they run two firewalls, with the Web server in between them in what's called the DMZ. The outside firewall has a hole punched in it to allow access to the Web server, but because it's outside the inner firewall there's no need to punch a hole there. Figure 2 shows what I'm talking about.


Figure 2: Firewall with DMZ

This DMZ strategy of course works less well if you want to allow VPN access to your network, since the whole point of a VPN is to allow remote (albeit secured) access to the internal network. Either you make the firewall and VPN access router one and the same or you place the VPN access router inside the firewall, i.e.,


Figure 3: Firewall with VPN router

And, of course, you need to have a hole in the firewall in order to let packets flow to and from the VPN router.

So, this gives us the background for understanding the "no holes in the firewall" claim. The first thing you need to understand is that this is a VPN-only solution: you can't use it to secure access to your public mail server or Web server. Those need to have open firewall ports. So, the idea is that you install Steelcape's stuff on both sides of the system, for instance as an appliance on your home network and then on the laptops of your remote users.

But I've just said that the VPN server needs to be able to talk directly to the Internet. So, how does Steelcape work? Their WP says:

In TCP/IP, data is sent though ports, over a network, to a firewall or other network device, and sent on to listening hosts within a businesses infrastructure. This a potential point of exposure, since unwanted packets could end up on host systems. Steelcape leaves hostile data behind at the firewall as enabled by a "pull" architecture. Steelcape-enabled hosts pull packets from the firewall, validate them, and route them on to host systems. If Steelcape does not qualify the packets, they are kept at arm's length from the system and dropped at the firewall. This is accomplished without any sacrifice network bandwidth.

This description sounds initially coherent, but doesn't make sense in light of the claim (in e-mail) that it works with commodity firewalls, which don't have any such buffering or pull built into them. You can't really get this out of the WP, but it turns out that what's going on is that you have a topology like that shown in Figure 4.


Figure 4: Basic Steelcape installation

Whereas your classic VPN installation requires one box at each site, the Steelcape design requires two, one inside and one outside the firewall. The way that this works is that the gateway has a permanent connection tied up to the enterprise server (ES). This connection traverses the firewall. When a remote user wants to VPN in, he contacts the ES and authenticates. The ES sends a message to the gateway over this permanent signalling channel telling the gateway that a client wants to come in and the client's IP address. The gateway then uses NAT/firewall hole-punching techniques (like ICE but the Steelcape guys say it's not ICE) to let the remote agent talk to the VPN server (it helps here that Steelcape pushes their traffic over UDP). If this strategy sounds familiar to you, it should. It's exactly the topology used by VoIP systems, with the SIP proxy taking the place of the ES.2 It should also now be clear why this can't work with generic Internet services: they're not set up to contact Steelcape's ES.
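The control flow is easier to see as a toy model. This is emphatically not Steelcape's protocol (which I haven't seen); it's a sketch of the rendezvous pattern as I understand it from the description above, with hypothetical class and method names and no actual networking. The in-process call from the ES to the gateway stands in for the permanent signalling channel.

```python
import hmac
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Inside the firewall; talks only to peers the ES has announced."""
    allowed_peers: set = field(default_factory=set)

    def on_signal(self, client_ip: str) -> None:
        # A real gateway would start hole punching toward client_ip here.
        self.allowed_peers.add(client_ip)

    def accepts(self, source_ip: str) -> bool:
        return source_ip in self.allowed_peers


@dataclass
class EnterpriseServer:
    """Outside the firewall; reachable by anyone, authenticates remote users."""
    gateway: Gateway
    credentials: dict  # user name -> password, purely for illustration

    def login(self, user: str, password: str, client_ip: str) -> bool:
        expected = self.credentials.get(user)
        if expected is None:
            return False
        if not hmac.compare_digest(expected.encode(), password.encode()):
            return False
        # Stand-in for the message sent over the permanent signalling channel.
        self.gateway.on_signal(client_ip)
        return True


if __name__ == "__main__":
    gw = Gateway()
    es = EnterpriseServer(gateway=gw, credentials={"alice": "s3cret"})
    print(gw.accepts("203.0.113.7"))            # no hole before authentication
    print(es.login("alice", "s3cret", "203.0.113.7"))
    print(gw.accepts("203.0.113.7"))            # hole exists only for this peer
```

The thing to notice is that the gateway never accepts anything unsolicited; all of the exposure to arbitrary Internet hosts sits on the ES.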

This approach has two claimed advantages: security and ease of installation (the other claimed advantages of Steelcape's stuff come from different techniques). In terms of security, it doesn't require that the VPN server be accessible from arbitrary Internet hosts all the time. Holes only get punched for hosts which have been authorized by the ES. Think of this as decreasing the surface area of attack. Of course, the flip side of this is that the ES does have to be accessible from any host. However, it's true that in order to attack the internal network you do need to compromise two hosts instead of one, so if implemented properly this could give you a measure of defense in depth. I'm not really confident it's that enormous, but all other things being equal (i.e., Steelcape's software being equally secure to other people's software), it probably does give you a measure of defense in depth.

In terms of ease of installation, it's certainly true that you should be able to install a Steelcape system without the cooperation of the firewall administrator. This is of course the kind of feature that users love and firewall administrators hate, because (from their perspective) it's far too often used to bypass enterprise security policies.1 Certainly, I would think that most enterprises would want any VPN appliances to have the approval of the firewall admin, so getting him to punch a hole hardly seems that onerous. And on the downside, of course, now you have two machines to maintain, which increases your maintenance effort.

The other major claim, increased performance, derives from Steelcape's use of a proprietary protocol which is allegedly faster than TCP. It was a little hard to work out what the optimizations were, but it sounded like principally it was that they used compression and maybe that they had more aggressive congestion control.3 It's certainly the case that you can get improved network performance via link-layer compression, so that sounds plausible. That can of course be done within standard TCP/IP (see IPComp), so there's not likely to be anything proprietary there. And of course as soon as you are doing TCP/IP translation on Steelcape's boxes rather than TCP end-to-end there's lots of opportunity for things to go wrong. So, the "replaces TCP/IP" stuff looks mostly like marketing special sauce to me.

The other thing that probably should make you seriously nervous here is that Steelcape isn't using a standard security protocol (e.g., IPsec, SSL/TLS) but rather something they designed themselves. I didn't drill down on this too far, but apparently it uses Blowfish (a fine algorithm, but generally not a sign of crypto sophistication; the pros use AES) with (according to their product overview) a "48 bit digital signature" which is "randomized every few milliseconds". Since the security of a VPN system fundamentally depends on the crypto in use, using an unevaluated protocol isn't exactly confidence inspiring.

It should be obvious here that it's possible to design a standards-based system that uses the same firewall/NAT traversal techniques used by Steelcape. That would have some advantages and disadvantages vis-a-vis ordinary VPN approaches, plus you'd have a high level of confidence that the security protocol was secure. It doesn't look to me like that's what Steelcape is providing, however.

1. See also draft-saito-mmusic-sdp-ike-00 for a very similar standards-based approach.
2. It's interesting to note that in this case there's no requirement that the ES actually be in the DMZ. It could be some entirely different third party server on the network—after all, this is how SIP works—which puts a somewhat different spin on the above security argument.
3. They also told me that they eliminate the TCP checksum computation and that this sped up intermediate routers. As far as I know, checksum computations are not a significant part of packet processing overhead.

UPDATE: Ed Felten hypothesizes that the 48-bit "digital signature" is actually a MAC. Another possibility is that it's not data dependent at all; it's just a secret value that gets appended to the packet. That would be consistent with the "randomized every few milliseconds" claim, since a MAC would be different for every packet. Needless to say, if that's what it is, it's not adequate.

Posted by ekr at 8:37 AM | Comments (2)

May 13, 2007

Encryption and wiretapping

Bellovin writes:
Those who remember the Crypto Wars of the 1990s will recall all of the claims about "we won't be able to wiretap because of encryption". In that regard, this portion of the latest DoJ wiretap report is interesting:
Public Law 106-197 amended 18 U.S.C. 2519(2)(b) to require that reporting should reflect the number of wiretap applications granted for which encryption was encountered and whether such encryption prevented law enforcement officials from obtaining the plain text of communications intercepted pursuant to the court orders. In 2006, no instances were reported of encryption encountered during any federal or state wiretap.

The situation may be different for national security wiretaps, but of course that's where compliance with any US anti-crypto laws are least likely. There was no mention of national security or terrorism-related wiretaps in the report, possibly because they've all been done with FISA warrants.

This is interesting data, but consider if you will the contrary interpretation: encrypted telephony has seen practically no deployment. During the crypto wars it was widely believed that if the government just got out of the way encryption would become ubiquitous. But export controls were loosened in 1999 and that still obviously hasn't happened. The one exception, of course, is that mobile communications are often encrypted for transmission over the air interface, but (1) they're not end-to-end encrypted so you can wiretap at the junction with the PSTN and (2) the algorithms have historically been quite weak. So, all this really does is impede wiretapping by RF collection, which is an issue for intelligence agencies but not really for law enforcement, which can just serve a warrant on the mobile provider. So, who won the crypto wars again?

Posted by ekr at 7:55 AM | Comments (3)

May 1, 2007

One-time passwords for e-banking

As I've mentioned previously, simple password systems suffer from a variety of capture and replay attacks. All the best solutions involve cryptographic authentication but this involves changing both client and server, which is a pain (the client is the real problem). One of the early approaches to this problem was to give the users physical cryptographic tokens which they could use to generate a supplemental password. This had the big advantage that it could be deployed without changing the user's client software.

The most successful of these tokens was the SecurID card (now part of RSA). A SecurID is a card with an LCD display that generates a new "random" value every 30 seconds. The server side is synchronized to the token and so can verify the token value. I mention this because VeriSign is introducing a credit card with a similar technology, aimed at login for e-banking.

Rosch explained that, at this point anyway, the cards would not be geared toward online retailers, like Amazon.com. Instead, they're aiming the concept at businesses and consumers who set up online accounts, like banks, brokerages and PayPal.

The cards would hold an algorithm that could generate the six-digit passwords, which are only good for 30 seconds.

When a consumer wanted to log onto her online banking account, for instance, she would log on with her user name and password, as usual. Then the site would ask for her secondary password. She would press a button on her credit card and the numerical password would flash up on the LCD screen. The next time, she needed to log into her account, her card would give her a different number, which the site would match up with the card's unique serial number, which corresponds to the algorithm it uses.

...

Rosch added that even if a key logger planted surreptitiously on the user's computer picked up that second password, a hacker wouldn't be able to use it because subsequent transactions would require a different password.

This certainly seems reasonable as a phishing countermeasure. Web login and telnet are similar in a number of respects—in particular in the desire to avoid touching the client. It shouldn't be too hard to train users to type in the password from the LCD, given that it's going to appear more or less next to the credit card number. My main concerns would be the cost and durability of the cards. Durability is especially an issue. I have no particular information about durability of the ICT cards, but a card with electronics and an LCD, plus a magstripe seems likely to fail more often than just a magstripe. Even if the cards can be made as cheap as mag stripe cards, if they break a lot that means a lot of calls to customer service to get new cards, and customer service calls are expensive.
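To give a feel for how a six-digit, 30-second code can work, here's a minimal sketch. I don't know what algorithm these cards (or SecurID) actually run, so I've swapped in the openly specified HOTP/TOTP-style construction: HMAC-SHA1 over a time-step counter, truncated to six digits. The function names and the one-step drift allowance on the server side are my assumptions.

```python
import hashlib
import hmac
import struct

TIME_STEP = 30   # seconds each code is valid for
DIGITS = 6


def one_time_code(secret: bytes, unix_time: int) -> str:
    """Card side: derive the code for the 30-second window containing unix_time."""
    counter = unix_time // TIME_STEP
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in HOTP
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** DIGITS).zfill(DIGITS)


def server_accepts(secret: bytes, submitted: str, unix_time: int,
                   drift_steps: int = 1) -> bool:
    """Server side: recompute codes for nearby windows to tolerate clock drift."""
    for step in range(-drift_steps, drift_steps + 1):
        t = unix_time + step * TIME_STEP
        if t < 0:
            continue
        if hmac.compare_digest(one_time_code(secret, t), submitted):
            return True
    return False


if __name__ == "__main__":
    secret = b"per-card secret provisioned by the issuer"
    code = one_time_code(secret, 1_000_000)
    print(code, server_accepts(secret, code, 1_000_020))
```

The server needs the same per-card secret, which is why the site has to match the submitted number against the card's serial number. A keylogger that captures one code gets something that stops verifying within a window or two, assuming the server also refuses to accept the same code twice.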

I note that they don't seem to be targeting this towards online retailers. That's not surprising since getting that right seems a lot harder. Online retail is a substantially less close fit for the password login model. In particular, merchants want to be able to both batch transactions and retain the credit card numbers for future transactions (think Amazon one-click). Obviously either of these is inconsistent with at least naive implementations of very short-term authenticators. This isn't to say it's not possible to make something along these lines work for credit cards, just that it's more complicated.

Posted by ekr at 10:21 PM | Comments (6)

April 9, 2007

Key equivalence III: Asymmetric Techniques

As I mentioned earlier, all symmetric authentication mechanisms have some level of key/password equivalence. However, this can be removed with asymmetric (aka "public key") techniques.

The simplest such technique is very much like a challenge/response technique except that the response sent by the AP is a public key digital signature. The way that this works is that the VP knows the AP's public key but not the AP's private key. The VP then provides the AP with a random challenge and the AP returns: Sign(Kpriv, Challenge).1 This technique is what's used in SSL client authentication and in the SSH RSA and DSA modes (sometimes used with certificates). A related technique is to have the AP have an encryption/decryption key rather than a signature key and have the VP encrypt a message under that key (this is how SSL server authentication often works).
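Here's a toy version of the signature-based exchange, where the AP signs the VP's challenge with its private key and the VP checks it with the public key. It uses textbook RSA with a hard-coded modulus built from two Mersenne primes and no padding, which is nowhere near secure; it's only meant to show who holds what. The names are mine, and a real system would use a vetted library and a proper signature scheme.

```python
import hashlib
import secrets

# Toy key material: far too small and too structured for real use.
P = 2 ** 127 - 1
Q = 2 ** 89 - 1
N = P * Q                                  # public modulus, known to the VP
E = 65537                                  # public exponent, known to the VP
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent, known only to the AP


def digest_as_int(challenge: bytes) -> int:
    """Hash the challenge and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def ap_sign(challenge: bytes) -> int:
    """AP side: sign the challenge with the private exponent."""
    return pow(digest_as_int(challenge), D, N)


def vp_verify(challenge: bytes, signature: int) -> bool:
    """VP side: check the signature using only the public key (N, E)."""
    return pow(signature, E, N) == digest_as_int(challenge)


if __name__ == "__main__":
    challenge = secrets.token_bytes(16)    # fresh random challenge from the VP
    print(vp_verify(challenge, ap_sign(challenge)))
```

Notice that nothing the VP stores lets it produce a valid response itself, which is exactly the key-equivalence property the symmetric schemes lack.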

So, if these techniques are so great, why aren't they used all the time? There are a number of reasons:

  1. Public keys are hard to work with. They require the AP to have a public key pair stored on a disk somewhere (impairing portability) and require some way to carry a fairly heavyweight data object (~100 bytes of binary data) to the server. People put up with this for SSH but they don't like it.
  2. It doesn't provide mutual authentication (and inherently can't because the server only has the public key). Obviously, you can have the server have a public key pair as well, but that makes the key management problem even more annoying.

The first problem is soluble, at least in part if you're willing to trade in some security. The idea here is that you generate your key pair by hashing your password. Then the AP doesn't need to store a public key (again, assuming that you have some other method of authenticating in the other direction). The security tradeoff here is that an attacker can now mount a dictionary attack on your communications with the server unless they're encrypted. He just captures a transcript and then keeps generating trial passwords until he finds one that generates the right private key and hence signature. Of course, this problem already existed with challenge/response-based password mechanisms so the problem hasn't been made any worse.

We can also improve the problem of the key that gets copied to the VP by storing a hash of the key rather than the full key. The AP then needs to provide the public key which can be compared to the hash. This brings the size of the data stored by the VP down to about 128-160 bits. An alternative is to simply give the VP the password on initial registration and let him compute the private/public key pair and then "forget" the password. This obviously needs to be used with some technique to ensure that the private keys are different for multiple VPs even if you use the same password all the time. None of these techniques solve the mutual authentication problem, which needs to be attacked by other means.

This brings us to the topic of zero-knowledge password proofs (also known as password-authenticated key agreement). These protocols use public key techniques to allow two parties who jointly share a password to establish a shared key which is not accessible to any attacker. These protocols can also be constructed so that the VP does not have a password/key equivalent. This differs from the easier-to-understand public key techniques in two important ways. First, they support mutual authentication natively. Second, they don't allow the attacker to mount a dictionary attack on a single connection. For each guess he wants to check he needs to do an online communication with one side. The major remaining attack is that the VP can mount a dictionary attack on his stored value in an attempt to recover the AP's password (though of course the AP/VP terminology is less useful in a mutually authenticated environment). Short of using high-entropy passwords/keys, this attack doesn't seem removable.

1. Actually standard practice is for the AP to provide some randomness of its own, but the attacks where that's relevant aren't that important for understanding the concept.

Posted by ekr at 10:07 PM | Comments (2)

April 7, 2007

Key equivalence II: Symmetric Cryptography

In our previous episode, we talked about key equivalence in physical locks and password systems. As you'll recall, conventional password systems have the problem that the authenticating party (i.e., the user, hereafter called the AP for generality) needs to provide their password to the verifying party (VP, i.e., the server). This has (at least) two bad properties:
  1. An attacker who can intercept your communication with the verifying party or who temporarily controls the verifying party can capture your authenticator (password) when you use it to log in and use it to impersonate you to that verifying party.
  2. An attacker who can intercept your communication with the verifying party or who temporarily controls the verifying party can capture your authenticator (password) and use it to impersonate you to other verifying parties with which you used the same password (and you know you do).

The way to solve the first problem is to have a protocol that allows the AP to prove that they know the password without actually revealing it to the VP. The standard solution to this is what's called a challenge-response protocol. The VP provides the AP with a randomly chosen challenge (technically the challenge just has to be one the VP hasn't used before, but this is almost always chosen randomly) and the AP computes some one-way function of the password/key as the response. The VP stores a copy of the password/key and can thus independently recompute the response. If they match, then the VP knows the AP is who he says he is (or at least knows the password/key).
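As a rough sketch of the exchange just described, the following uses HMAC-SHA256 as the one-way function. That choice, the 16-byte challenge size, and the function names are all assumptions for illustration; the post does not name a specific function.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """VP side: pick a fresh random challenge for this login attempt."""
    return secrets.token_bytes(16)

def compute_response(key: bytes, challenge: bytes) -> bytes:
    """One-way function of the password/key and the challenge (HMAC here)."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def vp_check(stored_key: bytes, challenge: bytes, response: bytes) -> bool:
    """VP side: recompute the response from its own copy of the key."""
    expected = compute_response(stored_key, challenge)
    return hmac.compare_digest(expected, response)

shared_key = b"correct horse battery staple"   # known to both AP and VP

challenge = make_challenge()                         # VP -> AP
response = compute_response(shared_key, challenge)   # AP -> VP
print(vp_check(shared_key, challenge, response))
```

A response captured for one challenge is useless against a later one, because the VP issues a new challenge each time. The password itself never crosses the wire, but notice that `vp_check` needs `stored_key`, which is exactly the issue taken up next.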

But wait, last time I said that it was bad for the VP to have the password:

This has a big problem. If someone breaks into the server and gets a copy of the password list they get a copy of everyone's password and can impersonate users. This is what's called a password equivalent or a key equivalent for reasons that will become clear a little later. This lets them leverage a disclosure exploit (i.e., one that lets them read files on a system) into an intrusion exploit (i.e., one that lets them break in or pose as another user). It also means that the password file has to be stored with very strict permissions.

Previously, we solved this problem by storing the hash of the password, but that worked because the AP gave the VP the password to hash. In a challenge-response system the VP needs to independently compute the response. Now, you can of course compute the response based on the password hash rather than the password, i.e., response = F(challenge, H(password)) but that doesn't solve the problem because the VP's password file contains H(password). So, while you don't actually have the password you have a value which is equivalent to it, hence the term password equivalent. Anyone who compromises the password file can impersonate the AP to the VP. So, we've solved the problem of someone intercepting1 the authentication exchange being able to impersonate the AP but we've actually made the problem of password file theft worse.

We can improve the problem somewhat by making sure that each VP has a different password. Then at least you can't compromise one VP and use it to attack another. Of course, it's not practical to believe that people will actually use a different password for each of the 30 web sites they have logins for, but you can solve this problem by hashing in the name of the VP to the stored password. I.e., the VP stores H(VP-name, password)2 and the response is computed using that value as the input. So, if you get at a VP's password file you can impersonate APs to that VP, but not to any other VP. This is an improvement (call it weak password equivalence), but it's not perfect. However, it's the best we can do with symmetric cryptography. In our next installment, we'll see how to improve the situation still further.
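Here is a minimal sketch of the H(VP-name, password) idea. The length-prefixed encoding, SHA-256, the HMAC response function, and the example site names are assumptions chosen for illustration, not something the scheme above prescribes.

```python
import hashlib
import hmac
import secrets

def vp_verifier(vp_name: str, password: str) -> bytes:
    """Value a given VP stores: H(VP-name, password).

    The name is length-prefixed so that different (name, password) pairs
    cannot collapse into the same input bytes.
    """
    name = vp_name.encode()
    data = len(name).to_bytes(2, "big") + name + password.encode()
    return hashlib.sha256(data).digest()

def compute_response(verifier: bytes, challenge: bytes) -> bytes:
    """Challenge-response keyed by the per-VP verifier."""
    return hmac.new(verifier, challenge, hashlib.sha256).digest()

password = "same password everywhere"
bank_file = vp_verifier("bank.example", password)   # stored by the bank
shop_file = vp_verifier("shop.example", password)   # stored by the shop

# An attacker who steals the bank's file can answer the bank's challenges...
challenge = secrets.token_bytes(16)
forged = compute_response(bank_file, challenge)
print(hmac.compare_digest(forged, compute_response(bank_file, challenge)))

# ...but the same stolen value does not produce valid responses for the shop.
print(hmac.compare_digest(forged, compute_response(shop_file, challenge)))
```

The stolen value is still a key equivalent for the VP it came from, which is why this only earns the label weak password equivalence.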

1. Well, mostly. An attacker can still mount a man-in-the-middle attack on a single authentication, and then pose as the AP for the duration of that session, but he can't reuse the captured authenticator later. Moreover, this attack can be fixed by binding the challenge-response to a cryptographically protected channel between client and server. One example of this is TLS pre-shared key mode (RFC 4279).

2. Yeah, I'm sure you'd rather use HMAC, but a hash is close enough to get the idea across and is mostly secure in most settings.

Posted by ekr at 4:23 PM | Comments (1)

April 6, 2007

Key equivalence and why it's bad (I)

A British jail is changing all its locks because the keys were shown on TV:
An ITN team mistakenly filmed keys on a visit to Feltham Young Offenders' Institution, West London — sparking fears they could be copied.

It meant the nick's 11,000 locks and 3,200 keys all had to be replaced.

First, I'm fairly skeptical that you can reverse-engineer the keys for a lock based on just seeing the key on TV (and unless the lock is incredibly badly engineered, I don't see how you can do it with the lock) unless it's some extreme close-up shot, in which case it should be easy for the jail to figure out what keys were compromised and just rekey them, rather than the whole jail. Second, keys are just part of the jail defense-in-depth system, so hopefully compromise of keys isn't a disaster. After all, it's not that hard to pick most locks, so you can't count on only the lock anyway.

In general it's not that attractive a property of a security system that just seeing one of the elements allows the attacker to break the system. This is sort of inherent in the construction of ordinary physical locks but even there you could improve the situation a bit by (for instance) putting the beveled sections of the key on the inside rather than the outside so just looking at the key doesn't reveal much information. It's of course harder to cut keys that way with conventional cutting machines but arguably that's a feature since it means that you need specialized equipment to duplicate the keys1, which presents a modest barrier. The bottom line is still that with physical lock systems if you can examine the key (even briefly) or the lock (sometimes quite extensively) you can typically figure out what the key looks like enough to get in.

In digital security systems, by contrast, we can do quite a bit better. Let's start by talking about a simple password system like you would use to log in to your bank (and like people used to use to log in to their computers back when they were multiuser). The way this works is that you type your password into your Web browser and it's sent over the Intertubes (hopefully encrypted with SSL) to the server on the other side, which needs to check it. The easiest way to do this is to have the server just store a copy of the password locally and do a memory comparison.

This has a big problem3. If someone breaks into the server and gets a copy of the password list they get a copy of everyone's password and can impersonate users. This is what's called a password equivalent or a key equivalent for reasons that will become clear a little later. This lets them leverage a disclosure exploit (i.e., one that lets them read files on a system) into an intrusion exploit (i.e., one that lets them break in or pose as another user). It also means that the password file has to be stored with very strict permissions. The fix for this problem is well known. You don't store the password itself but rather you store a one-way function (originally computed with DES but now typically with a hash function) of the password. Call this H(Password). When the user provides their password you compute H(password) and compare it to the stored value. If they match, the user is in. This scheme has the advantage that compromise of the password file is much less dangerous. In fact, on old Unix systems password files used to be publicly readable until it became clear that you could simply try a bunch of candidate passwords until you got a hash that matched (this is called a dictionary search) at which point we went back to hiding the passwords. Even so, a dictionary search is a lot harder than just reading the passwords off the disk.

Even with this fix, simple passwords have the big problem that if you can convince the user to authenticate to you just once then you know their password (it doesn't help here that users tend to use the same password on multiple sites). This is the basis of both (pre-SSL) password sniffing attacks and of phishing. So, the state we have now is that we can make examining the lock basically useless (as long as people choose really strong passwords) but since authenticating requires presenting a copy of the key, if you can examine the key (e.g., by impersonating the lock) you can impersonate the user as much as you want. This is the state of nearly all Web-based login systems today, but it can be improved upon quite a bit by some cryptography. I'll get to that next.

1. By contrast, the major security feature on "do not duplicate" keys is often the stamp that says "DO NOT DUPLICATE" (the capital letters are what make it mandatory2). Sometimes, but not always, the blanks are restricted, but obviously the stamp has nothing to do with that.

2. In this document, the keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", and "MAY" are to be interpreted as described in RFC 2119.

3. Note to advanced readers, don't bother me about timing analysis. I'll try to write that up later.

UPDATE: In the comments, Chris WalshByrd reminds me that someone actually has copied a Diebold key from a picture on a web site. I haven't seen the relevant picture, but I suspect it's a lot better than your average picture on a TV, which tend to be taken from funny angles and be fairly low resolution.

Posted by ekr at 9:10 AM | Comments (2)

March 9, 2007

Remote control airplanes

According to this article Boeing is designing an "Uninterruptible Autopilot System". The idea here is that if the plane is being hijacked you trigger the UAS. Once it's engaged, the plane can't be controlled by the pilot but is remotely controllable from the ground.

It's important to remember here that there are two kinds of hijacking:

This seems to be mostly targeted at the second type of hijacking, since the first type depends principally on people complying under a threat and this threat is at least 95% as credible when the pilot is on the ground as if they're in the air with you. You just threaten someone on the plane instead of the pilot. On the other hand, if your objective is to fly the plane into some building, presumably the flight controllers on the ground won't allow that no matter how much you threaten to kill passengers.

That said, it's not entirely clear that 9/11-style hijacking can work even without this technology. If you're a passenger on a hijacked plane and you expect the hijackers to fly it into a building you've got a pretty good incentive to try to take back the plane regardless of what weapons the hijackers have. As has been often observed, 9/11 was successful because people's responses to hijacking were predicated on the assumption that it was the first type—who had heard of the second?

From a security guy's perspective, the interesting question is what safeguards are in place to prevent accidental activation and control by attackers? Imagine you get your hands on the control unit for an aircraft and can somehow get it put into UAS mode. Congratulations, you've got yourself a remote controlled, manned cruise missile. Presumably there are safeguards in place to defend against this sort of attack. Minimally, one would hope that some sort of cryptography is used to limit control to authorized units. The COMSEC techniques here are of course relatively straightforward.

The second thing you would like is for it to be impossible to remotely put the plane into UAS mode, thus minimizing the damage from any compromise of a control unit/control unit keying material. For instance, the UAS radio receiver could be physically disconnected until engaged onboard the plane.

Even if proper COMSEC techniques are used, there still is a residual risk if it's possible to jam the signal. In order for the system to work, it needs to be fairly sensitive so that any attempt to take over the cockpit triggers it (hence the proposed pressure sensors on the cockpit door). But this potentially enables an attacker onboard to trigger the UAS while a confederate on the ground holds the plane hostage by jamming the control signal. Again, there are techniques to make signals harder to jam (though the authentication techniques you're likely to use make the control channel very sensitive to errors so you'd need a lot of forward error correction).

I'd certainly be interested in hearing more about the design of the proposed system.

Posted by ekr at 10:15 PM | Comments (2)

January 12, 2007

You don't mind that my mail goes to Gmail, right?

The NYT reports that a lot of users forward their corporate e-mail to external Webmail accounts:
A growing number of Internet-literate workers are forwarding their office e-mail to free Web-accessible personal accounts offered by Google, Yahoo and other companies. Their employers, who envision corporate secrets leaking through the back door of otherwise well-protected computer networks, are not pleased.

...

Corporate networks, which typically have several layers of defenses against hackers, can require special software and multiple passwords for access. Some companies use systems that give employees a security code that changes every 60 seconds; this must be read from the display screen of a small card and typed quickly.

That is too much for some employees, especially when their computers can store the passwords for their Web-based mail, allowing them to get right down to business.

I'm sure annoying authentication schemes are part of the problem, though most of the organizations I know about only require you to use your SecurID card to make a VPN connection. Not that that's not annoying enough...

In my experience, the problem isn't security but rather usability. Probably the most important factor is remote access. People want to read their e-mail on their Blackberries and the companies often haven't been that good about installing the connecting software that the employees would need to do that. So, the employees install their own connectors on their desktop. Actually, remote access in general is a problem. Webmail is super-convenient and lots of companies don't or won't offer it. But you can help yourself by just forwarding your e-mail to Gmail.

Finally, there's the usability issue. Many enterprises run Exchange and expect their employees to use the matching MS e-mail clients. The reports I've heard from people who've tried are not exactly encouraging. On the other hand, Gmail's interface is actually pretty good, and you can also use Gmail with more or less any e-mail client of your choice.

Lawyers in particular wring their hands over employees using outside e-mail services. They encourage companies to keep messages for as long as necessary and then erase them to keep them out of the reach of legal foes. Companies have no control over the life span of e-mail messages in employees' Web accounts.
This is absolutely a real concern, but it's a mistake to focus on e-mail here. It's actually incredibly difficult to avoid creating archival copies of sensitive information. First, many (most?) e-mail systems make copies to the local disk to enable offline work. At this point, it tends to end up in scheduled backups. Even if you manage to suppress this by forcing everyone to work offline or implementing local expire, employees routinely save data to disk and then it gets backed up onto permanent backups. Creating access control and retention policies that stick with the data through this kind of transformation is nigh-impossible with any operating system in common use (there's a close relationship between this problem and multi-level security, by the way). And this is if you control the systems people use. It's of course massively harder when you don't.1

"If employees are just forwarding to their Web e-mail, we have no way to know what they are doing on the other end," said Joe Fantuzzi, chief executive of the information security firm Workshare. "They could do anything they want. They could be giving secrets to the K.G.B."
OK, but this doesn't make any sense. First, if your employees want to give your secrets to the KGB, what they need isn't e-mail, it's a time machine. Second, if they want to give out your secrets, they're not going to forward them to Gmail, they'll bring a flash drive to work and copy all their data onto it. It makes some sense to be concerned about inadvertent information disclosure by employees, but once you assume that you're in an adversarial relationship then you've pretty much lost.

Paul Kocher, president of the security firm Cryptography Research, said the real issue for companies was trust. "If you can't trust employees enough to use services like Gmail, they probably shouldn't be working for you," he said.

I certainly agree that if you can't trust your employees not to intentionally give out your confidential information you're in big trouble, but I don't think it's right to extend this to whether you can trust them to comply with all your corporate IT policies. Just from reading this article (and from my personal experience) it's clear that if you followed that policy you'd have to fire a lot of your employees, including good ones—people who are at least to some extent trying to act in the best interests of the company by working more efficiently.

At a higher level, the relationship between corporate IT departments and individual users is often quite adversarial. The IT departments want to standardize everyone on a particular set of software and services and the users want to use software and services of their choice. When the official IT offerings become too restrictive (in the minds of the users) they often resort to self-help, as in this case.

1. There was some interest for a while in using various kinds of cryptographic techniques for this, but it never really took off and was still hard to get right.

Posted by ekr at 9:21 AM | Comments (2)

December 16, 2006

Skype voice stress analysis

KishKish has released a Skype add-on that does voice stress analysis (VSA) (þ ITwire). The American Polygraph Association (not exactly an unbiased source) claims that VSA doesn't work, but let's say it does work. How hard is it to counteract? The high-tech way is to build a filter that removes the signal that the analyzer on the other end is looking for. This probably isn't that hard, especially since the developers of the filter can use a local copy of the analyzer as an oracle to figure out whether they've got it right or not. The low-tech way to do this is to run a local copy of the voice stress analyzer and use it as a biofeedback monitor to detect when the analyzer on the other end would think you were lying. Of course, if you're dealing with someone who is running voice stress analysis on your phone call, you might consider finding new people to talk to.

Posted by ekr at 8:38 PM | Comments (1)

December 13, 2006

More quantum cryptography hype

Hack report has an interview with MagiQ CEO Bob Gelfond in which he claims that quantum cryptography is almost ready for prime time:
What's the standard price for the appliance and what is included?

For a point-to-point encryption you need a system that consists of 2 appliances - sender and receiver. The cost is around $100k, plus support depending on your needs.

When do you think we'll see service providers offer quantum cryptography services to their end-customers?

This will happen within one year and we'll see fairly wide adoption within the next three years. We are working with big carriers such as Verizon and AT&T as well as some companies that own fiber networks. The goal is to embed quantum cryptography into the technology infrastructure so it becomes totally transparent to the end-user. For example, if you are already leasing a fiber line, you can then add an extra level of security by activating the quantum service. The whole thing won't be disruptive to your infrastructure and it can sit on top of whatever you are using now. Since it won't interfere with your existing technology you can have a fall back mechanisms to switch back to whatever you have today.

The important thing to remember is that the security guarantees of quantum key exchange (such as they are) only apply when you have a direct link between point A and point B--i.e., if you're renting a fiber from AT&T between two offices. They don't apply if the data is being packet switched, such as in the kind of MPLS-style virtual circuit that people typically buy (because buying a dedicated fiber is too expensive). So, if AT&T sells you a QC-protected line, it just goes to one of their routers. Of course, AT&T could have a QC-link for each hop, but that's not an end-to-end security guarantee. You now have to trust each router in every data center.

Moreover, the current limit on a QC link is about 120 km. After that you have to use repeaters, which creates another potential point of compromise.

Apart from the usual high assurance customers, do you see any other industries that can benefit from (and justify) a quantum cryptography solution?

I think so, anyone who has to store and secure records for a number of years will benefit from it. One strategy eavesdroppers can deploy is to capture everything they can get their hands on. Even if they can't decrypt it today, they might be able to do that in a few years down the road. So the only way to defend against that is to use quantum cryptography. You have to make sure it's not just secure today but also going forward. Take healthcare for example, they have an obligation to protect my healthcare data forever. The real threat is that while theoretically current systems might be impossible to crack, the reality is that keys are not flipped frequently enough or might not be stored securely. All that can be used by an attacker to start a brute force attack. So if you have enough repeats it might just take them a couple of days to break them. And many companies do not flip their keys very frequently since it's a time-consuming task. In contrast if you deploy our system -- keys get flipped every few seconds -- automatically.

This argument confuses several points. What you have to know is that quantum cryptography systems like MagiQ's are actually used in what's called "quantum key exchange" mode. The bit rate of the quantum cryptography system isn't high enough to carry data so you use it to exchange keys which are then used in a conventional cipher like AES to encrypt the actual data. So, in that respect, QC systems are quite a bit like a conventional cryptographic protocol like SSL/TLS or IPsec, but with the QKE replacing the Diffie-Hellman/RSA/whatever.

Gelfond's claim here is that the attackers are going to get their hands on (either by capturing off the network or getting access to your stored data) your encrypted data and mount a brute force attack on it in the future. In order for this to be plausible, one has to assume one of three things:

Case 1 is exactly the same for QKE and conventional systems, since however you exchanged the keys you have to store them somehow. In fact, if you establish a lot of unrelated keys frequently then in some sense this makes the situation worse because you need to store them somewhere that has a lot of space which makes using really secure storage methods more difficult. This is one reason people tend to store their keys under some master key. So, there's a tradeoff here that doesn't clearly favor QC.

Cases 2 and 3 are sort of the same. In both cases we assume you don't have the keys but you do have the ciphertext. Now, there are two major reasons why in this attack model you might want to use a lot of keys rather than one. The first is that old-style ciphers (e.g., Enigma) were often easier to attack if you had a large amount of ciphertext or ciphertext/plaintext pairs. I think this is what Gelfond means when he talks about repeats. This isn't really true for any modern algorithm except when the amount of ciphertext gets really huge (~10^10 bytes for DES, ~10^20 bytes for AES). This only really applied to analytic attacks in any case. In a brute force attack, one or two plaintext/ciphertext pairs are enough. The other reason you'd want to use a large number of keys is to slightly increase the attacker's work factor. If he has to attack 100 keys rather than 1, then it's 100 times harder for him. Mostly, if you get to the point where the cipher is so weak (case 3) that you have to worry about this you need a new cipher and QKE isn't going to help you much.
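For a feel for the numbers, here is some back-of-the-envelope arithmetic. It assumes the "really huge" ciphertext thresholds are the usual birthday bound on the block size (about 2^(n/2) blocks for an n-bit block), which is my reading rather than something stated above, and it expresses the 100-key work factor in bits.

```python
import math

def birthday_bound_bytes(block_bits: int) -> int:
    """Bytes of ciphertext at roughly 2**(n/2) blocks for an n-bit block."""
    blocks = 2 ** (block_bits // 2)
    bytes_per_block = block_bits // 8
    return blocks * bytes_per_block

des_bytes = birthday_bound_bytes(64)    # 2**32 blocks * 8 bytes  = 2**35
aes_bytes = birthday_bound_bytes(128)   # 2**64 blocks * 16 bytes = 2**68

print(f"DES: {des_bytes:.1e} bytes")    # DES: 3.4e+10 bytes
print(f"AES: {aes_bytes:.1e} bytes")    # AES: 3.0e+20 bytes

# Making the attacker break 100 keys instead of 1 multiplies his work by 100,
# which is under 7 extra bits of effective key length.
extra_bits = math.log2(100)
print(f"100 keys adds about {extra_bits:.2f} bits of work")
```

Under that assumption the results land on the same orders of magnitude as the figures quoted, and the last line shows why splitting traffic across many keys only slightly raises the attacker's cost.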

The other weird thing about this argument is that conventional systems are quite capable of rekeying frequently and if you use Diffie-Hellman, there's no real concern about exposure of long-term keys. Sure, people don't typically rekey their IPsec or SSL connections frequently, but it's a simple software change and certainly quite a bit more convenient than buying a bunch of gear from MagiQ.

Posted by ekr at 10:26 AM | Comments (1)