Should you check for weak keys?

Nearly all symmetric encryption algorithms use unstructured keys, i.e., bit strings of a certain length. If we model an encryption algorithm as a function E:

C = E(K, M)

where C is the ciphertext, M is the plaintext, and K is the key, then the way you generate K is simply to choose a random bit string of the appropriate length. The interesting thing is that some encryption algorithms have what are called "weak keys", which have special (bad) properties.

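In code, "generate a key" really is that simple. Here's a minimal sketch in Python (the 16-byte length is just an example for a cipher with 128-bit keys):

```python
# Keys for typical symmetric ciphers are just uniformly random bit strings.
# secrets.token_bytes() draws from the OS CSPRNG.
import secrets

key = secrets.token_bytes(16)  # 128 random bits; no structure to generate or verify
```
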
For example, DES has four weak keys, which have the property that encryption and decryption are the same operation. Thus, if you have a weak key K_w, then

D(K_w, M) = E(K_w, M)

DES also has six pairs of semi-weak keys: pairs of keys K_w1 and K_w2 such that:

D(K_w1, M) = E(K_w2, M)

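To make the weak-key property concrete, here's a sketch using PyCryptodome (an assumption on my part; any DES implementation that doesn't reject weak keys outright should behave the same way):

```python
from Crypto.Cipher import DES

WEAK_KEY = bytes.fromhex("0101010101010101")  # one of DES's four weak keys
block = b"ATTACK A"                            # a single 8-byte DES block

ct = DES.new(WEAK_KEY, DES.MODE_ECB).encrypt(block)

# With a weak key, encryption is its own inverse, so encrypting the
# ciphertext again gives back the plaintext, just as decrypting does.
assert DES.new(WEAK_KEY, DES.MODE_ECB).encrypt(ct) == block
assert DES.new(WEAK_KEY, DES.MODE_ECB).decrypt(ct) == block
```
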
Back in the old days, it was widely believed that you should make an attempt not to use weak keys (i.e., you would check generated keys and reject them if they were weak), but it seems to me that opinion has recently turned towards ignoring the issue. Why?

First, it's not clear how bad choosing a weak key is for a real algorithm. In the case of DES, for example, the weak keys still encrypt the data; they just do it in such a way that if you ever encrypt it again with the same key, it decrypts it instead. This is a problem in theory, but in practice it's hard to imagine a common scenario in which this would happen, especially when you're using the algorithm in CBC mode. Remember that encrypting the same data twice in CTR mode (with the same key and counter) always decrypts it, so if you want to build a secure system that's algorithm-agile you have to be careful about this anyway.

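The CTR observation is easy to see in code. A minimal sketch, again assuming PyCryptodome (the key and nonce here are demo values only):

```python
from Crypto.Cipher import AES

key, nonce = b"\x00" * 16, b"\x01" * 8   # demo values; never reuse a nonce in practice
msg = b"same data, encrypted twice"

ct = AES.new(key, AES.MODE_CTR, nonce=nonce).encrypt(msg)
# CTR is just XOR with a keystream, so encrypting again with the same
# key and nonce XORs the keystream back out and recovers the plaintext.
assert AES.new(key, AES.MODE_CTR, nonce=nonce).encrypt(ct) == msg
```
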
Second, the chance of randomly[1] choosing a weak key in DES is very low. Even if you count the semi-weak keys as weak, that's only 16 out of 2^56 keys, so the chance of choosing one is about 1/2^52. What do you figure the chance is that something will go wrong with your hardware or software and you'll fail to encrypt the data entirely?

But let's ignore these issues and consider a case where the weak keys are really weak, like they turn the encryption algorithm into the identity function. Now consider things from the perspective of an attacker. They're given some "ciphertext" X and want to try to decode it. To mount a brute-force attack, they do the following:

  1. For each potential key K_i:
  2. Compute M' = D(K_i, X).
  3. If M' looks like a plausible message, then exit.
  4. Else, try the next key.

If the cost of a decryption is d, the cost of checking whether a message is plausible is p, and there are k keys, then the average running time of this algorithm is about k(d+p)/2: on average you try half the keys, paying d + p for each.

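In code, that loop looks something like the sketch below; decrypt() and looks_plausible() are hypothetical stand-ins for the real cipher and a plaintext recognizer.

```python
def brute_force(ciphertext, keyspace, decrypt, looks_plausible):
    # Each iteration costs d (one trial decryption) plus p (one plausibility check).
    for key in keyspace:
        m = decrypt(key, ciphertext)
        if looks_plausible(m):
            return key, m   # on average this happens after ~half the keys
    return None             # wrong keyspace, or the plaintext isn't recognizable
```
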
Now, if there are w weak keys, then the attacker does a somewhat different algorithm:

  1. Check whether the message is plausible as-is (this covers all the weak keys at once).
  2. For each potential key K_i, skipping the weak keys:
  3. Compute M' = D(K_i, X).
  4. If M' looks like a plausible message, then exit.
  5. Else, try the next key.

The running time of this algorithm is p + (k-w)(d+p)/2, so the advantage to the attacker is (approximately) w(d+p)/2, and it starts to look like you should avoid weak keys. But consider that if you do that--and we assume that the attacker knows how your implementation behaves--then he can run the original algorithm, just skipping the weak keys, at a cost of (k-w)(d+p)/2, which is slightly less than his cost when the message sender just ignores the weak keys. The problem here is that it's the existence of weak keys, not their use by message senders, that reduces the key space of the algorithm.

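To see how small these differences are, here's some illustrative arithmetic with DES-sized numbers; the unit costs d and p are made up, and only the ratios matter:

```python
k = 2 ** 56      # total DES keys
w = 16           # weak plus semi-weak keys
d, p = 1.0, 0.1  # assumed relative costs of a trial decryption and a plausibility check

naive          = k * (d + p) / 2            # attacker ignores weak keys entirely
weak_aware     = p + (k - w) * (d + p) / 2  # attacker checks the ciphertext as-is first
sender_rejects = (k - w) * (d + p) / 2      # sender rejects weak keys, attacker skips them

# All three differ by a vanishingly small fraction of the total work.
print(naive, weak_aware, sender_rejects)
```
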
Now, in the particular case where the weak keys turn encryption into a no-op, you do have to worry about automatic traffic sniffers designed to data mine ordinary unencrypted traffic. Such packet sniffers would of course automatically pick up any "encrypted" data processed with weak keys. However, as a matter of practice, this doesn't happen with real algorithms. First, as previously noted, DES weak keys aren't the identity function. Second, the identity property would only apply in ECB mode, not in CBC or CTR. So, some processing would be required (even if it's much faster than trying a new key) before checking whether the plaintext was plausible, which brings us back to the analysis above, with a new term for the "fast" processing. Once again, discarding weak keys makes the attacker's job very slightly easier.

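The ECB point is worth seeing concretely: even if a weak key really did turn the block cipher into the identity function, CBC would still mangle the data, because each block is XORed with the previous ciphertext block. A toy sketch, with made-up 8-byte blocks and IV:

```python
def identity_encrypt(block):
    return block              # the hypothetical "really weak key": E is a no-op

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(blocks, iv, encrypt_block=identity_encrypt):
    out, prev = [], iv
    for block in blocks:
        c = encrypt_block(xor(block, prev))  # CBC: XOR with previous ciphertext, then encrypt
        out.append(c)
        prev = c
    return out

blocks = [b"ATTACK A", b"T DAWN!!"]
# Even with the identity "cipher", the CBC output is not the plaintext.
assert cbc_encrypt(blocks, iv=b"\x13" * 8) != blocks
```
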
I should emphasize that this issue is mostly theoretical. No popular algorithm has enough really weak keys to give you a meaningful chance of generating one by accident. But that's all the more reason not to incur the implementation complexity of bothering to reject them.[2]

1. Though note that all zeroes and all ones are both weak keys. One might imagine that if your PRNG failed, you would get one of these. So, as a special case, you might want to check for these to make sure your system is still functioning. On the other hand, there's always the chance that the check itself will screw something up...
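For what it's worth, that kind of check can be tiny; a hypothetical sketch (the helper name is mine, not from the post):

```python
def key_looks_alive(key: bytes) -> bool:
    # Reject only the degenerate patterns a dead RNG is likely to produce.
    return key not in (b"\x00" * len(key), b"\xff" * len(key))
```
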
2. The analysis here is about encryption. As this Wikipedia article points out, the situation is different in MAC modes, but standard practice these days is to use hash-based MACs rather than encryption algorithm based MACs--though of course those might have weak keys too.

3 Comments

"if you do that--and we assume that the attacker knows how your implementation behaves--then he can run the original algorithm, just skipping the weak keys"

That's only true if you can cheaply determine the list of all weak keys, which you might not be able to do. You might be able to tell if a key, once generated, is weak, without being able to quickly enumerate all weak keys in order to skip them. You'd need to add in the cost ø of determining whether or not a key is weak in your "skip the weak keys" calculation.

Right, where "cheaply" means cheaper than the cost of trying the weak key against the ciphertext (remember that the identity assumption is unrealistic). And of course, for DES at least, the list of weak keys is small and well-known.

The real problem with weak keys is what they say about the rest of the algorithm design.
