On the security of ZSK rollover

Richard Barnes pointed me to the joint ICANN/VeriSign presentation from RIPE 59 (today in Lisbon) on their plans for rolling out signing of the root zone. For those who aren't up on DNSSEC, each TLD (.com, .net, .us, etc.) will sign the domains under it, but the design also calls for the records for each of those TLDs to be signed at the root of the tree. There's some question about how important this really is from a technical perspective, but the DNSSEC community seems convinced (wrongly, in my opinion) that it's essential, so it's socially important even if not technically important.

Anyway, Richard pointed out something interesting to me: they plan to roll over the root Zone Signing Key (ZSK) four times a year (see slide 19) which doesn't really make sense to me. Actually, the whole key rollover scheme doesn't make much sense to me.

It might be helpful to start with a little background. The way things are going to work is this: ICANN is going to have a long-lived (2-5 years) Key Signing Key (KSK). The public half of this key will be built into people's resolvers. But the KSK will not be used to directly sign any user data. Rather, it will be used to sign a short-lived (3 months) ZSK [held by VeriSign] which will be used to sign the data. Because the relying party (i.e., your computer) knows the KSK, it can verify any new ZSK without having to get it directly.
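To make that concrete, here's a minimal sketch of the trust chain in Python (my illustration, using the pyca/cryptography package; the byte strings and key handling are stand-ins, not the actual DNSKEY/RRSIG wire formats):

```python
# Minimal sketch of the KSK -> ZSK -> data trust chain (illustrative only;
# real DNSSEC signs canonical RRsets, not raw byte strings like these).
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa, padding

def sign(priv, data):
    return priv.sign(data, padding.PKCS1v15(), hashes.SHA256())

def verify(pub, sig, data):
    pub.verify(sig, data, padding.PKCS1v15(), hashes.SHA256())  # raises if bad

# Long-lived KSK; the public half is baked into resolvers.
ksk = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Short-lived ZSK; its public half is signed by the KSK.
zsk = rsa.generate_private_key(public_exponent=65537, key_size=1024)
zsk_pub = zsk.public_key().public_bytes(
    serialization.Encoding.DER, serialization.PublicFormat.SubjectPublicKeyInfo)
zsk_cert_sig = sign(ksk, zsk_pub)

# Zone data is signed only with the ZSK.
record = b"example. IN A 192.0.2.1"
record_sig = sign(zsk, record)

# A resolver that trusts only the KSK verifies the new ZSK, then the data:
verify(ksk.public_key(), zsk_cert_sig, zsk_pub)
verify(zsk.public_key(), record_sig, record)
```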

Why are they doing this? As far as I can tell the rationale is as follows:

  • The security of RSA key pairs is directly connected to key length, which is also the length of the signature that the key pair produces (see the sketch after this list).
  • Space in DNS packets is limited.
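
You can see the first point concretely with a quick sketch (using Python's pyca/cryptography package; nothing DNSSEC-specific about it):

```python
# An RSA signature occupies exactly as many bytes as the modulus, so
# doubling the key size doubles every signature in the packet.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

for bits in (1024, 2048):
    key = rsa.generate_private_key(public_exponent=65537, key_size=bits)
    sig = key.sign(b"some zone data", padding.PKCS1v15(), hashes.SHA256())
    print(f"{bits}-bit key -> {len(sig)}-byte signature")
# 1024-bit key -> 128-byte signature
# 2048-bit key -> 256-byte signature
```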

The combination of these two factors means that if you want to use longer (higher security) key pairs to sign zone data, you start running into size limitations in the packet. That's perfectly understandable, but why does having two keys help? The idea here is that you have a big (2048-bit) KSK and a short (1024-bit) ZSK. But because the ZSK is changed frequently, you don't need as strong a key and can still get good security. I wasn't able to find a good description of this in the DNSSEC documents, but Wikipedia came through:

Keys in DNSKEY records can be used for two different things and typically different DNSKEY records are used for each. First, there are Key Signing Keys (KSK) which are used to sign other DNSKEY records and the DS records. Second, there are Zone Signing Keys (ZSK) which are used to sign RRSIG and NSEC/NSEC3 records. Since the ZSKs are under complete control and use by one particular DNS zone, they can be switched more easily and more often. As a result, ZSKs can be much shorter than KSKs and still offer the same level of protection, but reducing the size of the RRSIG/NSEC/NSEC3 records.

The only problem with this reasoning is that it's almost completely wrong, as can be seen by doing some simple calculations. Let's say we have a key with a lifespan of one year that requires C computations to break. An attacker buys enough hardware to do C computations in two months and is then able to use the key to forge signatures for the next 10 months (I'll try to write about keys used for confidentiality at some later point). If we think about a series of such keys, they will be vulnerable 10/12 of the time. Now, let's say that we halve the lifespan of the key to 6 months, which shortens the window of vulnerability to 4 months per key, or 2/3 of the time. But if the attacker just buys 2C of compute power, he can break each key in 1 month, at which point we're back to having the keys vulnerable 10/12 of the time. Generalizing this computation, if we increase the frequency of key changes by a factor of X, we increase the attacker's required workload by only a factor of X.
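
Here is the same calculation as a tiny Python sketch (the function is just my restatement of the argument above; the numbers are the ones from the paragraph):

```python
def fraction_vulnerable(lifetime_months, break_months):
    """Fraction of each key's lifetime during which the attacker can forge,
    assuming he needs break_months to finish the C computations."""
    return max(lifetime_months - break_months, 0) / lifetime_months

print(fraction_vulnerable(12, 2))  # 0.833... -> vulnerable 10/12 of the time
print(fraction_vulnerable(6, 2))   # 0.666... -> 2/3 with the same hardware
print(fraction_vulnerable(6, 1))   # 0.833... -> back to 10/12 once he buys 2C
```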

More concretely, if we originally intended to change keys every 4 years and instead we change them every quarter, that's a factor of 16 (4 bits) improvement in security. Opinions vary about the strength of asymmetric keys, but if we assume that 1024-bit RSA keys have a strength of about 72 bits [*], then this increases the effective strength to around 76 bits, which is somewhere in the neighborhood of 1100-bit RSA keys: a pretty negligible security advantage, and nowhere near the strength of a 2048-bit RSA key (> 100 bits of security). It's certainly not correct that this offers the "same level of protection".
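
Spelled out (a sketch using the 72-bit estimate assumed above):

```python
import math

quarters_per_old_lifetime = 4 * 4                   # 4 years -> 16 quarters
added_bits = math.log2(quarters_per_old_lifetime)   # 4.0 bits of extra work
effective_strength = 72 + added_bits                # ~76 bits
print(added_bits, effective_strength)  # 4.0 76.0 -- nowhere near the
                                       # >100 bits of a 2048-bit RSA key
```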

The more general lesson here is that changing keys rapidly is nearly useless as a method of preventing analytic attacks. It's almost never practical to change keys frequently enough to have a significant impact on the attacker's required level of effort. If you're that close to the edge of a successful attack, what you need is a stronger key, not to change your weak keys more frequently. In the specific case of DNSSEC, just expanding the size of the packet by 10 bytes or so would have as much security impact, if not more, at a far lower cost in system complexity.
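
To see why 10 bytes is the right order of magnitude (my back-of-the-envelope arithmetic, using the numbers from the previous paragraphs):

```python
# A signature grows one byte per 8 bits of modulus, so 10 extra bytes of
# packet buys ~80 extra modulus bits.
extra_signature_bytes = 10
extra_modulus_bits = extra_signature_bytes * 8   # 80
print(1024 + extra_modulus_bits)  # 1104 -- about the ~1100-bit equivalent
# that quarterly rollover of a 1024-bit ZSK was computed to give above
```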

10 Comments

There's another, somewhat more plausible justification for the KSK/ZSK setup: since the ZSK will be used every time a zone is updated, it will need to be in a more accessible--and therefore possibly less secure--location than the KSK. For example, if you have some kind of automated Internet-based update process for zone data, then the ZSK will have to be on a machine at least indirectly accessible from the Internet, whereas the KSK can be stored completely offline, on an air-gapped machine accessible only to personnel authorized to initiate a manual key-signing process. This rationale may even apply to the root in the future, if the planned gTLD expansion results in enough gTLDs that automated management becomes necessary.

Of course, I don't know whether the root-signing folks considered this issue at all, let alone whether it influenced their design. I completely agree with (what I take to be) your basic point--that people generally don't think too clearly about key management and "cryptographic hygiene".

For FIPS 186-3 compliance, an RSA modulus is either 1024, 2048, or 3072 bits; otherwise, the result is not a "NIST-approved" digital signature. (I.e., if you are a vendor, you are expected to comply, not to design improvements.)

It's a bit annoying that the suggested incremental increase in RSA modulus size could turn into a debate just because USG requirements are imposed on the DNS root signing solution.

Cryptography is not the only factor to consider. There are rarely used software paths in DNS server and resolver software that are only (or most likely only) exercised during key rollover. There are human factors: operational procedures need to be exercised frequently enough to ensure that they will work in case of an emergency (key compromise). And perhaps most importantly, DNSSEC keys for 'important' zones are a valuable asset and will most likely be the target of all kinds of non-cryptanalytic attacks, 'social engineering', etc.

This has all been discussed at length in the past on DNS-related discussion lists.

For comparison: servers in computing centers are not rebooted on a maintenance schedule, e.g. 2 to 4 times per year, because that is necessary for the OS and the stable software installed there, but in order to exercise the bootup procedures -- both the software scripts (which undergo frequent changes) and the operational procedures, including new staff.

Since most DNSSEC code has not yet been tested in such a large-scale deployment, a couple of updates to active DNS software should be expected during the DNSSEC rollout. These software revisions should be tested frequently enough for their key rollover behavior in large-scale deployment.

Thus, the quarterly ZSK rollover looks like an acceptable engineering compromise.

BTW: AFAICT, the KSK could be used 'forever' from a cryptanalytic point of view. The projected 2-5 year KSK rollover period mostly serves similar purposes: to verify and test the operational procedures that must be guaranteed to work correctly in case of an emergency rollover.

Yes, I've heard this testing argument before. I think it's basically incorrect. Because DNSSEC has no CRLs, you need short-lived keys anyway, and so you need to regularly re-sign the ZSK. This implies that most of the code paths used in rollover will be exercised regularly regardless of whether you change the keys or not. A small amount of testing should catch the rest of the code paths.

I don't understand your point about "operational procedures" as a point of vulnerability. Sure, that's true, but how does fast rollover fix it? Those operators have near-continuous access to the keys. If/when the key is compromised, then you replace it at the next interval. Replacing it five times in the 15 months beforehand doesn't buy you anything.

I realize that this has been discussed at length and there is some kind of consensus on the mailing list. I think that consensus is wrong. And given that the relevant RFC (4641) clearly misunderstands the cryptographic issues, I don't think that's a crazy position.


The code path issue is a myth. There is no special code path that is exercised by the resolver, validator or server during a key rollover.

There may be a political reason for all this as well. It is clear ICANN and VeriSign are just following requirements from the US government via its Department of Commerce. From other correspondence [http://www.ntia.doc.gov/comments/2008/ICANN_080730.html last two paragraphs] it is clear the intent of this department is to force-fit a split model that -- yes -- reduces security and stability in return for some perceived split of power and check between the two entities. But even with this arrangement, VeriSign can do anything they want in the DNS, with ICANN only able to fix/stop things at 6-month intervals. So it is not clear what splitting "roles" buys. It is clear from surveys like [http://ccnso.icann.org/surveys/dnssec-survey-report-2009.pdf page 7] who the international Internet community wants to just run the whole thing.

KSKs are used to sign only the DNSKEY RRset comprising the ZSK(s) and KSK(s). ZSKs are used to sign the remaining relevant RRsets in the zone (maybe even the DNSKEY RRset, depending on the implementation).

Since the amount of data signed by KSKs is much less than that signed by ZSKs, the KSKs are expected to be less vulnerable to cryptanalytic attacks, so they can be rolled over less frequently.

Does this sound like a valid line of thought?

No, it's not a valid line of thought.

1. Any well-constructed modern cipher does not give the attacker significant amounts of leverage from any plausible amount of plaintext/ciphertext available to him.

2. Public key algorithms allow the attacker to produce as many plaintext/ciphertext pairs as he wants simply by generating new data and encrypting it with the public key, so in this case the attacker doesn't even have more data processed with the ZSK.
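
To illustrate point 2, a quick sketch (using the pyca/cryptography package; the key size and pair count are arbitrary):

```python
# Anyone holding the public key can mint unlimited plaintext/ciphertext
# pairs, so the volume of data signed under the ZSK gives the attacker
# no extra cryptanalytic material.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

key = rsa.generate_private_key(public_exponent=65537, key_size=1024)
pub = key.public_key()

pairs = []
for _ in range(1000):          # as many as the attacker likes
    pt = os.urandom(32)
    ct = pub.encrypt(pt, padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                                      algorithm=hashes.SHA256(), label=None))
    pairs.append((pt, ct))
print(len(pairs), "pairs, zero interactions with the key holder")
```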
