How to build a biometric ID for employment verification

Yesterday I criticized the proposed requirement of a biometric identifier for work authorization. But just because it's a bad idea doesn't mean it's not interesting to design one. Let's start with the observation I made in the previous post: smart cards are unnecessary here; what you want is a cryptographically protected object that contains the relevant data, e.g.:

  • Name
  • Identification number
  • Biometric (photo, fingerprint, etc.)
  • Meta-data, e.g., whether they're allowed to work or not, expiration date, etc.

Note that the actual information we're carrying here is mostly irrelevant from the perspective of security design. The cryptography protects whatever opaque data we happen to want to cram onto the card.

Physically, we have a huge amount of latitude in how we design the card; because all the data is cryptographically protected, we don't need any physical tamperproofing features, just a format that will carry enough data (say 1-2k or so, depending on the size of the biometric) [1]. The data could be encoded on a mag stripe, smartcard (memory type, not necessarily active), or even a 2-d bar code such as a QR code (though we're probably not too far from the capacity limit here).

Digital Signatures
The natural approach is simply to use digital signatures. The federal government would generate a single central signing key and use it to sign everyone's cards. The public key would just be published somewhere (e.g., in the Federal Register) and so anyone could verify people's identification. (You could also use a multitier model where the central signing key certifies subsidiary keys, etc.) In any case, if we're going to use digitally signed data, we have to contend with the problem of compromise of the central signing key(s). Someone who stole such a key would be able to generate as many fake IDs as they wanted. Naturally, you'd want to store the signing key in a hardware security module, which substantially reduces the risk, since you would have to steal the module, not just the key. Still, that could happen.

Obviously, from the moment when the key is stolen, you can't trust any signatures which are generated with that key. On the other hand, the signatures generated before the key was stolen are just fine. The problem is distinguishing the two. It's natural to just publish the compromise and state that "any signatures after this date are bogus", but as Jacob Davies observes, once you control the signing key, you can make it generate any signature you want, including one over dates in the past. [Technical note: if the keys are embedded in an HSM, then you could program the module to always put the correct time in its signatures. However, attackers could potentially extract the key, so you can't really trust this level of guarantee.]

Dealing With Key Compromise
Key compromise is a standard problem with digital signatures and has a standard solution: you have some timestamping service which records when signatures were produced, thus differentiating valid from invalid signatures. The timestamping service can just be a separate signing system or (more securely) a hash chaining system such as that used by Haber-Stornetta timestamping [*].

In a system like this, all the signatures (i.e., identification cards) are linked together in a chain of hashes. When document i is signed, producing S_i, we compute H_i = Hash(H_{i-1} || S_i). Because the hashes are one-way, once we know any value H_j, we can verify that any prior value H_i is correct, provided that we know the intervening signatures S_{i+1}..S_j. The way such a system is used in practice is that the timestamping service (in this case the federal government) periodically commits to given hash values, for instance by publishing them in the Federal Register. From that point onward, it's not possible to create signatures with timestamps prior to the published value, even if the key is compromised, since they don't appear in the chain of hashes implied by the published hash value.
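To make the chaining concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hash and an all-zero starting value H_0; the function names and the placeholder "signatures" are made up for illustration and aren't part of any real timestamping system.

```python
import hashlib

def chain_step(prev_hash: bytes, signature: bytes) -> bytes:
    """H_i = Hash(H_{i-1} || S_i). SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(prev_hash + signature).digest()

def build_chain(signatures, h0=b"\x00" * 32):
    """Return [H_0, H_1, ..., H_n] for the signatures S_1..S_n."""
    hashes = [h0]
    for s in signatures:
        hashes.append(chain_step(hashes[-1], s))
    return hashes

def links_to_published(h_i, later_signatures, published):
    """Recompute forward from H_i through S_{i+1}..S_j and compare with H_j."""
    h = h_i
    for s in later_signatures:
        h = chain_step(h, s)
    return h == published

# Placeholder byte strings stand in for real card signatures.
sigs = [b"sig-1", b"sig-2", b"sig-3", b"sig-4"]
chain = build_chain(sigs)
published = chain[4]  # the value committed to, e.g., in the Federal Register

# H_2 is consistent with the published H_4 given S_3 and S_4.
genuine_ok = links_to_published(chain[2], sigs[2:], published)

# A signature made later with a stolen key was never fed into the chain,
# so the hash computed from it does not lead to the published value.
forged_h2 = chain_step(chain[1], b"backdated-forgery")
forged_ok = links_to_published(forged_h2, sigs[2:], published)
```

Here genuine_ok comes out true and forged_ok false: the attacker can sign whatever they like, but can't make the forgery lead to a hash value that was already published.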

Eliminating Signatures
It should be obvious at this point that it's possible to dispense with signatures entirely: simply use the published hash chain values to verify each document directly. In order for this to work well, however, the verifier needs to have access to the hash chain value for every signed document, which is excessively expensive in terms of data storage on the card. Conveniently, there are techniques for providing a more compact representation.

The most natural approach is to use a Merkle Hash Tree. Hash trees work kind of like a hash chain but they allow us to compress the verification information for a large number of entries into about log2(N) values. Say, for instance, we want to verify 1000 documents: we would need to store approximately 10 entries (the sibling nodes on the path from each entry to the root of the tree) on each card. The idea here would be that we would form a hash tree from all the identity cards produced in a given day and then use the hash tree root as the input into the hash chaining process. This would give us a mechanism for verifying any given card without any digital signatures at all, provided that we can obtain each hash chain value. Note that you just need a trusted path for the most recent hash chain value: intermediate values are verifiable from the trusted value.
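Here is a rough sketch of the hash tree idea, again in Python with SHA-256 as an assumed hash. The leaf/node prefixes and the "duplicate the last node on odd levels" rule are simplifications I picked for the example, not a recommendation; a real design would want to think harder about tree construction.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every level of the tree: leaf hashes first, root level last."""
    level = [h(b"\x00" + leaf) for leaf in leaves]  # 0x00 prefix marks leaves
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # odd level: pair the last node with itself
        level = [h(b"\x01" + level[i] + level[i + 1])  # 0x01 marks interior nodes
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def audit_path(levels, index):
    """Sibling hashes from leaf to root; this is what gets stored on the card."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired last node was hashed with itself
        path.append((level[sibling], index % 2 == 1))  # (hash, node is right child)
        index //= 2
    return path

def verify(leaf, path, root):
    """Recompute the root from a card and its path, then compare."""
    node = h(b"\x00" + leaf)
    for sibling, node_is_right in path:
        if node_is_right:
            node = h(b"\x01" + sibling + node)
        else:
            node = h(b"\x01" + node + sibling)
    return node == root

# One day's worth of (placeholder) cards.
cards = [f"card-{i}".encode() for i in range(1000)]
levels = build_levels(cards)
root = levels[-1][0]
path = audit_path(levels, 617)

path_len = len(path)                             # 10 siblings for 1000 cards
genuine = verify(cards[617], path, root)         # True
forged = verify(b"not a real card", path, root)  # False
```

With 32-byte hashes, the 10-entry path costs 320 bytes per card, which fits in the 1-2k budget above with room left for the biometric.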

Hybrid Systems
While this system is very secure against history rewriting due to key compromise, it's not very timely. You can't verify any identifier till you've seen a published hash value that post-dates it. This means that there is a direct tradeoff between the amount of data you need to ship around and how long it takes before an identifier becomes verifiable.

We can get past this problem with a hybrid system: digitally sign the root of each hash tree and then use hash chaining to link them together. Any identifier which claims to be older than the latest known hash chain value is directly verifiable without the signature. Any identifier which claims to be newer is verifiable by checking the signature. This means that a compromised key is only useful during the time period (weeks? months?) before the next hash chain value is published. Note that this isn't the time period during which an attacker can sign with the key but rather the time period during which one can verify signatures made with it. As soon as a relying party has a copy of the next hash chain value, they will see that the signature in question isn't incorporated into it (this also provides a tripwire for surreptitiously compromised keys).
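The relying party's decision rule in the hybrid scheme can be sketched as follows. Everything here is hypothetical: the Card fields, the notion of a numbered "epoch" per hash tree, and the two checker callbacks are names I made up for the example. In particular, the signature checker below is a toy stand-in, not real cryptography.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Card:
    epoch: int             # which hash tree (e.g., which day) the card claims
    tree_root: bytes       # root recomputed from the card's Merkle path
    root_signature: bytes  # issuer's signature over that root

def verify_hybrid(card: Card,
                  latest_published_epoch: int,
                  root_in_chain: Callable[[int, bytes], bool],
                  signature_ok: Callable[[bytes, bytes], bool]) -> bool:
    if card.epoch <= latest_published_epoch:
        # Old enough to be covered by a published hash chain value:
        # the signature is ignored, so a stolen key can't backdate a card.
        return root_in_chain(card.epoch, card.tree_root)
    # Newer than anything published: fall back to the signature.
    return signature_ok(card.tree_root, card.root_signature)

# Toy stand-ins for the two checks (assumptions of this sketch).
published_roots = {1: b"root-1", 2: b"root-2"}

def in_chain(epoch: int, root: bytes) -> bool:
    return published_roots.get(epoch) == root

def sig_ok(root: bytes, sig: bytes) -> bool:
    return sig == b"valid:" + root  # NOT real crypto, just a placeholder

old_card = Card(2, b"root-2", b"")
new_card = Card(3, b"root-3", b"valid:root-3")
backdated = Card(1, b"evil-root", b"valid:evil-root")  # signed with a stolen key

results = [verify_hybrid(c, 2, in_chain, sig_ok)
           for c in (old_card, new_card, backdated)]
```

The backdated card carries a signature that checks out, but it is rejected anyway because its claimed epoch is already covered by a published value that doesn't include it.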

Obviously this isn't the only design, but it's one that's reasonably well suited to the environment with a single issuer who issues a really large number of credentials and relying parties who only need to verify fairly infrequently.

[1] Note that if we want the cards to be usable without a scanner, then we need ordinary physical authentication and tamperproofing features of the kind used for driver's licenses, passports, etc., but this is orthogonal to the digital security features. Really, you have two separate identification devices in the same form factor, with the digital one being much stronger: if you can read the digital identifier, all the security features on the piece of plastic are redundant.


A different design would be to not use signatures at all, but instead store two random values on the card. The first would be an index number into some distributed database of values, and the second would be the key that would decrypt that data. Personal data would then be stored in a central database, encrypted with the second value. Anyone wanting to query the data would require a database lookup, but then the holder of the database could authenticate the querier and decide how much information to disclose. The use of the second key would allow the database to be distributed without compromise of the records. Of course, there are lots of other features to add here (like signing the entries, etc.).
