Software: March 2008 Archives


March 23, 2008

In response to lawsuits over missing emails, the White House claims to have been following some rather unusual IT practices:
"When workstations are at the end of their lifecycle and retired ... the hard drives are generally sent offsite to another government entity for physical destruction," the White House said in a sworn declaration filed with U.S. Magistrate Judge John Facciola.

It has been the goal of a White House Office of Administration "refresh program" to replace one-third of its workstations every year in the Executive Office of the President, according to the declaration.

Some, but not necessarily all, of the data on old hard drives is moved to new computer hard drives, the declaration added.

In proposing an e-mail recovery plan Tuesday, Facciola expressed concern that a large volume of electronic messages may be missing from White House computer servers, as two private groups that are suing the White House allege.

Facciola proposed the drastic approach of going to individual workstations of White House computer users after the White House disclosed in January that it recycled its computer backup tapes before October 2003. Recycling -- taping over existing data -- raises the possibility that any missing e-mails may not be recoverable.

Some initial observations:

  • Replacing one-third of the workstations every year means a three-year retirement cycle, which is fairly fast. For comparison, the IRS depreciation schedule for computers is 5 years.
  • It's not clear to me that the hard drive destruction issue is that relevant. When you convert from one machine to another, it's by far easiest to simply move your entire mail archive over, rather than picking and choosing. If you do that, the primary difference between the old and new computers in terms of what data is available is going to be remanent data from explicitly deleted messages, which obviously is not on the new machine. First, most mail systems store data in large flat files (yes, yes, I know about MH, but I think we can assume Karl Rove does not use that) or databases, so it's reasonably likely that anything that old will already have been reclaimed and written over. Second, I would really hope that if the White House wants to securely delete something, they do better than just hitting the delete key and hoping.
  • I wonder what mail server logs are available. Even if the data has been deleted, many mail servers keep extensive logs. This could be used both for traffic analysis and as a guide to what should be found with enough effort. Of course, there's always the chance of remanent data on the server as well.
  • What you want is to have confidence that the data you want retained really is retained and that the data you want destroyed really is destroyed, not to rely on the relatively unpredictable properties of your media. It doesn't sound to me like this policy really achieves either. Of course, there is always the possibility that the White House is playing dumb and/or lying, but incompetence wouldn't exactly shock me either.
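To make the log idea above a bit more concrete, here is a minimal sketch of how server logs could act as a guide to what should be found. The log line format, the field names, and the helper functions are all invented for illustration; real mail server logs look different and vary by product.

```python
import re
from collections import Counter

# Hypothetical log line format, invented for this sketch:
#   2003-05-01T10:00:00 id=<a1@example.gov> from=alice@example.gov to=bob@example.org
LOG_LINE = re.compile(
    r"^(?P<ts>\S+) id=(?P<id>\S+) from=(?P<sender>\S+) to=(?P<rcpt>\S+)$"
)


def parse_log(lines):
    """Return one dict per line that matches the assumed format; skip the rest."""
    records = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            records.append(match.groupdict())
    return records


def missing_message_ids(records, archived_ids):
    """Message IDs the server logged that are absent from the archive."""
    logged = {record["id"] for record in records}
    return sorted(logged - set(archived_ids))


def traffic_by_pair(records):
    """Count logged messages per (sender, recipient) pair: basic traffic analysis."""
    return Counter((record["sender"], record["rcpt"]) for record in records)
```

The first function feeds both uses: `missing_message_ids` tells you which messages a recovery effort should be able to turn up, and `traffic_by_pair` shows who was talking to whom even when the message bodies are gone.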

March 20, 2008

After all my complaining about the xml2rfc bibliography system, I decided to do something about it. I thought for a while about hacking xml2rfc itself, but after spending a while reading the crufty tcl code in xml2rfc, I decided it might be easier to do a secondary bibliography management tool in the style of bibtex.

There are two tools:

  • bibxml2rfc: the bibliography manager. It runs through the XML file, finds everything that looks like a reference, and then builds a pair of bibliography files (one for normative and one for informative) for inclusion into your XML file. It automatically fetches reference entries for RFCs and Internet Drafts. You can use ordinary XML inclusion techniques so you don't need to modify the XML file reference section when the bibliography changes.
  • bibxml2rfc-merge: a tool to make merged files for submission. The Internet Drafts submission tool won't accept a bunch of separate XML files, so bibxml2rfc-merge merges them into one for submission. You don't need this for your ordinary work.
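As a rough illustration of the "finds everything that looks like a reference" step, here is a minimal sketch. It is not the bibxml2rfc code: it assumes citations appear as xml2rfc `<xref target="..."/>` elements, that RFC anchors look like `RFC2119`, and that Internet Draft anchors start with `I-D.`. The function names are made up for the example.

```python
import re
import xml.etree.ElementTree as ET


def find_reference_targets(xml_text):
    """Collect the distinct xref targets in document order."""
    root = ET.fromstring(xml_text)
    targets = []
    for xref in root.iter("xref"):
        target = xref.get("target")
        if target and target not in targets:
            targets.append(target)
    return targets


def classify(target):
    """Guess what kind of thing a target names (assumed naming conventions)."""
    if re.fullmatch(r"RFC\d+", target):
        return "rfc"
    if target.startswith("I-D."):
        return "internet-draft"
    # Anything else, e.g. a cross-reference to a section anchor.
    return "other"
```

A tool along these lines would then fetch an entry for each "rfc" or "internet-draft" target and leave the "other" targets alone.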

The source can be found at: Documentation for bibxml2rfc can be found at


March 19, 2008

Ed Felten reports on inconsistencies in the vote totals reported by Sequoia Advantage voting machines in New Jersey. (Note: these machines are different from the touch screen machines we looked at in the California TTBR, so I don't have any inside information.) The anomaly is that the number of votes for Democratic and Republican candidates doesn't match the number of times that the ballots were activated. If the number of votes were less than the number of ballots, you could explain that as an undervote, but in the results tape Ed shows, the Republican ballot was selected 60 times and there were 61 votes!
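The consistency condition here is simple enough to write down. The following is only a sketch, assuming a single vote-for-one contest per party ballot; the function name is invented and has nothing to do with Sequoia's software.

```python
def check_tally(activations, votes):
    """Compare ballot activations with votes recorded for one party's contest.

    Assumes each activated ballot can contribute at most one vote.
    """
    if votes > activations:
        # More votes than ballots: cannot happen in a correct system.
        return "overcount"
    if votes < activations:
        # Fewer votes than ballots: explainable as voters skipping the contest.
        return "undervote"
    return "consistent"
```

With the figures from the tape, `check_tally(60, 61)` returns `"overcount"`, which is the case that has no innocent explanation.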

I haven't thought much about potential causes (Ed's commenters theorize) but my money is on simple bugs in the system rather than an attack. If you were an attacker and you had managed to take control of the machine, one of the first things you would want to do is make certain that the results were consistent. Moreover, since this is a primary and not a general election, an attacker wouldn't really benefit from moving votes from one party to another. It would be much easier (and less likely to get you caught) to move them from one candidate to another within a party.

Not that this should make you feel any better, since the most basic function of voting machines is to correctly count votes. It should also make you wonder about both Sequoia's testing and the testing done by the certification labs. We already know that the testing is insufficient from a security perspective, but (assuming the problem is in the system) this seems like it should have been caught by the testing/SQA process.

Sequoia's explanation can be found here. Felten says it's inadequate and that he'll explain why tomorrow. Stay tuned.