Initial impressions on ISP data retention

The DoJ is starting to press ISPs to "keep records on the Web-surfing activities of their customers" [*].
The director of the Federal Bureau of Investigation, Robert S. Mueller III, and Attorney General Alberto R. Gonzales held a meeting in Washington last Friday where they offered a general proposal on record-keeping to a group of senior executives from Internet companies, said Brian Roehrkasse, a spokesman for the department. The meeting included representatives from America Online, Microsoft, Google, Verizon and Comcast.


While initial proposals were vague, executives from companies that attended the meeting said they gathered that the department was interested in records that would allow them to identify which individuals visited certain Web sites and possibly conducted searches using certain terms.

It also wants the Internet companies to retain records about whom their users exchange e-mail with, but not the contents of e-mail messages, the executives said. The executives spoke on the condition that they not be identified because they did not want to offend the Justice Department.

A few initial thoughts:

  • It's not purely a matter of data retention: it's not clear that the providers even have the records in question. Remember that the basic service that an ISP gives you is generic packet transit, which doesn't really require collecting any kind of records at all (your average router isn't exactly loaded with hard disk space). So, actually collecting this data could entail pretty substantial new equipment costs for the ISP.
  • Even if you did start collecting this data, you'd mostly have the IP addresses for the client and server, which might or might not let you identify which sites were being visited (see virtual host). The ISP could of course put a sensor (e.g., a Narus) on the wire to collect this data, but now you're talking real equipment costs.
  • If you're running some kind of application layer gateway like a Web proxy or an SMTP server (which is a very common way for end-users to send e-mail through their ISP) then you'd be more likely to collect this kind of information in server logs, so it's primarily an issue of storage cost then. Obviously, this goes double for search engines like Google which already collect your search terms as a matter of normal operation.
  • Keeping all this data around seems like it would entail a pretty substantial privacy risk. Do we expect ISPs to encrypt it? If so, who's going to hold the keys? (See Antonelli et al. for a paper on how to do this kind of encryption.)
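
To make the virtual host point concrete, here's a rough Python sketch of what each kind of record would let you infer. Everything in it is made up for illustration: the record shapes (`FlowRecord`, `ProxyLogEntry`), the `VIRTUAL_HOSTS` table, and the addresses and hostnames (taken from the reserved documentation ranges). Real flow-export and proxy-log formats differ.

```python
from collections import namedtuple

# Hypothetical record shapes; real flow-export and proxy-log formats differ.
FlowRecord = namedtuple("FlowRecord", "client_ip server_ip server_port")
ProxyLogEntry = namedtuple("ProxyLogEntry", "client_ip server_ip host path")

# Made-up virtual-host table: two unrelated sites share one server address.
VIRTUAL_HOSTS = {
    "198.51.100.7": ["news.example.com", "forum.example.org"],
    "198.51.100.9": ["shop.example.net"],
}


def candidate_sites(flow):
    """Sites consistent with a packet-level record (IP addresses only)."""
    return sorted(VIRTUAL_HOSTS.get(flow.server_ip, []))


def site_from_proxy_log(entry):
    """A Web proxy sees the HTTP Host header, so the site is unambiguous."""
    return entry.host


flow = FlowRecord("192.0.2.10", "198.51.100.7", 80)
entry = ProxyLogEntry("192.0.2.10", "198.51.100.7", "forum.example.org", "/thread/42")

print(candidate_sites(flow))       # two candidates: the IP alone is ambiguous
print(site_from_proxy_log(entry))  # one site: the application layer names it
```

The flow record narrows things down only to the set of sites hosted at that address, while the proxy entry names the site directly, which is why the application layer gateway case looks more like a storage problem than a collection problem.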

It would be nice to have some more details about what the feds are really going to insist on. So far, I've just seen this kind of vague summary.


Many ISPs nowadays have flow logs, to track DoS attacks and things like that. But ensuring the availability and integrity of that data might raise the bar quite a bit.

The real question, especially from a privacy point of view, is not what data you store, but what queries you support. There's a big difference between "show me what A did" and "show me who did X and Y, but not Z". The latter allows for very broad surveillance without naming any suspects before you begin. Getting a suspect's communication records is not a big deal and often possible today.
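
A minimal Python sketch of the two query shapes, assuming the retained data boils down to (subscriber, site) pairs. The `RECORDS` list and the function names are hypothetical, not anything an ISP or the DoJ has described.

```python
from collections import defaultdict

# Made-up retained records: (subscriber, site visited).
RECORDS = [
    ("alice", "x.example"), ("alice", "y.example"),
    ("bob", "x.example"), ("bob", "y.example"), ("bob", "z.example"),
    ("carol", "x.example"),
]


def activity_of(subscriber, records):
    """Targeted query: 'show me what A did'. The suspect is named up front."""
    return sorted(site for who, site in records if who == subscriber)


def who_matches(required, excluded, records):
    """Pattern query: 'who did X and Y, but not Z'. Scans every subscriber."""
    visited = defaultdict(set)
    for who, site in records:
        visited[who].add(site)
    return sorted(
        who for who, sites in visited.items()
        if required <= sites and not (excluded & sites)
    )


print(activity_of("alice", RECORDS))
print(who_matches({"x.example", "y.example"}, {"z.example"}, RECORDS))
```

The first function only ever touches one subscriber's rows. The second has to build a profile of everybody before it can answer, which is the broad-surveillance case.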
