May 18, 2005
How much would it cost to record every phone call?
Mark A.R. Kleiman says that NSA captures pretty much all voice traffic and then sifts through some of it later:
The only rational explanation I can invent is that the NSA's habit of catching everything that flies, while an open secret, is still officially a secret. And the practice, however legitimate, is almost certainly technically illegal.
The wiretapping laws treat a conversation as having been "intercepted" (and, if it's a conversation between U.S. persons and no Title III warrant has issued, illegally intercepted) when the conversation is recorded, not when the record is transcribed. So if, as widely reported, the NSA records everything but only transcribes the international traffic it's legally entitled to listen in on, it's probably violating the letter of the law every day. I'm told that there is, as a technical matter, no way to intercept only conversations that cross national boundaries. Maybe Title III needs to be amended.
If you're a networking type, the obvious question is how practical this is.
First, we need to estimate the total amount of data involved here. I'm having trouble finding statistics on total wireline minutes (the FCC publishes stats, but they only have minutes for InterLATA calls), so let's start with wireless, for which we can get good statistics:
Mobile Wireless Telephone Subscribers (June 2003): 147.6 million
Average Monthly Wireless Minutes of Use (Dec. 2002): 427
This works out to about 5,000 cell minutes per subscriber per year.
- At a typical data rate of 10 kilobits/second half-duplex, this works out to 4*10^8 bytes per person per year.
- At current hard drive prices of approximately $1/GB, this works out to about $0.32/person-year, or about $50 million/year for all the storage.
- Magnetic tape is more like $0.10/GB, so we're looking at more like $0.03/person-year.
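The storage arithmetic above can be sketched in a few lines of Python. This is only a rough check using the figures quoted in the post; the function names are made up for illustration, and it assumes decimal gigabytes (10^9 bytes). With unrounded inputs it lands a little above the $0.32 per person-year quoted above, but at the same order of magnitude.

```python
# Back-of-the-envelope storage cost, using the figures quoted in the post.
SUBSCRIBERS = 147.6e6        # mobile wireless subscribers (June 2003)
MINUTES_PER_MONTH = 427      # average monthly minutes of use (Dec. 2002)
BITS_PER_SECOND = 10_000     # assumed voice data rate, 10 kilobits/second
DISK_DOLLARS_PER_GB = 1.00   # approximate 2005 hard drive price
TAPE_DOLLARS_PER_GB = 0.10   # approximate 2005 magnetic tape price


def bytes_per_person_year():
    """Bytes of voice data one subscriber generates in a year."""
    seconds_per_year = MINUTES_PER_MONTH * 12 * 60
    return seconds_per_year * BITS_PER_SECOND / 8


def storage_cost_per_person_year(dollars_per_gb):
    """Dollars to store one subscriber's calls for a year (1 GB = 10^9 bytes)."""
    return bytes_per_person_year() / 1e9 * dollars_per_gb


if __name__ == "__main__":
    per_person = storage_cost_per_person_year(DISK_DOLLARS_PER_GB)
    print(f"bytes/person-year: {bytes_per_person_year():.3g}")
    print(f"disk: ${per_person:.2f}/person-year, "
          f"${per_person * SUBSCRIBERS / 1e6:.0f} million/year total")
    print(f"tape: ${storage_cost_per_person_year(TAPE_DOLLARS_PER_GB):.3f}/person-year")
```

Changing BITS_PER_SECOND or the price constants shows how sensitive the total is to each assumption.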
So, the storage cost is extremely practical, but let's ask what the cost of the recording equipment is.
- 5,000 cell minutes/year/person and 150 million subscribers is 7*10^11 total minutes used per year.
- There are 500,000ish minutes in a year, so that's about 1.3 million simultaneous calls on average.
- At 10 kbits/second per call, this gives us about 10^10 bits/second of aggregate traffic.
- Antonelli et al. (2002) describe how to build a capture system that will run at about 100 Mbps. The equipment they use (dual PIII 866s) is fairly slow by modern standards, and I expect you could build a similar system for order $1k today. To capture all wireless traffic, you'd need about 100 such systems for a total cost of around $100,000. Pocket change.
- I'm not counting tape drives here, which run about $4000/unit, but even so we're talking about a million dollars, even with some overhead for peak capacity.
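Here is the same capture-equipment estimate as a small Python sketch. The function names are invented for illustration, and pairing one tape drive with each capture system is my assumption, not something the estimate above specifies. Because it doesn't round intermediate values, it comes out slightly above the "about 100 systems" figure.

```python
import math

SUBSCRIBERS = 150e6                  # rounded subscriber count
MINUTES_PER_PERSON_YEAR = 5_000      # cell minutes per subscriber per year
BITS_PER_SECOND_PER_CALL = 10_000    # assumed voice data rate
CAPTURE_BITS_PER_SECOND = 100e6      # per-system capture rate (about 100 Mbps)
CAPTURE_UNIT_DOLLARS = 1_000         # guessed price of one capture system
TAPE_DRIVE_DOLLARS = 4_000           # price of one tape drive
MINUTES_IN_A_YEAR = 365 * 24 * 60    # 525,600


def average_simultaneous_calls():
    """Total call-minutes per year spread evenly over the minutes in a year."""
    return SUBSCRIBERS * MINUTES_PER_PERSON_YEAR / MINUTES_IN_A_YEAR


def aggregate_bits_per_second():
    """Average aggregate voice traffic across all simultaneous calls."""
    return average_simultaneous_calls() * BITS_PER_SECOND_PER_CALL


def capture_units_needed():
    """Capture systems needed to keep up with the average traffic."""
    return math.ceil(aggregate_bits_per_second() / CAPTURE_BITS_PER_SECOND)


def equipment_dollars(tape_drives_per_unit=1):
    """Capture systems plus tape drives (one drive per system is an assumption)."""
    per_unit = CAPTURE_UNIT_DOLLARS + tape_drives_per_unit * TAPE_DRIVE_DOLLARS
    return capture_units_needed() * per_unit


if __name__ == "__main__":
    print(f"simultaneous calls: {average_simultaneous_calls():,.0f}")
    print(f"aggregate traffic:  {aggregate_bits_per_second():.2e} bits/second")
    print(f"capture systems:    {capture_units_needed()}")
    print(f"equipment cost:     ${equipment_dollars():,}")
```

This sizes for average load only; provisioning for peak hours would push the unit count up, which is the overhead mentioned above.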
Obviously these are back of the envelope estimates:
- We're only counting wireless calls. So, multiply by 2-5.
- We're assuming constant data flow, whereas real phone calls contain a lot of silence.
- We're not counting the equipment to actually capture the traffic. If these are network connections, it's just simple network sniffing equipment, so figure a few thousand per unit. Of course, that means that you have the cooperation of the providers. If you don't and you're capturing the traffic out of the air, figure some additional fixed cost for the actual radio receivers. This might bring your fixed cost up to $10,000-$50,000 per unit. Even so, we're probably talking less than $100 million in fixed costs.
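To see how far these adjustments move the total, here is a rough range calculation in Python. It assumes the number of capture units scales linearly with the wireline multiplier, which is my simplification; the function name is invented for illustration.

```python
BASE_UNITS = 100  # capture systems needed for wireless traffic alone


def fixed_cost_dollars(wireline_multiplier, dollars_per_unit, base_units=BASE_UNITS):
    """Fixed cost if unit count scales linearly with total traffic (an assumption)."""
    return base_units * wireline_multiplier * dollars_per_unit


# Low end: 2x traffic, $10,000 per unit. High end: 5x traffic, $50,000 per unit.
low = fixed_cost_dollars(2, 10_000)
high = fixed_cost_dollars(5, 50_000)

if __name__ == "__main__":
    print(f"fixed cost range: ${low:,} to ${high:,}")
```

Even the high end of this range stays well under $100 million, leaving room for peak-capacity overhead and storage hardware.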
Bottom line, you should be able to tap all voice traffic in the US for order $100 million in fixed costs and maybe another $100 million in recurring equipment costs. The NSA's budget is reportedly around $3.6 billion.
UPDATE: Richard Akers is skeptical that the NSA actually records all voice traffic (see the comments section). I'm not saying they do, since I have no independent information here. I'm just saying that as a pure matter of cost it's fairly doable.
Posted by ekr at May 18, 2005 8:14 PM
Comments
I'm still somewhat skeptical that the NSA really records "everything that flies," at least without any source for that datum.
I actually write software that tracks calls in call centers for the purpose of recording them. ("This call may be recorded for quality assurance purposes.") The biggest problem with recording calls is that there are a number of vendors with equipment out there and you have to work with all of them. They're not at all standardized. To the extent that they claim to follow standards, they don't really. The APIs were all originally designed for other purposes. And, of course, new versions of phone switch hardware and software come out all the time, much of which breaks existing functionality. It's just a giant mess, and the NSA would need to track all of it, with decent uptime, and without being able to get technical support for most of it. ("I'd like to open a support ticket. All of my illegal wiretaps in New York County are recording nothing but silence.")
They can, of course, simply record anything that's not silence. (I'm assuming that there are physical taps on all lines going out of the CO, since a legal wiretap may need to record any given line.) Many companies do that and then just trawl through all recorded calls at around the right time when they need to find a specific call. But they're doing that for a few hundred or maybe thousand call center agents at most. Not the entire United States population.
I'm not saying it's entirely out of the bounds of reason. The CO equipment is pretty much all going to be from major vendors, which does cut down on the total number of different technologies. The NSA wouldn't need much more than some idea of what the two endpoints for the call were, and they could get that from the signalling that accompanies the voice on some types of trunk lines. And $3.6 billion will hire a whole lot of really good coders. But it's still an awesome task, and it needs better documentation than just that it's some sort of common knowledge.
Posted by: Richard at May 18, 2005 8:59 PM
Just to clarify a few things...
Yes, call center traffic monitoring has to deal with handling several constantly changing systems, which makes it a unique challenge. And I mean "unique" in the classical sense of the word. In other words, it's unrelated to the problem of what the NSA has to do to monitor calls. I think Richard is thinking of the problems of tapping into the PBX systems themselves (e.g. using a TAPI interface or similar; and there are products, like Genesys, that make this problem much easier).
If you're doing bulk traffic intercept, and care only about international traffic, that's a much easier problem to solve. The class 4 exchanges in the US are pretty standardized, and they're all going to use ANSI ISUP to set up and tear down calls. They're all going to use G.711 with TDM framing over standardized trunks. You really need only one solution, even if you're doing the equivalent of "sniffing" SS7 traffic (and, yes, the most commonly available STPs -- roughly equivalent to IP routers -- in the network have such a capability built in).
Additionally, there is a standard (ANSI-J-STD-025-B) which defines the interfaces for wiretap content delivery, which makes this whole set of problems even easier still.
The LECs and IXCs are not really in the business of checking up on the various wiretap agencies. I'm sure that, if some legally authorized agency called and said, "our legally mandated wiretaps in New York County aren't working," they'd get support for the problem without having to show up with a stack of warrants to demonstrate that they have a good reason to use the taps at that moment. And there are significant safeguards in place so that it is very difficult to tell who is tapping whom at any given moment. For example, if two agencies are tapping the same target, the system is explicitly designed so that they have no way to learn about each other unless they actually share notes.
I'm also confused by an assertion that "there is, as a technical matter, no way to intercept only conversations that cross national boundaries." Traffic leaving the US does so through a small handful (think on the order of a few dozen) of transgates that perform conversion between ANSI ISUP and an international variant of ISUP (Q.761); they're generally converting the voice from uLaw encoding to ALaw encoding at the same time. Tapping into the network just on either side of these transgates would get you 100% of the international traffic from the US, and 0% of the domestic traffic.
My point is: there are half a dozen ways, at least, that this problem can be approached, and most of them would allow 100% logging of the voice and pen registers for all international traffic while capturing a small fraction or none of the domestic traffic on the network. $100 million is a good upper bound on how much such an effort might cost annually; I suspect a more realistic estimate (assuming the NSA hires bright guys, which I am led to believe that they do) would be a tiny fraction of that amount. So, not only is such a feat achievable -- from a budget perspective, it's probably noise.
Posted by: Adam Roach at May 19, 2005 8:55 AM