October 2007 Archives


October 30, 2007

Terence Spies pointed me to this item from the AACS licensing association:
AACS LA announces that it has started periodic "proactive renewals", which, primarily for software player applications, provide for periodic renewal and refreshing of AACS encryption keys by licensed manufacturers and eventual expiration of old keys by AACS LA. This helps maintain the AACS technology as a vital means of distributing valuable high definition content to consumers. Consumers should expect that updates/patches will be periodically offered by their software manufacturer in order to ensure that the players continue to function as intended. The upgrading of software is a common practice in the software industry. Pursuant to the AACS technology licenses, manufacturers of software players are required to perform such updates in a consumer-friendly fashion.
In other words, if you don't update, you won't be able to play new disks. That's not exactly a customer-friendly value proposition.

As I said earlier, this seems like an arms race that's going to be pretty hard for the manufacturers to win. It's really inconvenient for the customers to have to upgrade their players, and it's not like each new release is a simple matter of changing the key and respinning the distribution. If you want to stop the crackers from immediately extracting the key, you need to re-obfuscate the binaries so that they have to attack the binary again. The combination is not cheap for the manufacturers.

In other news, Antigua-based Slysoft claims to have cracked Blu-ray's BD+ copy protection.


October 29, 2007

When faced with a traffic blocking (or, as Comcast calls it, delaying) scheme like Comcast is using for BitTorrent, it's natural to ask how to evade the blocking. At a high level, there are two possible strategies:
  • Make your traffic resilient to blocking.
  • Hide your traffic so that it's not blocked by the ISP.

Resilient traffic
The blocking strategy Comcast is using is to forge TCP RST packets to kill the connection. The advantage of this strategy is that it's cheap; you don't need to touch any of the routers at all. Whatever packet inspection box you're using just sources a single packet to each sender, and the ordinary TCP mechanisms shut down the connection. By contrast, if you actually wanted to stop the packets from flowing, you'd need to insert transient filters into some router, which isn't necessarily that convenient, especially if we're talking about a high-speed core router.

The good news from the perspective of the communicating parties is that this leaves open a window to evade the blocking. Richard Clayton observes in the context of the Great Firewall of China that if the peer implementations simply ignore TCP RSTs then they can't be blocked via this mechanism. Unfortunately, this interferes with TCP's normal operation, since RSTs do something useful. Worse yet, there are lots of other ways to interfere with a TCP connection. For instance, the attacker could forge FIN packets, simulating a normal close. Alternately, he could send fake data segments, breaking the protocol parsing at the next level up. The bottom line here is that TCP was not designed to be DoS-resistant, especially from an on-path attacker. That's not straightforward to fix, especially by this kind of crude modification to the TCP stack.

That doesn't mean that it's not possible to make the traffic resilient to blocking. The standard approach here is to use cryptographic message integrity/data origin authentication to prevent the attacker (Comcast) from inserting their own traffic into the connection. Unfortunately, this can't be done at the application layer above TCP (e.g., SSL/TLS): the attacker is attacking the TCP layer, and security protocols like SSL/TLS depend on the lower layer protocol functioning correctly. In fact, SSL/TLS makes the problem worse, since any interference with the ciphertext causes integrity failures and connection failure. (This is the same reason why SSL/TLS can't be used to fix the BGP TCP connection reset issue.)

In order to be resistant to this kind of injection, you need to have message integrity at a layer below TCP. The standard solution here is to use IPsec, but you could also use a datagram transport protocol layered over DTLS. The important thing is that the traffic has to be authenticated below the connection management state machine.
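
To make the idea concrete, here is a minimal sketch of per-datagram authentication with a shared key. It is not IPsec or DTLS; the names `seal`, `open_datagram`, and `TAG_LEN` are made up for illustration, and a real protocol would also need key exchange and replay protection (e.g., sequence numbers). The point is only that a control message injected by someone without the key gets dropped before it ever reaches the connection state machine.

```python
import hashlib
import hmac
import os
from typing import Optional

TAG_LEN = 32  # size of an HMAC-SHA256 tag in bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so the receiver can check the datagram's origin."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def open_datagram(key: bytes, datagram: bytes) -> Optional[bytes]:
    """Return the payload if the tag verifies; otherwise None (drop the datagram)."""
    if len(datagram) < TAG_LEN:
        return None
    payload, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


key = os.urandom(32)               # known only to the two peers (assumption)
genuine = seal(key, b"RST")        # a reset the peer really sent
forged = b"RST" + bytes(TAG_LEN)   # a reset injected by someone without the key

print(open_datagram(key, genuine))  # payload is accepted
print(open_datagram(key, forged))   # forged datagram is rejected
```

Only datagrams that pass `open_datagram` would be handed to the connection management logic, which is what "authenticated below the state machine" amounts to.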

Of course, all of these schemes can be blocked if you're just willing to inject filters into the router. The good news from the attacker's perspective is that these connections are long-lived and so you don't have to inject the filters that quickly. You also don't need to get every packet—TCP uses packet loss as a congestion signal and backs off, so if you can achieve an even modest packet loss rate, you can have a dramatic impact on the performance of the connection.
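
To get a rough feel for how much a modest loss rate hurts, one commonly cited steady-state approximation (due to Mathis et al.) puts TCP throughput at about (MSS/RTT) * (C/sqrt(p)), with C around sqrt(3/2). The sketch below just evaluates that formula; the MSS, RTT, and loss rates are illustrative assumptions, and real TCP stacks will deviate from this simple model.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate steady-state TCP throughput in bits per second."""
    c = math.sqrt(3.0 / 2.0)  # constant from the simple periodic-loss model
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Illustrative numbers only: 1460-byte segments, 100 ms round-trip time.
for p in (0.0001, 0.01, 0.05):
    mbps = mathis_throughput_bps(1460, 0.1, p) / 1e6
    print(f"loss rate {p:.2%}: about {mbps:.2f} Mbit/s")
```

Because throughput scales with 1/sqrt(p), raising the loss rate by a factor of 100 cuts the achievable rate by a factor of 10 under this model.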

Hiding Traffic
The other major strategy is to stop whatever deep packet inspection (DPI) engine the ISP is using to detect filesharing traffic. The idea here is that the ISP only wants to block some of your traffic, since they want you to be able to use your Internet connection for other applications. So, you just need to stop selective blocking.

The natural way to do this is to use encryption. Even an encryption protocol like SSL/TLS that is above TCP does a reasonable job here, since it hides the application traffic. Interestingly, BitTorrent encryption doesn't seem to help here. I don't really know any details of BitTorrent's encryption, but presumably the issue is that there are unencrypted protocol elements that are specific to BitTorrent and so the DPI box can still do some traffic analysis.

Even with a generic protocol like TLS, the attacker can still do a fair amount of traffic analysis based on timing, packet sizes, etc. You also get the TCP port. BitTorrent doesn't use a single fixed port for data connections, so the attacker can't just block that port. However, the port range is somewhat predictable and doesn't overlap with ports for other popular protocols, so if you see a lot of data flowing on one of the potential BitTorrent ports, it's a good guess that it's BitTorrent. Note that if you use IPsec, then you can hide the ports from the attacker, but the packet size, timing information, etc., is still available.
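
As a sketch of how crude the port-based guess can be, the snippet below flags flows that move a lot of bytes on a "suspect" port. Everything here is an assumption for illustration: the port range (6881-6889 is the range older BitTorrent clients commonly defaulted to), the byte threshold, and the packet tuple format. It says nothing about how any real DPI product works.

```python
from collections import defaultdict

SUSPECT_PORTS = range(6881, 6890)  # assumption: historical BitTorrent default ports
BYTE_THRESHOLD = 50_000_000        # assumption: arbitrary "lots of data" cutoff


def suspicious_flows(packets):
    """packets: iterable of (src_ip, dst_ip, dst_port, length_in_bytes).

    Returns the sorted list of (src_ip, dst_ip, dst_port) flows that use a
    suspect port and have carried at least BYTE_THRESHOLD bytes.
    """
    totals = defaultdict(int)
    for src, dst, port, length in packets:
        totals[(src, dst, port)] += length
    return sorted(
        flow
        for flow, total in totals.items()
        if flow[2] in SUSPECT_PORTS and total >= BYTE_THRESHOLD
    )


sample = [
    ("10.0.0.1", "10.0.0.2", 6881, 60_000_000),  # big flow on a suspect port
    ("10.0.0.1", "10.0.0.3", 443, 90_000_000),   # big flow, but on the HTTPS port
    ("10.0.0.4", "10.0.0.5", 6885, 1_000),       # suspect port, but tiny
]
print(suspicious_flows(sample))
```

Note that the classifier never looks at payload, so encrypting above TCP doesn't affect it; hiding the ports (as IPsec does) removes this particular signal but leaves sizes and timing.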

The counter-countermeasure to this kind of traffic analysis is to send deliberate "cover traffic". When you want to send real traffic you just substitute it for some of the cover traffic. Of course, to do this well you need to chew up a lot of bandwidth on the cover traffic, which is unfriendly and hard on the rest of your performance.
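
A minimal sketch of the substitution idea: the sender emits one fixed-size cell per tick no matter what, and real data simply takes the place of padding when there is some to send. The names `CELL_SIZE`, `pad`, and `schedule` are hypothetical, and the sketch assumes the cells are encrypted before they hit the wire (otherwise zero-padded data and random filler would be trivially distinguishable).

```python
import os
from collections import deque

CELL_SIZE = 512  # hypothetical fixed on-the-wire cell size in bytes


def pad(chunk: bytes) -> bytes:
    """Pad a chunk of real data up to exactly CELL_SIZE bytes."""
    if len(chunk) > CELL_SIZE:
        raise ValueError("chunk larger than a cell; split it first")
    return chunk + bytes(CELL_SIZE - len(chunk))


def schedule(real_chunks, ticks):
    """Emit exactly one cell per tick: real data if queued, cover traffic otherwise."""
    queue = deque(real_chunks)
    cells = []
    for _ in range(ticks):
        if queue:
            cells.append(pad(queue.popleft()))
        else:
            cells.append(os.urandom(CELL_SIZE))  # pure cover traffic
    return cells


busy = schedule([b"piece-1", b"piece-2", b"piece-3"], ticks=5)
idle = schedule([], ticks=5)
# An observer sees the same number of cells of the same size either way.
print([len(c) for c in busy] == [len(c) for c in idle])
```

The cost is visible in the sketch too: the idle sender still transmits five full cells, which is exactly the bandwidth overhead described above.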

The bottom line here is that an attacker who controls your Internet connection can always guarantee that you can't use it. The best you can do is make it hard for the attacker to selectively block some of your traffic and leave the rest of it alone.


October 26, 2007

Matthew Yglesias approvingly quotes David Brooks:
David Brooks really nails an important part of the internet experience:
Until that moment, I had thought that the magic of the information age was that it allowed us to know more, but then I realized the magic of the information age is that it allows us to know less. It provides us with external cognitive servants -- silicon memory systems, collaborative online filters, consumer preference algorithms and networked knowledge. We can burden these servants and liberate ourselves.
Right. I had a weird experience on Monday of playing on a pub trivia trivia team after not having done so for several years. Every time a question got asked that I didn't know the answer to, I felt this overwhelming urge to reach for my iPhone, a device I didn't have back in my earlier quizzing days. The idea of being limited to the information that was actually in my head was very distressing.

So, I started to write about how this is all really obvious and old news (I've seen people referring to their PDAs as external brains since long before they were even networked), but it's actually 6000-year-old news. The first major technology that let you offload substantial amounts of work previously done by your brain was writing. The second was mathematics.

Computer scientists like to talk about a memory hierarchy. A computer can have a lot of different kinds of storage: registers, onboard cache (on the chip), offboard cache (on the motherboard), main memory, hard drive (typically a cache plus the disk), online tape, archival tape, etc. The general principle is that the further away you get from the CPU the larger the capacity is, but the longer the access time. So, performance is to a great extent limited by your ability to keep important data at hand. If the working set of your program is too large to be contained in the close/fast/small levels of the memory hierarchy, the CPU tends to idle as data needs to be moved in and out of memory levels.
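
Here is a back-of-the-envelope sketch of why the working set matters so much. It computes an expected access time for a hierarchy where each level is consulted only if all the faster levels missed. The function name and all the hit rates and latencies are made-up illustrative numbers, not measurements of any real machine.

```python
def average_access_time(levels):
    """levels: list of (hit_rate, latency_ns) pairs, fastest level first.

    Each level's latency is paid by every access that reaches it; the last
    level should have hit_rate 1.0 so that every access is eventually served.
    """
    expected_ns = 0.0
    reach = 1.0  # fraction of accesses that get this far down the hierarchy
    for hit_rate, latency_ns in levels:
        expected_ns += reach * latency_ns
        reach *= 1.0 - hit_rate
    return expected_ns


# Illustrative only: cache ~1 ns, main memory ~100 ns, disk ~10 ms.
fits_in_cache = [(0.99, 1.0), (1.0, 100.0)]
spills_to_disk = [(0.90, 1.0), (0.99, 100.0), (1.0, 10_000_000.0)]

print(average_access_time(fits_in_cache))   # dominated by the cache
print(average_access_time(spills_to_disk))  # dominated by the rare disk hits
```

Even though only a tiny fraction of accesses reach the slowest level in the second case, that level's latency is so large that it dominates the average.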

Obviously, brains aren't constructed in this way, but you do have short and long term memory, and written words provide a form of ultra-long-term memory, as well as (of course) a way to communicate sections of your memory to others. One way to think about this second feature is that it's the ability to have things in your "memory" that you never actually learned. I.e., they never passed through your biological memory, you just look them up when you need them.

The disadvantage of this form of ultra-long-term memory, of course, is that it's unbelievably slow. I keep paper notebooks, but actually finding things in them can be quite difficult ("you can't grep dead trees"). An electronic memory is obviously much easier to find stuff in. The "remember stuff you never knew" feature is even worse. First, you need to actually find the book you're looking for, then you need to actually find the section of the book, then you need to actually digest the information. Only then is it in short term and you actually know it. Compare this to long-term memory, where you just need a reminder and it all starts to swap back into short-term (though this can take quite some time.)

So, the basic problem with paper memories is that the gap between them and the next step up in the memory hierarchy is just too large. One way to think of electronic memories is that they close that gap. At one level, that's great, but at another it still pretty much sucks. It's massively slower to find (let alone assimilate) things from the Internet than it is to remember them (assuming I actually can). What I really want is to just have the information piped directly into my brain. We're a ways away from that, but if we ever can, it will seem like every bit the miracle that the Internet does now, and Google will look just as clunky as books by comparison.

The above is all about memory, but there are actually at least four mental tasks you can outsource: memory, processing, input, and output. Our current technology lets us outsource all of these to some extent, but really it's quite far from what you'd like.

Required reading:
This sort of enhancement is one of the themes of Vernor Vinge's writing. "Bookworm, Run!" is the first place I saw this kind of thing. "The Peace War" is very focused on outsourced processing. "Rainbows End" is a more complete vision of both the potential of this kind of technology and of the threats that come along with it.
P.J. Denning's "The Working Set Model for Program Behavior".


October 24, 2007

From James Surowiecki's piece in the New Yorker on supply-side economics (þ Matthew Yglesias):
But, while Republicans still talk a good game about the need for spending discipline, in practice it matters far less to them than tax cutting. After all, if tax cuts pay for themselves, then there's not much reason to worry about restraining government spending--we can afford it all. In fact, if government spending grows too big, you can cut taxes again to pay for it.



October 23, 2007

Abdallah Higazy, an Egyptian living in the US, was arrested shortly after 9/11 because an air-to-air/air-to-ground radio was allegedly found in his room. He was suspected of somehow being involved in 9/11 or similar attacks and the FBI interrogated him. Higazy denied possession of the radio and the FBI (understandably) didn't take his word for it. Higazy asked to take a polygraph and, well, I'll let the court tell it:
Higazy alleges that during the polygraph, Templeton told him that he should cooperate, and explained that if Higazy did not cooperate, the FBI would make his brother "live in scrutiny" and would "make sure that Egyptian security gives [his] family hell." Templeton later admitted that he knew how the Egyptian security forces operated: "that they had a security service, that their laws are different than ours, that they are probably allowed to do things in that country where they don't advise people of their rights, they don't — yeah, probably about torture, sure."

Higazy later said, "I knew that I couldn't prove my innocence, and I knew that my family was in danger." He explained that "[t]he only thing that went through my head was oh, my God, I am screwed and my family's in danger. If I say this device is mine, I'm screwed and my family is going to be safe. If I say this device is not mine, I'm screwed and my family's in danger. And Agent Templeton made it quite clear that cooperate had to mean saying something else other than this device is not mine."

Higazy explained why he feared for his family:

The Egyptian government has very little tolerance for anybody who is --they're suspicious of being a terrorist. To give you an idea, Saddam's security force--as they later on were called his henchmen--a lot of them learned their methods and techniques in Egypt; torture, rape, some stuff would be even too sick to . . . . My father is 67. My mother is 61. I have a brother who developed arthritis at 19. He still has it today. When the word 'torture' comes at least for my brother, I mean, all they have to do is really just press on one of these knuckles. I couldn't imagine them doing anything to my sister.
And Higazy added:
[L]et's just say a lot of people in Egypt would stay away from a family that they know or they believe or even rumored to have anything to do with terrorists and by the same token, some people who actually could be --might try to get to them and somebody might actually make a connection. I wasn't going to risk that. I wasn't going to risk that, so I thought to myself what could I say that he would believe. What could I say that's convincing? And I said okay.
Transcription from Psychsound.

OK, so Higazy confesses, or, rather, the FBI coerces a confession out of him. I should mention at this point that his confession is a bit fishy:

Higazy then gave Templeton a series of explanations as to how he obtained the radio. First, he admitted that he stole the radio from J&R, an electronics store. Then he recanted this story, and explained that he found it near J&R. Higazy next denied ever seeing or possessing the radio. Templeton allegedly banged on the table and screamed at Higazy: "You lied to me again! This is what? How many lies?" Higazy then lied again, this time telling Templeton that he found the radio on the other side of the Brooklyn Bridge. Higazy recalled that Templeton "turned so red I thought he was going to hit me." Templeton accused Higazy of being a liar, and said that he would "tell Agent Sullivan in my expert opinion you are a terrorist." Finally, Higazy told Templeton that he had stolen the radio from the Egyptian military and had used it to eavesdrop on telephone conversations.

Now, this is inherently kind of fishy: air traffic control is analog VHF radio (in the 100-140 MHz range). This doesn't correspond to any telephony frequency of which I'm aware: cell phones are (1) at much higher frequencies, 800 MHz, 1900 MHz, etc. (2) almost all digital at this point. Landline cordless phones are also generally at higher frequencies. The frequencies that telcos use for microwave backhaul are all much higher as well. Some, though not all, of them are also digital. [corrected after more research -- EKR]. This isn't to say, of course, that one couldn't make a radio that would receive all these frequencies, but it's not something that one would expect to find in a typical air-ground transceiver, like say this one, which works on some fixed set of analog frequency bands. You'd be talking about a more generic scanning tool. So, the claim that he used this radio to listen in on telephone calls seems pretty hard to believe. And, of course, if you do believe that, it's a lot less of a national security issue than someone communicating with hijackers. Anyway, the FBI arrested Higazy.

At this point, it won't surprise you to discover that Higazy appears to have been totally freaking innocent, and luckily for him, that was discovered:

Three days later, on January 14, 2002, an airline pilot, who had been staying on the 50th floor of the Millenium Hotel returned to the hotel to reclaim his property. After inspecting his items, the pilot informed the hotel staff that his transceiver was missing. Millenium immediately contacted the FBI, which then verified that what was thought to be Higazy's transceiver was in fact the pilot's and that the pilot had not had any interaction with Higazy. The FBI re-interviewed Ferry, who revised his original account, this time explaining that the radio was found on a table in Higazy's room and not in the safe. The government withdrew its complaint against Higazy, who was released on January 16, 2002, after thirty-four days in custody. In a letter to Judge Maas, the government conceded:
The owner of the aviation radio had no interaction with Mr. Higazy. It is still unclear, therefore, how the radio was transferred from the room on the 50th Floor to Mr. Higazy's room on the 51st floor. Employees of the hotel have indicated that, although the hotel has been closed since September 11th, a number of people entered the room in which Mr. Higazy had been staying at different times between September 11th and the day on which the radio was found.

Higazy sued Templeton. The District Court granted summary judgment to Templeton on the grounds of qualified immunity. The 2nd Circuit reversed. Here's where things get really interesting: the 2nd Circuit's original opinion contained the above quote about how Templeton, uh, convinced Higazy to confess. Shortly thereafter, the court took down the opinion. How Appealing was already hosting a copy and the clerk of the court actually called its author to ask him to take it down; he refused. Subsequently, the court posted a new opinion replacing the description with:

This opinion has been redacted because portions of the record are under seal. For the purposes of the summary judgment motion, Templeton did not contest that Higazy's statements were coerced.

I'm not a lawyer, but I must admit to being a little puzzled as to why this is an appropriate matter to seal. If Templeton hadn't worked for the FBI and threatened a confession out of someone would that be sealable? If not, doesn't the public have a pretty significant interest in knowing what their law enforcement officials do? Whatever the reason, once you've made the mistake of posting this to a web site somewhere, trying to take it back just makes you look stupid.


October 22, 2007

It's now been pretty widely reported that Comcast is blocking BitTorrent (as well as other apps such as Gnutella and allegedly Lotus Notes) traffic. (Good summary by Ars Technica here and here).

The technical issue here is pretty straightforward; Comcast seems to be forging TCP RST (Reset) segments from one side of the connection to the other, causing the receiving TCP implementation to terminate its side of the connection. The evidence here is that people have taken packet dumps on both sides of the connection and neither peer is generating the RSTs, so it's clearly someone in the middle, and the pattern of which subscribers are affected looks like it implicates Comcast. Note: I'm going purely by others' reports. I have Comcast myself, but I haven't tested this.
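
The two-sided comparison can be sketched as a simple set difference: any RST that one peer received but the other peer's capture shows was never sent must have come from somewhere in the middle. The `(seq, flags)` tuple format and the function name below are hypothetical simplifications; a real analysis would parse actual capture files and would also have to account for retransmissions and capture loss.

```python
def injected_resets(sent_by_a, received_by_b):
    """Return RST segments that B received but that A's own capture never shows A sending.

    Each segment is a (sequence_number, flags) tuple, where flags is a string
    such as "A" (ACK), "PA" (PSH+ACK), or "R" (RST).
    """
    sent = set(sent_by_a)
    return [seg for seg in received_by_b if "R" in seg[1] and seg not in sent]


# Toy captures: A sends two data segments; B additionally sees a reset.
capture_at_a = [(1000, "PA"), (2460, "PA")]
capture_at_b = [(1000, "PA"), (2460, "PA"), (3920, "R")]

print(injected_resets(capture_at_a, capture_at_b))
```

If the same check run in the other direction also turns up resets that B never sent, then neither endpoint is the source, which is the pattern people reported.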

More interesting is the pattern of what is being blocked. According to TorrentFreak, Comcast is only blocking people seeding files:

Unfortunately, these more aggressive throttling methods can't be circumvented by simply enabling encryption in your BitTorrent client. It is reported that Comcast is using an application from Sandvine to throttle BitTorrent traffic. Sandvine breaks every (seed) connection with new peers after a few seconds if it's not a Comcast user. This makes it virtually impossible to seed a file, especially in small swarms without any Comcast users. Some users report that they can still connect to a few peers, but most of the Comcast customers see a significant drop in their upload speed.

The throttling works like this: A few seconds after you connect to someone in the swarm the Sandvine application sends a peer reset message (RST flag) and the upload immediately stops. Most vulnerable are users in a relatively small swarm where you only have a couple of peers you can upload the file to. Only seeding seems to be prevented, most users are able to upload to others while the download is still going, but once the download is finished, the upload speed drops to 0. Some users also report a significant drop in their download speeds, but this seems to be less widespread. Worse on private trackers, likely that this is because of the smaller swarm size

Assuming this is correct, Comcast is targeting files which Comcast users are serving to non-Comcast users. This mostly doesn't degrade your perceived performance if you're a Comcast user downloading content, but if you're (1) a non-Comcast customer trying to download content from a Comcast customer or (2) actually trying to push something into the P2P network, then this is going to seriously impact your experience. Since most customers are probably in the downloader category, this is actually a pretty attractive way to reduce network traffic without annoying too many of your users. By contrast, if Comcast just blocked all BitTorrent, then everyone trying to download the next episode of Lost would be pretty unhappy, and that would most likely be intolerable to a sizable enough percentage of customers that Comcast couldn't just stonewall.


October 17, 2007

I'm looking for a Web-based traffic school for a ticket received in Palo Alto. Have any readers investigated the options and want to share their experiences?

UPDATE: Apparently the only online traffic school you can use in Santa Clara County is DriversEd, which annoyingly requires you to take an in-person test at the end of the class. Has anyone done this? Trying to figure out if it's enough of a pain to make just taking the points attractive.


October 16, 2007

Check out this fairly impressive video of the Huber brothers setting the aided speed record (though with a lot of free and unprotected climbing) of 2:45 for the Nose of El Capitan on October 8, 2007 (þ Eu-Jin Goh).

More video here.


October 15, 2007

A friend sent me this rant from an unsatisfied VoIP user:
I also have had problems with instability of the soft client as currently configured. My soft client just stops working or crashes entirely and won't close. So I end up killing it in Task Manager. Sometimes I can relaunch it. Usually I have to reboot.

That's not to say that you would have the same experience. While the system is not beta, my deployment is still part of a pilot, the purpose of which is to shake out such things. In my case the problem could be the soft client. Or something to do with Windows. Or a conflict with another application.

My favorite crash appears in the image below. This is my phone on crack. The outline of the soft client appears, but with a Microsoft Word document inside it. That happened while I was on a call. How do you hang up a call when you can't see the controls? Fortunately, my head set has a button on it to hang up without relying on the soft client. But I had to reboot the machine to get the soft client - and my phone service - back running again. That's a pain in the neck.

In case you care, it's the Siemens client he's using.


October 14, 2007

I just noticed that Haile Gebrselassie set a new marathon world record at the Berlin Marathon on September 30th. The new record is 2:04:26, 29 seconds better than the previous best, set by Paul Tergat in 2003.

Re-watched WarGames last night and noticed something funny. I don't think it's a spoiler to note that the motivating factor for putting the computer in charge of missile launch is that NORAD runs a simulation and determines that 22% of missile commanders won't turn the key. A few notes about this. First, a 78% launch success rate is pretty good. Given the amount of missile overcapacity we had in the 80s (and still have, I suppose), it doesn't seem to me that this presents much of a problem.

Also, the point of deterrence is to make the enemy believe you'll destroy them if they attack. Once they have actually launched their missiles and they can't be recalled, launching your own missiles is just revenge, which is pretty questionable behavior if you're a consequentialist. Of course, maintaining a posture of deterrence requires a credible commitment to a strategic posture of retaliation (recall Herman Kahn's advice about how to play chicken), or rather having your opponent think you have such a commitment. If that commitment is a bit rickety, there are two alternatives: shore it up or just don't let anyone find out.


October 13, 2007

Coming home from dinner tonight, Mrs. Guesswork noticed that a passing bus had "CALL 911" instead of the ordinary route number on the external LED display. We followed instructions and called 911, who said that others had called it in as well.
  • I wonder if buses have some "I'm being hijacked" button accessible to the driver.
  • I wonder what's going on with this one (the 911 operator didn't say).

If anyone in Palo Alto notices this on the news and sees what's up, can you post something in the comments?

The Swiss are somehow using quantum crypto to secure e-voting.
Developed by id Quantique in collaboration with the Australian company Senetas, the Cerberis quantum cryptography system will be used to protect election data relayed over a fiber optic connection. Unlike conventional Internet cryptography protocols, which use public key infrastructure, quantum cryptography relies on the principles of quantum uncertainty and generally involves encoding information into photons in a manner that will be noticeably and irreparably disrupted by any form of interception or monitoring. The cryptographic technique is still considered radically experimental, and this is one of the first practical applications of the technique.

Under ideal circumstances, quantum cryptography can ensure that communications between two parties have not been overheard. In the real world, however, quantum cryptography is subject to a number of different attacks. At present, any particle system is probably immune to such attacks because of the technical knowledge required to carry one out.

"We would like to provide optimal security conditions for the work of counting the ballots," said Geneva state chancellor Robert Hensler in a statement. "In this context, the value added by quantum cryptography concerns not so much protection from outside attempts to interfere as the ability to verify that the data have not been corrupted in transit between entry and storage."

I don't really understand what threat this is intended to counter. You've got some set of precincts (or whatever they're called in Switzerland) where the voting actually occurs. Those precincts are equipped with optical scanners, DREs or whatever, and they collect the votes. The votes (or maybe just counts) are sent to election central, where they're aggregated and the winners are determined. Based on this somewhat confusing article, they're using quantum crypto to secure that transmission over dedicated optical lines.

This seems both unnecessary and unwise. It's unnecessary because you don't really need to use a network to move the data. Just write it on CD-ROMs and drive or mail it to election central. The only reason to move it over a network is to make it a bit faster, something which seems sort of irrelevant if the election is being held within a single city. Even if you for some reason think that shipping stuff is too slow, you can always send the data over the network and then follow up with physical media as a double check. Note that the security issue here isn't primarily confidentiality: this data is sensitive but not that confidential, especially if you're just shipping tallies around. You just need integrity, and you can double-check against the physical copies to match the preliminary results to the final results.
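
The double check itself is about as simple as it sounds. Here is a minimal sketch, assuming each precinct's tallies can be represented as a dictionary of counts; the function names and the JSON canonicalization are choices made for this example, not anything the Geneva system is known to do. The preliminary result received over the network is compared against the copy read back from the physical media.

```python
import hashlib
import json


def tally_digest(tallies: dict) -> str:
    """Hash a canonical encoding of the tallies so key order doesn't matter."""
    canonical = json.dumps(tallies, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def copies_match(network_copy: dict, physical_copy: dict) -> bool:
    """True if the preliminary (network) and final (physical media) tallies agree."""
    return tally_digest(network_copy) == tally_digest(physical_copy)


preliminary = {"candidate_a": 1204, "candidate_b": 987}
from_disc = {"candidate_b": 987, "candidate_a": 1204}
altered = {"candidate_a": 1204, "candidate_b": 988}

print(copies_match(preliminary, from_disc))  # same counts, different key order
print(copies_match(preliminary, altered))    # one count changed in transit
```

A mismatch only tells you that the two copies differ, not which one is right, so the physical copy (with its chain of custody) is the one you'd treat as authoritative.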

As for unwise, it's a really bad idea to have the computers at election central connected to the network, since that's a potential avenue for intrusion. Even if you use cryptographic (quantum or otherwise) access control to prevent any communication from the outside world, you're deliberately letting precinct devices connect to election central, which allows someone who has compromised a precinct device to potentially escalate privileges up to a compromise of election central. Since those devices generally aren't secured that carefully, that's not a really good design choice, even if we ignore the usual reasons why quantum crypto isn't that convenient.


October 11, 2007

Like Matthew Yglesias, I'm not particularly bothered by Ann Coulter expressing the opinion that the world would be better if Jews (and presumably everyone else) converted to Christianity:
COULTER: The head of Iran is not a Christian.

DEUTSCH: No, but in fact, "Let's wipe Israel" --

COULTER: I don't know if you've been paying attention.

DEUTSCH: "Let's wipe Israel off the earth." I mean, what, no Jews?

COULTER: No, we think -- we just want Jews to be perfected, as they say.

DEUTSCH: Wow, you didn't really say that, did you?

COULTER: Yes. That is what Christianity is. We believe the Old Testament, but ours is more like Federal Express. You have to obey laws. We know we're all sinners --


DEUTSCH: Welcome back to The Big Idea. During the break, Ann said she wanted to explain her last comment. So I'm going to give her a chance. So you don't think that was offensive?

COULTER: No. I'm sorry. It is not intended to be. I don't think you should take it that way, but that is what Christians consider themselves: perfected Jews. We believe the Old Testament. As you know from the Old Testament, God was constantly getting fed up with humans for not being able to, you know, live up to all the laws. What Christians believe -- this is just a statement of what the New Testament is -- is that that's why Christ came and died for our sins. Christians believe the Old Testament. You don't believe our testament.

I realize it's not considered polite to say this sort of thing in public, but let's recap the argument:

  1. We're all sinners (Rom 3:23).
  2. When sinners die, pretty bad stuff happens to them, even if it's just weeping and gnashing of teeth (Matt 25:30).
  3. Subscribing to Christianity is the only way to escape this nasty fate (John 3:16).

You don't exactly have to be Jack Chick to believe this stuff—it's pretty much the mainline Christian value proposition. And if you do, it seems like you might think it was pretty much a good thing if your fellow man subscribed to it as well, thereby avoiding an eternity of everlasting torment.

I do realize that phrasing this as there being no Jews is pretty offensive sounding—and of course Coulter specializes in that—but I don't think she's saying you can't eat latkes, just that you'd be expected to believe in Jesus, etc. Now, this isn't exactly a value proposition I'm particularly interested in either, but I don't see that it's really any worse than wishing everyone were a Republican, which I imagine Coulter does as well.

It should be relatively obvious that if you're a member of religion X, you probably think that the beliefs of religion Y are silly/wrong for any Y != X (and as Dawkins points out, atheists just think this for all religions). As a matter of civic politeness, people generally refrain from pointing this out, but that's just politeness, a collective version of refraining from pointing out that someone is wearing a really bad toupee.


October 10, 2007

In the comments on this post about unredacting digital photos, Adam Roach writes:
Here's something that's confused me about the coverage of this case: whenever referring to the man in the pictures, the media has taken care to describe him as an "alleged pedophile."


Based on the descriptions of the portions of the photographs that haven't been published by the mainstream media, these are photographs of the man having sex with clearly under-aged boys.

I can see how you would need to be careful if you were attaching a name or specific identity to the statement -- any identified person would merely be an alleged pedophile until the case goes to trial and a conviction is obtained.

But the man in the photos? The man in the pictures that depict him engaging in pedophilia is a pedophile.

Well, yes and no.

First, I haven't seen the pictures—and no, please don't send them to me. So, I can't attest that they represent pedophilia at all. Rather, Interpol and those who have seen them allege that they do.

Look at it this way: the dude's face was obscured. Someone unobscured it. I think it's reasonable to assume that the unobscuring actually got the original face before the transformation was applied. On the other hand, it's certainly possible that the face we're now seeing was photoshopped onto the body of someone engaging in pedophilia (Rugbyjock, one of the Fark photoshop regulars, specializes in photoshopping people's heads onto gay pornography). If that were true, then while I guess it's true that the pictures show someone engaging in pedophilia, the referent of "the man in the pictures" starts to get a bit fuzzy.

To get a little more exotic, it's possible that the original source material was of someone having sex with under-age boys, but that the adult's face and body have been photoshopped to look quite a bit not like him. At this point, the referent of "the man in the pictures" starts to get extremely fuzzy. And then there's the possibility that the pictures were completely photoshopped, for instance, by photoshopping adults to look under-aged.

Do I actually believe any of these things are particularly likely? No. But they're not impossible and that's the sort of doubt "alleged" is intended to preserve.


October 9, 2007

Redacting digital information turns out to be a tricky proposition, at least if you go by how often people screw it up. The usual situation is some declassified document where the government has just put easily removed black boxes over the relevant text, but in this case it's an individual who made the mistake, a certain alleged pedophile who posted incriminating pictures of himself with his face obscured by the Photoshop twirl filter. Unfortunately for him, it turns out that this effect is reversible:
Apparently, the suspect, or whoever handled the pictures, did not think it was possible to reverse the twirling, a capability that at least one Interpol official was intent on keeping confidential.

Now the cat is out of the bag. Officials are declining to say just how they did it, leaving Interpol in the strange position of urging the public to help find one pedophile suspect while refusing to divulge a tool that might identify others before they hear today's news and rush to delete potentially incriminating twirled images of themselves.

By publishing the untwirled photos of their suspect today, the international police organization also decided to risk the possibility that the man -- or men who happen to look like him -- may face violence from vigilantes.

Apparently, this effect is really trivial to reverse. According to this BoingBoing post you can just set the twirl filter to negative and you get the original picture back. Obviously, there are transformations which would be more complicated to reverse; for instance you could encrypt the relevant pixels or randomly permute them, though any fixed transformation which is one-to-one and onto should be reversible with enough effort. In addition, there are transformations which destroy information and are partly or wholly irreversible. The obvious case is replacing the relevant pixels with pixels all of the same color. This is of course simple, but apparently not as obvious as you might think.
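To make the distinction concrete, here is a minimal sketch, assuming a toy "image" that is just a list of pixel values. The function names and the seeded shuffle are my own illustration, not anything Photoshop or Interpol actually does. A fixed permutation of the pixels is one-to-one and onto, so anyone who knows (or can work out) the permutation gets the original back, whereas painting every pixel the same color maps every possible input to the same output.

```python
import random

def make_permutation(n, seed):
    """Fixed shuffle of the indices 0..n-1 (the seed stands in for a secret)."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return order

def scramble(pixels, order):
    """One-to-one transformation: output position i takes pixel order[i]."""
    return [pixels[i] for i in order]

def unscramble(scrambled, order):
    """Inverse of scramble: put each pixel back where it came from."""
    original = [None] * len(order)
    for dest, src in enumerate(order):
        original[src] = scrambled[dest]
    return original

def blank(pixels, colour=0):
    """Information-destroying transformation: every input looks the same."""
    return [colour] * len(pixels)

face = [17, 42, 99, 3, 250, 128]          # made-up pixel values
order = make_permutation(len(face), seed=2007)
recovered = unscramble(scramble(face, order), order)
blanked = blank(face)
```

Here `recovered` equals `face`, while `blanked` would be identical for any six-pixel input, so there is nothing left to reverse.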

Now where things go wrong with a lot of redaction operations, especially with formats like PDF, is that the basic formats are more complicated than bitmaps. The redactors just create a new black object that is in front of the text to be redacted. The underlying information is still there, so it's just a matter of removing/ignoring the black object and you have the original text. It's much safer to work with a bitmap format where you know that you're changing the relevant pixels rather than just masking them. Of course, you also need to use a transformation that actually can't be easily reversed.
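A toy model of the masking problem, and emphatically not the real PDF object model: the class and function names below are hypothetical, and a "document" is just a list of objects drawn in order. Redacting by overlay adds a box but leaves the text object in the file; redacting destructively overwrites the content itself.

```python
from dataclasses import dataclass

@dataclass
class TextObject:
    text: str

@dataclass
class BlackBox:
    covers: int  # index of the object this box is drawn on top of

def overlay_redact(objects, index):
    """Careless redaction: add a covering box, leave the text in place."""
    return objects + [BlackBox(covers=index)]

def strip_boxes(objects):
    """What a curious reader does: ignore every covering box."""
    return [o for o in objects if not isinstance(o, BlackBox)]

def destructive_redact(objects, index):
    """Safer: replace the content so there is nothing underneath to find."""
    redacted = list(objects)
    redacted[index] = TextObject("[REDACTED]")
    return redacted

doc = [TextObject("Source:"), TextObject("J. Doe")]
masked = overlay_redact(doc, 1)
wiped = destructive_redact(doc, 1)
leaked = strip_boxes(masked)   # the original objects, name included
safe = strip_boxes(wiped)      # the name is gone for good
```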


October 7, 2007

OK, so I'm watching Mission Impossible and I've got some questions (spoilers follow after the break.)

ScienceNow reports on an interesting research project out of New Mexico:
The researchers used ads and flyers to sign up 18 lap dancers from local clubs. Each woman was asked to log on to a Web site and report her work hours, tips, and when she was menstruating. Lap dancers generally work 5-hour shifts with 18 or so 3-minute performances per shift. They average about $14 per "dance"--all of which is called a "tip" because it is illegal to pay for sex in New Mexico.

Over a 60-day period, the researchers collected data from 5300 lap dances. They divided the answers according to whether the dancers were in the menstrual phase, the high-fertility estrous phase, or the luteal phase. The result, as they report online this week in the journal Evolution and Human Behavior: Of the 11 women with normal menstrual cycles, those in the estrous phase pulled in about $70 an hour--compared with $50 for those in the luteal phase, and only $35 an hour for those who were menstruating. The other seven women were on birth control pills. They earned less across the board, and there was no peaking at the estrous phase.

The numbers suggest that men can tell when a woman is most fertile, although the message seems to be conveyed by "subtle behavioral signals" that evade conscious detection, the authors say. They add that the study couldn't identify whether it is scent or other physical changes that cue the men in, but they don't think it's anything obvious such as type of dance moves or "conversational content."

That's an interesting result. The paper doesn't seem to be online, but it would be interesting to know whether the variation is in the number of dances (18 over a 5-hour shift as opposed to 100 potential is a fairly low hit rate) or in the amount of tipping per performance. That might provide some indication of the nature of the effect.


October 6, 2007

I have to admit that I was initially pretty skeptical of U. Buffalo's proposed automatic terrorist threat assessment tool but now that Cory Doctorow—my lodestar to the reflexive geek position—has rubbished it (he compares it to phrenology), I figured I'd take another look. The basic idea seems to be to apply machine learning to videos of suspects being interviewed:
"We are developing a prototype that examines a video in a number of different security settings, automatically producing a single, integrated score of malfeasance likelihood," he said.

A key advantage of the UB system is that it will incorporate machine learning capabilities, which will allow it to "learn" from its subjects during the course of a 20-minute interview.

That's critical, Govindaraju said, because behavioral science research has repeatedly demonstrated that many behavioral clues to deceit are person-specific.

"As soon as a new person comes in for an interrogation, our program will start tracking his or her behaviors, and start computing a baseline for that individual 'on the fly'," he said.

The researchers caution that no technology, no matter how precise, is a substitute for human judgment.

"No behavior always guarantees that someone is lying, but behaviors do predict emotions or thinking and that can help the security officer decide who to watch more carefully," said Frank.

He noted that individuals often are randomly screened at security checkpoints in airports or at border crossings.

The question of whether this will work involves two subquestions:

  • Is this possible in principle?
  • Are our machine learning techniques up to the job?

It's certainly widely believed that techniques like this work in principle, and in fact can be made to work by human interviewers. After all, the police regularly use interviews to attempt to figure out whether suspects are guilty, and interviews are the basis of El Al's vaunted security measures. That said, there's data that suggests that humans aren't that great at detecting lies either. So, I'd say the jury is still out on whether it's possible to detect terrorists by observing their behavior in interviews. But certainly believing that it will work wouldn't put you outside of mainstream opinion.

That leaves us with the question of whether our current machine learning techniques can do the job. That seems a bit less likely; even our facial recognition technology doesn't really work that well and this seems like a rather harder problem. But that's why this is a research project and being done at the University of Buffalo as opposed to being contracted out to Lockheed Martin.


October 5, 2007

I wanted to point to Marina Krakovsky's interesting Slate article about the same-day appointment movement for doctors (sometimes called advanced or open access). The basic idea is that whenever patients call, you try to offer them an appointment the same day. Most of the doctor's schedule is kept open for appointments the same day so you can afford to do this. I go to the Palo Alto Medical Foundation, which uses this policy for their general practitioners, and I can attest that it works quite well. I don't always get to see my own doctor, but I generally get to see someone almost immediately, which is really nice. They don't seem to have the same policy for specialists, and it's pretty noticeable when you want to see one.

One essential feature is that the doctor's practice needs to accept some overcapacity. A detailed description of the issue can be found here but the basic problem is the discreteness of the time units. Say you scale your capacity to match average load. On days when you exceed your average load you need to turn people away. This creates a backlog, but on days when you are under your average load, you generally can't call people in off that backlog, so you gradually build up a larger and larger backlog.
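The backlog argument can be sketched with a deliberately crude simulation. Everything here is an assumption of mine rather than anything from the linked description: demand arrives as a list of daily patient counts, `capacity` is the number of slots per day, and `recall` is the fraction of a slow day's spare slots that can actually be filled from the backlog (0.0 means none, which is the pessimistic case described above).

```python
def simulate_backlog(demands, capacity, recall=0.0):
    """Return the backlog at the end of each day.

    demands  -- patients asking for a same-day slot, one entry per day
    capacity -- slots available per day
    recall   -- fraction of spare slots that can be filled from the backlog
    """
    backlog = 0
    history = []
    for demand in demands:
        if demand > capacity:
            # Busy day: the overflow is turned away and joins the backlog.
            backlog += demand - capacity
        else:
            # Slow day: only some of the idle slots can be used for catch-up.
            spare = capacity - demand
            backlog -= min(backlog, int(spare * recall))
        history.append(backlog)
    return history

demands = [12, 8, 12, 8]                 # averages exactly 10 per day
matched = simulate_backlog(demands, capacity=10)              # [2, 2, 4, 4]
padded = simulate_backlog(demands, capacity=12)               # [0, 0, 0, 0]
recalled = simulate_backlog(demands, capacity=10, recall=1.0) # [2, 0, 2, 0]
```

With capacity matched to the average and no way to use idle slots, the backlog only ever grows; either some overcapacity or some ability to pull people in on slow days keeps it in check.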

It would be interesting to know how the overcapacity required to make open access work compares to the overcapacity required to make scheduled appointments work. Obviously, in a perfectly scheduled system, you can work with basically no overcapacity, bringing in an extra person to help out when the backlog starts to get too bad. But real systems have two forms of variance: emergency appointments and no-shows. Emergency appointments require you to keep some overcapacity to service them (the carve-out model), or deny service. By contrast, no-shows produce unintended overcapacity (airlines deal with this by overbooking, but if you're a doctor you can't really tell someone in your waiting room that you can't see them and compensate them with a free trip anywhere in the US). This means that you end up just being idle and doing paperwork, going home, or whatever.

I haven't done any kind of literature search for this, but it seems like a relatively straightforward operations research/queueing theory question.


October 1, 2007

My Audi's built-in radio is fitted with a Radio Broadcast Data System (Radio Data System in Europe) receiver. In theory, the broadcasters can use this to tell you what you're listening to. In practice, not so much. First, support is spotty at best. Out of the 6 presets on my dial, no more than 3 actually tell you anything useful, like what song is playing. This is particularly baffling since I believe all of them actually do broadcast at least the call sign of the station, so broadcasting program-relevant information wouldn't seem to be much of a stretch. But no. Moreover, even the stations that do broadcast it don't seem to do so consistently.

Second, the error rate is unbelievably high. Several times, I've been informed that I'm listening to PUBIC RADIO, which I'm pretty sure is only available on SIRIUS. I don't have a good explanation for this, because RDS allegedly has error correction. Given that the usable bit rate is supposed to be over 1 kbps, your average song name is <<100 characters, and the scrolling seems to happen about once every 2-3 seconds, there's plenty of room for error correcting codes out the wazoo. And don't even get me started on the scrolling, which looks hideous.
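As a back-of-the-envelope check, here is the arithmetic using the rough figures above (1 kbps usable, a 100-character title as a generous upper bound, one scroll step every 2 to 3 seconds). The 8 bits per character is my assumption, and the function name is made up for illustration.

```python
def copies_per_scroll(bit_rate_bps, title_chars, bits_per_char, scroll_seconds):
    """How many complete copies of the title fit in one scroll interval."""
    bits_available = bit_rate_bps * scroll_seconds
    bits_needed = title_chars * bits_per_char
    return bits_available / bits_needed

# Rough figures from the post; 8 bits per character is an assumption.
low = copies_per_scroll(1000, 100, 8, 2.0)    # 2.5
high = copies_per_scroll(1000, 100, 8, 3.0)   # 3.75
```

So even a worst-case title could be sent two to four times over in the time the display takes to advance once, and a typical, much shorter title leaves far more slack than that.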

Even once you get past transmission errors, RBDS still sucks. Today it inexplicably informed me that Tom Petty's Mary Jane's Last Dance was the Rolling Stones "Horses" (or rather the ROLLING STONES "HORSES"; I guess the extra bit per character for lower-case letters would have chewed up too much bandwidth), a song which, as far as I know, doesn't exist. My guess here is that it was trying for Wild Horses (not to be confused with Wild Stallyns) and there was some transmission glitch, but I don't have any good explanation for how the radio station software got this confused.

Anyway, this seems like something that could be executed a lot more competently.