EKR: September 2008 Archives

 

September 30, 2008

I wanted to write something about hardware random number generators, but as I was writing I realized that it was probably worth starting with some background about random number generation in general. That's in this post. A future (hopefully near future) post will talk about hardware random number generation.

There are a lot of contexts in which it's convenient to have a ready source of random numbers and the requirements aren't the same. Consider two example applications: surveys and lotteries. [Thanks to Terence Spies for suggesting the survey example.]

In a typical randomized survey, you start with some set of people (e.g., a list of names or phone numbers) and you randomly select some subset of them. Then you call them and ask them what they think of Barack Obama, Jessica Alba, or ketchup. Then you can use straightforward statistics to estimate the probability that people are going to vote for Obama or eat ketchup, or perhaps the fraction who like both Obama and Banderas. For this application what's principally important is that your sampling function be unbiased, i.e., that there be an equal chance of selecting each member of the set. For instance, consider what happens if you select people by height from low to high: you're going to get a lot more women than men, in which case you probably won't get that good an estimate of how popular Jessica Alba is in the population as a whole.
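For the survey case, uniform sampling is all you need. A minimal sketch in Python (the phone numbers are made up for illustration):

```python
import random

def sample_survey(population, k, rng=None):
    """Select k members uniformly at random: every member is equally
    likely to be chosen, so the sample is unbiased."""
    rng = rng or random.Random()
    return rng.sample(population, k)

# Hypothetical example: pick 3 of 10 phone numbers to call.
phone_numbers = [f"555-010{i}" for i in range(10)]
chosen = sample_survey(phone_numbers, 3, random.Random(42))
print(chosen)
```

Note that nothing here is unpredictable to an adversary; it only needs to be unbiased.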

In a typical lottery, customers get to pick a set of numbers and then the lottery operator draws their own set of numbers using some random number generation method. If you get the numbers right (or sometimes m out of n), you win. It's important to recognize that the critical property here is unpredictability: if gamblers can predict the numbers that the operator will generate then they can choose numbers that are likely to win. Uniformity actually isn't that important, though it makes things easier—lots of gambling games aren't at all uniform (think craps, for instance)—you just need the odds that the game pays off to match the probabilities (actually the inverses of the probabilities) of each outcome. Let's stick to the uniform case for now, though.

So, we have a need for two different kinds of random number generator: one useful in non-adversarial situations like surveys and one useful in adversarial situations like lotteries. In the latter case, we need the generator to be unpredictable to an attacker, which is much harder to achieve.

Back in the old days, people used to use tables of random numbers that had been generated by hand via dice rolling or some such. For obvious reasons, this works OK if you're doing surveys—though it's important not always to use the same numbers, or you end up sampling the same people every time, which of course skews the sample a bit, if only because people get annoyed with you calling them all the time. [Technical note: if you're going to sample repeatedly you want the samples to be independent, otherwise you don't get any additional data when you take the second sample.] On the other hand, it's clearly useless for any adversarial application, since the attacker just needs to figure out what table you're using and they can predict the next numbers you're going to generate.

Now that we have computers, what's typically used are algorithmic pseudorandom number generators (PRNGs). These are functions that generate a uniform stream of values. One common design is to use a hash function or an encryption algorithm—for instance, you might use AES in counter mode. This is fine as far as it goes, but what do you use for a key? AES is a deterministic function: if the attacker knows the key, they can predict the output of the generator. That's why this function is pseudorandom, not really random. As before, we still want to generate a different stream of numbers each time we run the generator: you need to seed the generator with a different key each time you want to use it. For non-adversarial situations, you just need the seed to be unique—it's conventional to use the time of day. But for adversarial situations you need the seed (key) to be unpredictable to the attacker, which means that it has to be randomly generated.
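A minimal sketch of such a counter-mode construction, using a hash function (SHA-256) rather than AES so it runs with just the standard library:

```python
import hashlib

class CounterPRNG:
    """Sketch of a counter-mode PRNG: each output block is the hash of
    (key || counter). Deterministic given the key: same key, same stream."""

    def __init__(self, key: bytes):
        self.key = key
        self.counter = 0

    def next_block(self) -> bytes:
        block = hashlib.sha256(self.key + self.counter.to_bytes(8, "big")).digest()
        self.counter += 1
        return block

# Two generators seeded with the same key produce identical output,
# which is why the key must be unpredictable to an attacker.
a = CounterPRNG(b"\x00" * 16)
b = CounterPRNG(b"\x00" * 16)
print(a.next_block() == b.next_block())
```

The stream looks uniform, but an attacker who knows (or can guess) the key can regenerate every block.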

This starts to look like turtles all the way down, but it's really not: if you can get a small amount of random data (e.g., 128 bits), you can bootstrap that up into a large amount of pseudorandom data that's unpredictable to the attacker by using the random data to seed your PRNG. This is how things are done in cryptographic/communications security applications (and what went wrong with OpenSSL on Debian was that this seeding got commented out of the code.)

In order to have a secure system in the adversarial context, we actually need one more property: given the output of the PRNG, it has to be computationally expensive to work backward to reconstruct the PRNG state/seed, or forward from one output to the next. Otherwise the attacker will just take output N and compute output N+1. Not all PRNGs have this property, though the AES-based one I described above does, as do the cryptographically secure PRNGs that people typically use. For a practical system, there's one more property you want: to be able to inject new seed data at any time, not just at the beginning, and the PRNGs that people typically use have this property too. It's a little tricky (and beyond the scope of this post) to figure out how best to mix in the new seed data, but it's a reasonably well understood problem nonetheless.
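To give the flavor of seed injection, one simplified approach (real designs are considerably more careful about this) is to hash the old state together with the fresh entropy:

```python
import hashlib
import os

def reseed(state: bytes, new_entropy: bytes) -> bytes:
    """Fold fresh seed material into PRNG state by hashing the old state
    together with the new entropy. An attacker must know both inputs
    to predict the result, so mixing can only help."""
    return hashlib.sha256(state + new_entropy).digest()

state = os.urandom(32)                  # initial seed
state = reseed(state, os.urandom(16))   # inject new seed data later
print(len(state))
```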

I want to make one more point: the output of a badly seeded cryptographic PRNG looks pretty similar to the output of a well seeded PRNG. While there are statistical tests you can use to measure the quality of random numbers, they only really test the generator, not the seeding. The only real way to determine whether a cryptographic PRNG is well-seeded is to try out candidate seeds until you find one that generates the output you're seeing, which is sort of a non-terminating process if the PRNG was well-seeded. Unfortunately, this means that if you want to be sure your PRNG is strong, you actually have to do so by construction—by making sure that the seeds are correct—you can't get a sufficient degree of confidence by testing the output.
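To illustrate, here's a crude monobit test applied to Python's (non-cryptographic, but that's beside the point here) generator, seeded once with a guessable timestamp and once with 128 bits from the operating system. The test can't tell the two apart:

```python
import random

def monobit_fraction(bits):
    """Fraction of 1 bits: a crude statistical test of uniformity."""
    return sum(bits) / len(bits)

def stream(seed, nbits=10000):
    """Generate nbits pseudorandom bits from the given seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(nbits)]

# A guessable seed (a fixed timestamp, made up for illustration) and an
# unguessable 128-bit seed from the OS produce statistically
# indistinguishable output; only the seeding differs.
bad = monobit_fraction(stream(1222819200))
good = monobit_fraction(stream(random.SystemRandom().getrandbits(128)))
print(bad, good)
```

Both fractions come out near 0.5; no output-only test distinguishes the predictable seed from the unpredictable one.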

 

September 27, 2008

Back at USENIX Security 2008, California Secretary of State Debra Bowen gave a keynote talk on e-voting where she showed a (now famous in voting circles) bumper sticker which read: "Prevent unwanted presidencies. Make vote counting a hand job"

I hadn't been able to find a copy of it until a day or two ago when I saw it on Lucas Mearian's writeup of Bowen's appearance at the MIT Emerging Technologies Conference. For some reason, Mearian has taken down the picture of the sticker and renamed his post to "Prevent unwanted presidencies with paper ballots", though the URL path still reads "voting_should_be_a_hand_job".

 

September 23, 2008

[Terence] Spies alerted me to this story about the state of Kentucky trying to take control of 141 domain names in an effort to block Internet gambling:
FRANKFORT -- Kentucky is commandeering 141 domain names of Internet gambling sites in a novel legal move to crack down on the unregulated industry.

Franklin Circuit Judge Thomas Wingate ordered the names transferred to the state last week, Gov. Steve Beshear announced Monday.

Sites affected include such names as caribbeangold.com and sportsbook.com.

If officials get their way at a Sept. 25 forfeiture hearing, the state will control the domain names and can ask Web registrars to block access to the sites, said Justice and Public Safety Secretary J. Michael Brown.

Putting aside for a moment one's opinion of trying to stop Internet gambling, this seems like a seriously problematic operating theory. Sportsbook isn't located in Kentucky (it seems to be based in Malta) and sportsbook.com is registered with Network Solutions, which is based in Herndon, VA. As far as I can tell, the only relationship that they have with Kentucky is that they let people from Kentucky gamble on their site. If we're operating on that basis, then what's to stop the city of Grano, North Dakota, population 9, from deciding that Google is violating some local ordinance and seizing google.com? Sure, maybe they're operating on some bogus pretext and Google will eventually prevail, but in the meantime all your searches turn up businesses in North Dakota. This doesn't really seem viable.

Of course, you could argue that Sportsbook could just not do business with everyone in Kentucky, but this is impractical for two reasons. First, geolocation technology isn't really good enough for Sportsbook to determine for absolute certain where every potential user is. Second, even if they could be sure, it's not really practical for every domain holder to know the laws of every potential jurisdiction their customers might come from—it's entirely possible I'm violating the laws of Grano right now.

If you want to actually have a working Internet, then, you really need to arrange matters so there's not a semi-infinite set of attackers who can trivially bring down anyone's domain.

 

September 21, 2008

Yan, Malisch, Hannon, Hurd, and Garland have an interesting (though ultimately more suggestive than conclusive) paper in PLoS (link goes to the abstract, but PLoS is free so you can read the whole article). The background here is that 2D:4D, the ratio of index finger to ring finger length, is lower in male humans than in female humans. It appears that this is correlated with prenatal androgen exposure as well as with gender: female fraternal twins whose co-twin was male tend to have lower 2D:4D than those whose co-twin was female. It also turns out that low 2D:4D ratio in humans correlates with a bunch of measures of what you'd think of as masculinity: aggression, competitiveness, etc. Interestingly, the opposite is true of mice: higher 2D:4D ratios correlate with higher levels of aggression and physical activity. Also, in mice, females exhibit higher exercise levels than males. What YMHH&G have shown is that in mouse lines bred for increased propensity to run on treadmills, 2D:4D is higher than in un-selected control lines.

It's hard to know what to make of this general set of results, and the authors grapple with it a bit (are the selected mice feminized? masculinized?) but don't have any clear answers, though they suggest that it's more complicated than either. In any case, it's interesting that (a) you can actually select for propensity to exercise, at least in mice, and (b) that that somehow seems to be linked to a bunch of other biochemically significant markers.

 

September 20, 2008

Sorry for the long posting gap... Was busy working on a paper.

This morning I ran the Skyline to the Sea 50 km trail run. This was my first ultra of any kind and only my second trail race, so it was an interesting challenge. The Skyline to the Sea trail is net about 2500 ft descent (with about 3000 ft of ascent), so I figured this would be a good race to start with.

First impressions:

  • The course was surprisingly technical. I do a lot of trail running, but mostly it's fire-road style trails like Rancho San Antonio. Even when there is some single-track, it's relatively flat dirt, so you can mostly run without paying much attention to where you step. This trail, however, while flat some of the time, had significant stretches with roots, large rocks, etc. that you had to really slow down for, or even walk over, if you wanted to avoid tripping.
  • A lot of trail runners report walking the uphills. I've done enough long distance races (1/2 Ironman and Ironman) that I felt I could probably get away with running them. This wasn't a disaster and I did pass a number of people on the uphills, but I didn't put that much distance on them (in some cases, the same people passed me on the downhill) and I suspect it took more out of me than it was worth. I also suspect that running the uphills wouldn't be scalable to 50 miles or even 50 K races with more climbing.
  • Hydration and nutrition become a lot more important in races like this (indeed any long distance race) because aid stations are fairly far apart (and constrained by which parts of the trail are easily accessible) and times are a lot slower for the reasons I indicated above.
  • Bees. There were lots of bees. You'd be running along just fine and then the people in front of you would start screaming and swatting at themselves. I got stung about 6 times.
  • The race was reasonably well run, but there were two logistical issues that could have been better. First, it would have been nice to have mile markers. The trail was well marked, but without any distances, so you didn't know how far you had gone. Second, there weren't enough porta-potties at the race start; only three for 200 entrants. I spent a long time waiting in line for one and almost missed the race start. As it was, I was near the end of the start line and spent most of the race trying to pass slower people on the singletrack trails.

Results: 5:15ish. 33rd, 15th in my age group. As I recall, the winning time was 3:38.

 

September 13, 2008

OK, so I know this is trivial, but it still really annoys me.

NPR has this news quiz called Wait, wait... don't tell me. They have three comedians answer questions about the week's events. The last "event" is a "lightning fill in the blank" round where they have to answer as many questions as they can in 30 seconds (or whatever). [BTW, Win Ben Stein's Money was much better on this front, because the questions were the same for the contestant and for Stein]. Anyway, before the lightning round, the contestants are typically within two or three points, and each question in the round is worth 2 points. The contestants participate in order of increasing score, so what happens is that the first contestant (the one with the lowest score) always ends up with the most points as soon as he's gone. Carl Kasell then announces "Bob got 5 correct answers, for a total of 13 points and he has now taken the lead," at which point I can barely stop myself from screaming "No, no!" at the radio. It's not sensible to talk about someone "taking the lead" when the other contestants haven't even gone yet and when, if they get any reasonable number of questions right, they themselves will be in the lead.

 
Eu-Jin Goh pointed me to this Times article about the price of Ironman series events. It's certainly true that the Ironman series is expensive, and $525 for a race entry is a lot, but it's not really the major part of your costs. Let's say I decide to do Ironman Canada, the oldest of the non-Hawaii Ironmans. Assuming the entry fee is $525, my costs look like this:

Plane fare (SFO-YYF)    $575
Bicycle surcharge       $50-170
Hotel (4 nights)1       $320
Rental car              $150
Race entry              $525
Total                   $1740

Plus, I've never been at a race where I didn't end up spending $50-$100 for assorted race-related expenses: spare bottles, tubes, energy bars, energy gels, sports drink, etc. Factor in airport parking and other misc. non-race expenses and we're looking at somewhere around $2K for the race, of which the entry fee is about 25%. And of course this is for a North American race with a cheap plane flight located in a relatively cheap town. If you want to do Ironman New Zealand (another popular race for North Americans, entries still available!), you're looking at a $1300+ plane ticket plus a few more nights in the hotel dealing with jet lag, for a total price more like $3K.

This isn't to say that monopoly rents aren't being collected—though it may not be the race directors collecting them. It may well be the case that running an Ironman race is more expensive than a non-Ironman race (because of the licensing fees you need to pay to the World Triathlon Corporation which owns the Ironman name), but that's just a transfer payment, so someone is getting the extra $200 or so you pay to enter Ironman Canada instead of Vineman. All I'm saying is that it's not too surprising people are willing to pay it, since the brand surcharge you're paying is a relatively small fraction of their overall costs.

1. You might ask why you need so many nights in the hotel. The race is first thing in the morning so you need to stay the night before. You're way too wiped out the second day to fly out, plus if you qualified for Kona you need to pick up your slot the next day. The other two days get there because you want to be fresh for the race (remember, you've been training for 6 months for this) and that's not that compatible with having just got off the plane the day before the race or even two days before. Much better to get there a few days early, rest up and see the course.

 

September 12, 2008

Writing in the comments section, Nick Weaver expresses the not unreasonable concern that if you randomly select people who are eligible for extensions, people will try to game the system:
If you do the randomization suggestion, you really don't want to do it for a security conference, because then you will get security researchers playing games (eg, 3-4 different titles/abstracts/author orders, withdraw the lowest extending ones)

This really is the way security people think, but I think it is possible to design technical mechanisms to stop people from doing this. One obvious thing you could do is simply forbid any individual from being an author on two papers that apply for extensions. This sounds a bit restrictive but one could certainly argue that if you're asking for extensions on two papers, you don't have time to work on them both and should just focus on one.

It seems like there is a less restrictive approach, however, namely to use cut-and-choose. Nick's attack relies on people submitting extension requests for papers they don't really intend to submit. This implies a countermeasure where you force authors to prove that their requests are legitimate. You can't really do this for the papers where you reject their extension requests (since they could reasonably claim they're not ready to submit, which is why they asked for an extension), but you can for any request where you grant an extension:

Here's one natural approach:

  • Force people to submit all their requests at once before you tell them whether any are granted.
  • Randomly (using a cryptographically secure RNG) select whichever subset of the requests you're going to grant.
  • Require the authors to actually submit papers for which requests were granted.

(Obviously you'd need to notify authors ahead of time that these were the terms).
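A minimal sketch of this procedure (the function and label names are hypothetical):

```python
import secrets

def grant_extensions(requests, fraction=0.5):
    """Cut-and-choose extension granting: all requests are collected up
    front, a cryptographically secure RNG selects the granted subset,
    and granted requests become binding -- those papers must actually
    be submitted."""
    rng = secrets.SystemRandom()
    k = max(1, int(len(requests) * fraction))
    granted = set(rng.sample(requests, k))
    return {req: ("granted-must-submit" if req in granted else "denied")
            for req in requests}

decisions = grant_extensions(["paper-A", "paper-B", "paper-C", "paper-D"])
print(decisions)
```

The secure RNG matters here: if authors could predict which requests would be granted, they could game the selection just as Nick suggests.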

Now, obviously, an author who is trying to cheat you can submit bogus papers (or slight variants), but I'm assuming that if you actually force them to submit, the PC can detect them and appropriately punish the authors.

I'm being deliberately vague around the words "require" and "punish" here, because we're now leaving the realm of cryptography and entering the world of politics. One thing you could do is simply refuse to accept any papers unless the authors submitted all the papers by that author. More severely, you could refuse to accept any papers from them in the future (this sanction is sometimes used in other cases of clear misconduct). In any case, it's clear we could create sanctions that authors wouldn't like.

Let's do the math on how well this would work. For mathematical convenience, assume that you are only submitting a single real paper, and that the penalty if you're caught cheating is that all your submissions to this conference are summarily rejected. Assume further that if the extension isn't granted, you don't submit. Thus, we have the following outcomes:

Outcome                  Payoff
No extensions granted    0
One extension granted    1
>1 extension granted     0

If the probability of extensions being granted is p and you submit n times, then the probability of each of these events is:

Outcome                  P(outcome)
No extensions granted    (1-p)^n
One extension granted    n*p*(1-p)^(n-1)
>1 extension granted     1 - (1-p)^n - n*p*(1-p)^(n-1)

However, because we've assumed that the payoffs for the first and third cases are zero, and for the second case it's 1, the expected payoff is just the probability of the second case, i.e., n*p*(1-p)^(n-1). To find the best strategy for the submitter, we want to find the n that gives the maximum payoff. Since we can only choose discrete values of n, it's easiest to find this value experimentally. Here's the relevant R code to find the maximum:

# Expected payoff from submitting n extension requests when each is
# granted independently with probability p: n * p * (1-p)^(n-1).
pay <- function(p, n) {
  n * p * ((1 - p)^(n - 1))
}

# Search n = 1..100 for the payoff-maximizing number of submissions.
max.pay <- function(p) {
  n <- seq(1, 100)
  pays <- sapply(n, pay, p = p)
  n[which.max(pays)]
}
(note 1: this code does not work properly for very small values of p, where the optimal n value is > 100. That can be fixed by making the maximal n value bigger. note 2: we happen to know that there is a single maximum, so we could do a loop that stopped as soon as pay(n+1) < pay(n), but that was too much effort.)

It turns out that for all p >= .5, the optimal number of submissions is 1, which is the behavior you want to encourage. For p < .5, it makes sense for the submitters to do multiple submissions (with the number of submissions increasing as p decreases). Accordingly, if you want to grant extensions less than half the time, you need more draconian punishments (e.g., banning from future submissions) to enforce single submission.
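As a sanity check, the same search can be redone in Python (pay and best_n are hypothetical helpers mirroring the R code):

```python
def pay(p, n):
    """Expected payoff: the single real paper gets in only when exactly
    one of the n extension requests is granted."""
    return n * p * (1 - p) ** (n - 1)

def best_n(p, n_max=1000):
    """The number of submissions that maximizes the expected payoff."""
    return max(range(1, n_max + 1), key=lambda n: pay(p, n))

for p in (0.5, 0.3, 0.1):
    print(p, best_n(p))
```

At p = 0.5 a single submission is optimal; as p shrinks, padding the pool with extra requests starts to pay.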

I should note that this sort of suspicious attitude probably isn't the best way to get the result you want. In general, the academic community depends on people's voluntary compliance with norms (we don't really check that people don't make up their data, for instance), and if you simply tell people that this is the norm for this conference (and perhaps make them explicitly affirm their compliance), I expect they will do so. On the other hand, if you make it a game, they're likely to take it as a challenge to cheat (especially in the security community) (cf. the famous "A Fine Is a Price"). Then again, that's less fun than designing a new mechanism.

For extra credit, consider the following extensions to the model: (1) the punishment is greater than just having all your papers rejected. (2) authors have more than one paper. (3) submitting without an extension is less good than submitting with an extension but still of nonzero value.

 

September 11, 2008

For those who are thinking of submitting to ISOC NDSS, the PC has decided to extend the conference deadline to Fri Sep 12, though you have to submit your abstract and title by midnight tomorrow. This has been announced in some fora already and will appear on the site tomorrow.

Because so much CS publication is done at conferences, the work cycle tends to be driven by the submission deadline. These deadlines tend to be of varying hardness—sometimes people ask for and are granted extensions, but not uncommonly the program committee grants general extensions (a week is common here). If you're preparing a paper for such a conference, learning that the deadline has been extended seems like a nice bonus, if a little anticlimactic; people tend to work up to the deadline and so suddenly getting another week cuts down on the rush job aspect of things. On the other hand, since work tends to expand to fill the time available, it's a bit of a mixed blessing.

But of course these benefits only accrue to the submitters if the deadline extension is unexpected. If you know that there will be an extension (and NDSS is famous/notorious for having the deadline extended every year), then you just factor the later deadline into your planning and it's as if the extended deadline were just the real deadline (cf. rational expectations theory) and the PC might as well have just set that deadline in the first place. [Note that one could argue that the PC learns new information as the deadline approaches and people ask for extensions so they're correcting, but (1) since people plan to finish at the deadline it's not clear that having a later deadline would change anything and (2) even if this were true, once the conference has had a couple years to settle in, you'd expect the deadline to get calibrated pretty accurately.]

Paradoxically, then, if the PC thinks that having people shoot for time X but then actually have until time Y improves papers (or perhaps just the author experience), then they need to preserve uncertainty about whether a deadline will be extended by sometimes not extending it. How often you have to do so is a more complicated calculation, of course, but given that your typical conference happens only once a year and people tend to forget events more than a few years in the past, I suspect that you can't extend the deadline for more than about 75% of conferences.

Hovav Shacham observed to me that you could both provide the requisite uncertainty and try to establish whether extensions improve paper quality by only giving extensions to some random fraction of papers each year [tech note: you would need to force people to commit to their name or paper submission before telling them whether they got the extensions, since otherwise they might just poll until they got an extension] and then see whether those papers had a higher acceptance rate. In my experience, though, people tend to feel that giving some people unequal treatment—even when that treatment is randomly distributed—is somehow unfair.

 

September 8, 2008

For some reason, Slate continues to let Gregg Easterbrook write about science. This time, the topic is Thomas Friedman's new "Hot, Flat, and Crowded". I'm not really interested in engaging with most of the piece, which is pretty much the party line [yes global warming is happening but it's not clear it will be that bad; other problems (poverty, clean drinking water, etc.) are much worse; Thomas Friedman is a hypocrite who lives in a big house; the government will screw things up if we let them regulate; industry is innovative!], but the following is just confused:

Friedman concludes Hot, Flat, and Crowded by proclaiming greenhouse damage could cause humanity to be "just one more endangered species." Better to consult history on this topic. Greenhouse gases are an air-pollution problem. Smog and acid rain, the two previous serious air-pollution problems, once were viewed as emergency threats. Then federal standards were imposed, and inventions and new business models were devised; now smog and acid rain are way down in the United States and declining in much of the rest of the world. And no international treaty governs smog or acid rain! Nations have adopted smog and acid-rain curbs because it is in their self-interest to do so. The same dynamic will take hold for climate change, not long after the United States finally imposes greenhouse-gas rules. Unquestionably the future is flat and crowded. Hot? Maybe not.

I suppose one could argue that GHGs are an air pollution problem, but precisely what distinguishes GHG emissions from other air pollution problems is that the effects aren't localized. In the case of both acid rain and smog (especially smog), the effects are felt near the emissions source so although there are the usual public good/incentive problems, regional authorities can restrict pollutant outputs in their areas and thus control the effects of the pollution. The Beijing olympics provide a good example of this: the air quality in Beijing is bad because of loose emissions controls, but that doesn't mean the air quality in Bali is bad, let alone the air quality in Lake Tahoe. Moreover, when the Chinese decided to improve Beijing air quality for the Olympics, they were able to do so (at least to some extent) without having to consult with every other country on Earth.

By contrast, the effects of GHG emissions are felt globally regardless of the source, so this is a pure public good (well, public bad) issue, with all the attendant collective action problems. We don't have much incentive to control our emissions here in California if people elsewhere aren't going to. Moreover, it seems unlikely that global warming will have equally negative effects on everyone, so the incentives may be even more misaligned (if, for instance, low-emitters are likely to suffer worse effects).

Even more annoyingly, if Easterbrook actually wanted to pick an apropos example, rather than just being glib, there is one readily at hand: chlorofluorocarbon (CFC) emissions. CFCs were widely used as refrigerants and aerosol propellants, but it was discovered in the 1970s that when released into the atmosphere they caused ozone breakdown. Like GHGs, the effects of CFC emission don't occur near the source (in fact, CFCs are the likely cause of the Antarctic "ozone hole"). The good news is that CFC emissions are way down and it looks like the ozone depletion may be starting to slow. How did this happen? An international treaty, the Montreal Protocol, was negotiated (Wikipedia says it took effect in 1989), dramatically restricting the use and emissions of CFCs. [BTW, check out the Wikipedia article for the 1980s-era industry denials that anything was wrong.] Funny that Easterbrook chose to talk about smog and acid rain instead.

 

September 7, 2008

Stanley Fish riffs off the Goldstein Wine Spectator incident to muster a defense of Social Text's being taken in by the Sokal Hoax. For those of you who haven't heard about the Wine Spectator incident, Robin Goldstein invented a fake restaurant with a fake wine list and submitted it to Wine Spectator (along with a $250 fee) and received the "Wine Spectator Award of Excellence". Wine Spectator's self-defense is here. Fundamentally, there are two criticisms one might make of Wine Spectator's process in granting this award:1
  1. That their process relies completely on the claims made by the restaurant owner with no verification.
  2. That the claims made by the owner represent such a weak restaurant that to give it an "award of excellence" implies that the award is next to meaningless.

Goldstein seems to think that the former criticism is more damning, but I'm not so sure. It's certainly true that one can't review the food at a restaurant without tasting it, but it's not like you need to taste the particular wine in my cellar to know if I have a good wine selection: you know what wines are good and what are bad, and barring mishandling and statistical variation, if you have a list of the wines I have, you've got a pretty good idea of what the quality is like. Now, it's true that I could be lying about what I have in my cellar, but are we really expecting WS to send around a team of experts to take inventory, verify that I don't have forged wine, and go through my books to make sure I haven't just borrowed better wine from a helpful restaurant down the street run by my brother? That all seems improbable. Obviously, there are limits to the kinds of fraud you can detect, and it's not clear to me that given WS's threat model, actually going to the restaurant, having a meal, and maybe looking in the wine cellar would tell their readers much more about the quality of the wine. It's not like getting an award for a restaurant that doesn't exist is useful for defrauding your diners, after all—you pretty much need to actually have a restaurant in order to take their money. [Yes, yes, I could charge a reservation fee, but then people would do credit card chargebacks, which makes Visa unhappy with me.]

On the other hand, it seems quite clear that Wine Spectator can in principle examine a provided wine list and determine whether the wines are good or not. I don't drink wine, so I'm not equipped to come to any conclusion about the quality of the list—as far as I can make out the regular list was pedestrian but the reserve list was all wines that WS had panned. Presumably, if the wine list consisted solely of Charles Shaw and Thunderbird and WS still granted the award, we'd come to the conclusion that the award was meaningless, but as things stand, this doesn't seem to be a pure win for either Goldstein or WS.

This brings us to Fish's article. Fish writes:

Asked what the success of the hoax perpetrated on his magazine demonstrated, Matthews replied, "It has now been demonstrated that an elaborate hoax can deceive Wine Spectator."

The key word is "elaborate"; it speaks to the care with which the intention to deceive is implemented. Sokal (and those he consulted) interwove references to the theorems and experiments of famous scientists with long sentences larded with postmodern jargon, all stitched together by "therefores" that did not hold up in the face of rigorous scrutiny. The question -- one that applies to the Wine Spectator controversy -- is how rigorous a scrutiny should editors of journals and magazines be expected to conduct? In this case, the question is complicated by the fact that one of the editors of the journal Sokal deceived was a colleague of his at N.Y.U., albeit from another department. If someone down the hall or in the next building is sending you something to consider, you don't start by wondering if the submission is on the up and up; like the editors of Wine Spectator, but with even more justification, you might assume that what you have before you is bona fide.

Once again, we're faced with two possible claims, namely:

  1. The editors should have verified that Sokal wasn't lying.
  2. The editors should have determined that the material Sokal submitted was in and of itself bogus.

Obviously, it's true that the editors have no practical way of verifying that paper submissions aren't completely fraudulent. This is especially true of data-based papers where the researchers could just be making up their data. So, I sort of agree with the observation that it was reasonable for the editors to assume Sokal was acting in good faith.

However, if you're going to have a review process, part of that process surely is to try to assess whether or not the paper is at least superficially plausible/internally consistent. I've reviewed papers for conferences and journals and this is exactly what I try to do and expect my fellow referees/PC members to do as well. Otherwise, why bother to have a review process? While I haven't read the Sokal paper, my understanding is that Sokal's claim is that the paper was transparently bogus to someone with any understanding of physics.

Blackburn recalls his own experience as an editor of the journal Mind. He imagines himself receiving a paper from a "well-regarded historian" who claimed that certain issues in Thomas Hobbes's political philosophy could be clarified by "various facts about Hobbes's political experiences in Venice." He would have been able, he says, to assess the political philosophy part of the paper himself, but I "might well have taken Hobbes's presence in Venice as given" on the assumption that any credentialed historian "would not have developed the point if he hadn't gotten that bit right." And, he adds, "I would not have had the history refereed, even if I had known whom to approach."

The reason he wouldn't have thought to have the history refereed is that it was being offered to him by an historian, and was, in effect, already refereed. And even if he had sent the paper to another historian, he would have ended up, he explains, "with two things to judge rather than one," and with no more competence to judge them than he had in the first place. That is, it can't be the case that when you receive a submission from a reputable expert you check everything down to the ground, do all the basic research yourself, something you couldn't do anyway unless you were an expert on just about everything. Sooner or later you would have to rely on the judgment of some learned others. Sooner or later you would be in the position of the Social Text editors who presumed that, at least with respect to the physics part of Sokal's essay, the professional physicist knew what he was talking about. After all, Blackburn concludes - and, remember, he by and large agrees with Sokal's critique of postmodern thought - "you do trust academics to get their own subject right."

I'm not sure that this example is really that helpful. It's certainly true that you can't expect reviewers to go back to first principles, but to the extent that the arguments/results in the paper depend on facts from some other field that aren't represented as being well established in that field (and yes, to some extent you need to rely on the author to represent that accurately), you do need to find a reviewer to assess them. Obviously, this is a judgment call on the editor's part, but it's the editor's job to make precisely such judgments.

The argument that "Sooner or later you would have to rely on the judgment of some learned others" seems particularly odd. Editors and program chairs routinely have to decide whether to accept papers that they don't really understand themselves. That's why they send them out for multiple reviews and use those reviews as the basis for their decisions. Obviously, that's not a perfect system: the reviewers can screw up; the editors might not understand the reviews, etc. However, I don't think it's true that just because a paper contains material on a topic that the editor doesn't understand he has to throw up his hands and accept it. And if he can't find anyone to assess that portion, he should probably be asking whether the author has the right venue.2

1 These criticisms are coupled with the suggestion that because WS charges a fee for listing (and encourages you to advertise) that WS does not exactly have an incentive to be selective. I think that's a separable issue, however.

2 Yes, I know that the Social Text wasn't peer reviewed at the time. But I don't see how that absolves the editor of all responsibility to determine whether the paper is meaningful.

 

September 4, 2008

In another entry in the bisphenol-A sweepstakes, Leranth et al. report cognitive impacts in a primate model (abstract only, full text behind paywall) [aside: it's super-annoying how much biomed research gets walled off like this. In CS we've reached this sort of uneasy compromise where people mostly post their papers on their own web sites, but you still have to pay to get the official versions. I suspect there's an interesting story to be told about this cultural divergence...]
Exposure measurements from several countries indicate that humans are routinely exposed to low levels of bisphenol A (BPA), a synthetic xenoestrogen widely used in the production of polycarbonate plastics. There is considerable debate about whether this exposure represents an environmental risk, based on reports that BPA interferes with the development of many organs and that it may alter cognitive functions and mood. Consistent with these reports, we have previously demonstrated that BPA antagonizes spine synapse formation induced by estrogens and testosterone in limbic brain areas of gonadectomized female and male rats. An important limitation of these studies, however, is that they were based on rodent animal models, which may not be representative of the effects of human BPA exposure. To address this issue, we examined the influence of continuous BPA administration, at a daily dose equal to the current U.S. Environmental Protection Agency's reference safe daily limit, on estradiol-induced spine synapse formation in the hippocampus and prefrontal cortex of a nonhuman primate model. Our data indicate that even at this relatively low exposure level, BPA completely abolishes the synaptogenic response to estradiol. Because remodeling of spine synapses may play a critical role in cognition and mood, the ability of BPA to interfere with spine synapse formation has profound implications. This study is the first to demonstrate an adverse effect of BPA on the brain in a nonhuman primate model and further amplifies concerns about the widespread use of BPA in medical equipment, and in food preparation and storage.

As background, this Sigg-sponsored study, which uses pretty extreme conditions (90°C water, 3+ day dwell times) to maximize leaching, produces levels of around 70 ppb (110 ng/cm^2 of bottle surface) in polycarbonate bottles. The EPA reference ("safe") dose for BPA is 50 ug/kg/day. A 1 liter Nalgene bottle is about 18 cm high, which implies a radius (this is a bit tricky to measure with my lousy ruler) of 4.2 cm, and an internal surface area of about 600 cm^2 (counting the top and the bottom). At a leaching level of 110 ng/cm^2, this comes out to about 66 ug of BPA in a one liter Nalgene bottle. I weigh 75 kg, so even if I drank 10 l of water a day out of a Nalgene bottle, I'd be consuming roughly a factor of 6 less than the safe dosage. It's also worth noting that the amount of leaching at day 1 was less than a tenth of that at days 2 and 3, so you'd only get this level of exposure if you had a lot of bottles and left your water sitting around in them. Note that I'm not saying that polycarbonate bottles are safe, just trying to get some perspective on what we know about the risk. Obviously, the situation is different for baby bottles and the like because the subject's mass is so much smaller.
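If you want to check the arithmetic above yourself, here's a quick Python sketch. It just plugs in the numbers from this post (the bottle dimensions are my rough ruler measurements, not official specs, and the leaching figure is the Sigg study's worst case):

```python
import math

# Inputs, all taken from the discussion above.
leach_ng_per_cm2 = 110.0       # worst-case leaching (Sigg-sponsored study)
epa_ref_dose_ug_per_kg = 50.0  # EPA reference ("safe") dose for BPA
body_mass_kg = 75.0            # my weight
height_cm = 18.0               # approximate bottle height
radius_cm = 4.2                # approximate bottle radius
liters_per_day = 10.0          # deliberately extreme consumption

# Internal surface area: cylinder side plus top and bottom.
area_cm2 = 2 * math.pi * radius_cm * height_cm + 2 * math.pi * radius_cm**2

# Total BPA leached per 1-liter bottle, converted from ng to ug.
bpa_ug_per_bottle = area_cm2 * leach_ng_per_cm2 / 1000.0

daily_intake_ug = bpa_ug_per_bottle * liters_per_day
safe_daily_ug = epa_ref_dose_ug_per_kg * body_mass_kg

print(f"surface area:  {area_cm2:.0f} cm^2")
print(f"BPA per bottle: {bpa_ug_per_bottle:.0f} ug")
print(f"safety margin:  {safe_daily_ug / daily_intake_ug:.1f}x")
```

The exact cylinder area comes out slightly under the rounded 600 cm^2 figure, so the computed margin lands a bit under 6x, which doesn't change the bottom line: even under absurd consumption assumptions, the exposure is several times below the EPA reference dose.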

Interestingly, the Sigg study shows significant amounts of leaching (19.0 ppb) from "generic aluminum" bottles. This isn't explained, but I suspect that what's going on is that those bottles are lined with a polycarbonate-based plastic (I understand that some aluminum cans are lined that way as well). The Sigg bottles use an undisclosed but apparently non-BPA-based lining and so don't leach BPA, which presumably is why Sigg was eager to have such a study.

 

September 2, 2008

One of the big (and IMHO bogus, but more on this in a second) stories in the Beijing Olympics was the medal count competition between the US and China. As you may have heard, both sides claimed nominal victories, with the US winning the total medal count and the Chinese winning the most golds. The BBC shows a number of other ways of counting (þ Alex Gregory on Crooked Timber).

I'm just nationalistic enough to have some mild preference for American athletes—if they're reasonably competitive I'll root for them, but I'm not going to sit around cheering for the guy who's 1/2 a lap back (and i admit this isn't rational, but put it down to community spirit)—but I find it pretty hard to get worked up about total medal counts because they're so meaningless. For obvious reasons, it's an edge to have a large population, as well as to spend a lot of money subsidizing sports. China apparenlt spends lavishly on their athletes, and surely if the US spent more, its medal counts would go up as well. It's not clear to me why that would cause me to have more or less national pride than I do today. Maybe we could invent some contest that measured inherent national athletic ability (though, again, unclear why that should be more important to me than, say, mean national height), but the Olympics isn't it.