Recently in Software Category

 

June 10, 2010

Alfred Renyi famously said "A mathematician is a device for turning coffee into theorems." (actually Paul Erdos famously said it, but according to Wikipedia it's actually Renyi). I'd long believed (and thought the evidence showed) that caffeine improved concentration and hence productivity. Now Rogers et al. have come along and spoiled everything:
Caffeine, a widely consumed adenosine A1 and A2A receptor antagonist, is valued as a psychostimulant, but it is also anxiogenic. An association between a variant within the ADORA2A gene (rs5751876) and caffeine-induced anxiety has been reported for individuals who habitually consume little caffeine. This study investigated whether this single nucleotide polymorphism (SNP) might also affect habitual caffeine intake, and whether habitual intake might moderate the anxiogenic effect of caffeine. Participants were 162 non-/low (NL) and 217 medium/high (MH) caffeine consumers. In a randomized, double-blind, parallel groups design they rated anxiety, alertness, and headache before and after 100 mg caffeine and again after another 150 mg caffeine given 90 min later, or after placebo on both occasions. Caffeine intake was prohibited for 16 h before the first dose of caffeine/placebo. Results showed greater susceptibility to caffeine-induced anxiety, but not lower habitual caffeine intake (indeed coffee intake was higher), in the rs5751876 TT genotype group, and a reduced anxiety response in MH vs NL participants irrespective of genotype. Apart from the almost completely linked ADORA2A SNP rs3761422, no other of eight ADORA2A and seven ADORA1 SNPs studied were found to be clearly associated with effects of caffeine on anxiety, alertness, or headache. Placebo administration in MH participants decreased alertness and increased headache. Caffeine did not increase alertness in NL participants. With frequent consumption, substantial tolerance develops to the anxiogenic effect of caffeine, even in genetically susceptible individuals, but no net benefit for alertness is gained, as caffeine abstinence reduces alertness and consumption merely returns it to baseline.

Roughly speaking, this paper says that if you don't use caffeine, taking it won't make you more alert. If you do use it, it will make you more alert but only because you're less alert due to caffeine withdrawal and taking it brings you back up to normal.

What's most surprising here is the result that caffeine doesn't improve alertness in non-users. This contradicts previous work which shows an improvement in alertness from caffeine consumption by non-users. The authors propose that one explanation for this might be that people are reporting low/no usage of caffeine when they are actually using it at higher levels (the 40 mg/day cutoff here between low and moderate is actually quite low; coffee contains something like 100 mg/cup). So, when you force withdrawal and then dose with caffeine you get an improvement in alertness. This is partly borne out by their measurements of caffeine levels in "non-users", which are actually modestly high. However, this seems like it would benefit from more study.

However, it appears that once you are already a regular caffeine user, you do get some benefit from caffeine, in that it restores normal function. So, it's not crazy to take it once you're a user. However, it appears that you could get an equivalent benefit from just abstaining entirely and then (maybe) using caffeine when you needed to be alert (assuming you don't believe the non-user result). Of course if you're a user, you'll have to withdraw, which isn't a lot of fun.

One thing I should note is that the instrument this paper uses is a direct measure of (subjective) perceived alertness. The authors also had subjects do a variety of tasks that presumably required alertness. Those results don't appear in this paper, so it could be that they show improvement in non-users: i.e., they don't feel more alert when taking caffeine but they are more effective, which would make consumption worthwhile. I look forward to the publication of that data.

 

April 10, 2010

One of the great things about C++ is that it turns simple typographical errors into an exercise in language hermeneutics. Consider, for example, the following code fragment:
1  #include <boost/shared_ptr.hpp>

3  class Clazz {
4  public:
5    int member_;
6  };


9  void bar(void)
10 {
11   boost::shared_ptr < Clazz > cl(Clazz());

13   cl->member_ = 9;
14 }

This code doesn't compile, however; you get the following error (reformatted a bit for presentation)

/tmp/cpp.cpp: In function 'void bar()':
/tmp/cpp.cpp:13: error: request for member 'member_' in 'cl', which is
of non-class type 'boost::shared_ptr<Clazz> ()(Clazz (*)())'

Now, I've cleaned this code up so that it isolates the error. The original code had the error buried in a sea of boost::variant and boost::bind error messages that took up about half a page. Even so, when I showed this code to a very experienced C++ and Boost programmer, he had the same reaction I (and one of my other colleagues) had when we looked at the original code, namely, WTF. The problem with the code is that I've forgotten the new when constructing the object on line 11, but the compiler accepts that line just fine. However, the compiler chokes on line 13, not 11. Anyway, all three of us had the same reaction: sure, line 11 is broken and the compiler should have complained, but given that it accepted it, how can this possibly screw up the perfectly unobjectionable reference to the member variable member_ in the class Clazz, indirected through a boost::shared_ptr?

At this point, we were seriously considering the possibility that it was a compiler bug, but after about 20 minutes of headscratching and trying different variants of the code, we finally paid attention to the error message that g++ was spitting out, which, when you actually look at it, is kind of clear. Despite appearances, line 11 doesn't define a misinitialized boost::shared_ptr to Clazz. Instead, it declares cl as a function which takes a pointer to a function returning Clazz and returns a boost::shared_ptr to Clazz (I think... I don't have a copy of c++decl handy).

As I said, this error appeared in some of my real code. It's probably not that common a mistake, but I've been working in Python as well as C++ and in Python you don't use new with constructors. In my experience this kind of error is a pretty common consequence of flipping back and forth between languages—my Python code is riddled with spurious (and luckily harmless) semicolons. And of course, C's (and by extension C++'s) inside-out syntax for function pointer declarations turns line 11 into legal, albeit obscure, syntax, letting the error trickle down to line 13, instead of just reporting a syntax error at the point where I actually made the mistake.
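For what it's worth, here is a minimal sketch of the fix and of what the compiler actually saw. It assumes std::shared_ptr from C++11 as a stand-in for boost::shared_ptr (my original code used Boost), and the function names with_new and with_braces are made up for illustration. The static_assert at the end checks at compile time that the accidental form declares a function rather than an object.

```cpp
#include <memory>
#include <type_traits>

class Clazz {
public:
  int member_ = 0;
};

// Assumption: std::shared_ptr (C++11) stands in for boost::shared_ptr.
// What line 11 was meant to say: heap-allocate a Clazz and hand it over.
int with_new() {
  std::shared_ptr<Clazz> cl(new Clazz());
  cl->member_ = 9;
  return cl->member_;
}

// Brace initialization cannot be read as a function declaration.
int with_braces() {
  std::shared_ptr<Clazz> cl{new Clazz()};
  cl->member_ = 9;
  return cl->member_;
}

// The accidental form really is a declaration: it declares a function named
// cl that takes a pointer to a function returning Clazz. No object is created.
std::shared_ptr<Clazz> cl(Clazz());
static_assert(std::is_function<decltype(cl)>::value,
              "cl is a function, not an object");
```

Either of the first two forms gives you an object you can dereference; the third compiles quietly and only bites when you try to use cl as a pointer.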

UPDATE: fixed various things in <> that render funny. Thanks to Hovav Shacham for pointing out the HTML errors. Grr.

 

November 7, 2009

I happened to be leafing through Stroustrup and noticed that you can overload < and >. This motivated me to write the following program:
#include <iostream>
#include <vector>
typedef int UINT4;

using namespace std; 

class Hack 
{
};

Hack & operator< (Hack &a , Hack &b)
{
 std::cerr << "LT operator\n";

 return a;
}

Hack & operator> (Hack &a, Hack &b)
{
 std::cerr << "RT operator\n";

 return a;
}


int main(int argc, char ** argv)
{
 Hack vector;
 Hack UINT4;
 Hack foo;
 
 vector<UINT4> foo;
 
 return(0);
}

Ask yourself what this code does.

The answer is that it outputs:

LT operator
RT operator

If you focus just on the line vector<UINT4> foo; this looks like a relatively ordinary template instantiation of a vector of type UINT4. This is perfectly normal C++ stuff. However, if we expand the scope, it becomes clear that something different is going on: we've defined a new class called Hack, and vector, UINT4, and foo are actually objects of type Hack. We've also overloaded the < and > operators. So, what's actually happening here is that we are doing function chaining: < and > have the same precedence and associate left to right, so we first perform operator < on the pair of objects vector and UINT4. This returns a reference to an object of type Hack (in this case the first argument, but it doesn't matter). We then perform operator > on that result and foo, which is why "LT operator" is printed before "RT operator". And of course since these operators are just function calls, we can do any work we want in them. The examples print stuff to stderr, but that's just an example; you could do anything. And of course this code was written to be moderately transparent while making the point. You could obfuscate it much further with a little effort.
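To make the grouping explicit, here is a small sketch of the same trick with the parentheses written out. Since < and > have equal precedence and associate left to right, the statement groups as (vector < UINT4) > foo. The calls log and the run function are hypothetical helpers I've added so the order can be checked; they aren't part of the program above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: record each operator call instead of printing it.
static std::vector<std::string> calls;

class Hack {};

Hack& operator<(Hack& a, Hack&) { calls.push_back("LT"); return a; }
Hack& operator>(Hack& a, Hack&) { calls.push_back("RT"); return a; }

// Runs the statement from main and returns the operator calls in order.
std::vector<std::string> run() {
  calls.clear();
  Hack vector, UINT4, foo;   // local names hide any template or typedef
  (vector < UINT4) > foo;    // same grouping as: vector<UINT4> foo;
  return calls;
}
```

Calling run() should record the less-than call first and the greater-than call second, matching the output shown above.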

Outstanding!

 

Acknowledgment: Steve Checkoway pointed out to me that whatever crazy type resolution rules C++ follows here make the code work even with the definition of vector and UINT4 at the top of the file. My original version didn't have these and so the alleged vector declaration in main wasn't really valid without the definition of Hack.

UPDATE Oh great. HTML screws up anything with <foo>. Fixed now.

 

September 23, 2009

Nominum is introducing a new "cloud" DNS service called Skye. Part of their pitch for this service is that it's supposedly a lot more secure. Check out this interview with Nominum's John Shalowitz where he compares using their service to putting fluoride in the water:
In the announcement for Nominum's new Skye cloud DNS services, you say Skye 'closes a key weakness in the internet'. What is that weakness?

A: Freeware legacy DNS is the internet's dirty little secret - and it's not even little, it's probably a big secret. Because if you think of all the places outside of where Nominum is today - whether it's the majority of enterprise accounts or some of the smaller ISPs - they all have essentially been running freeware up until now.

Given all the nasty things that have happened this year, freeware is a recipe for problems, and it's just going to get worse.

...

What characterises that open-source, freeware legacy DNS that you think makes it weaker?

Number one is in terms of security controls. If I have a secret way of blocking a hacker from attacking my software, if it's freeware or open source, the hacker can look at the code.

By virtue of something being open source, it has to be open to everybody to look into. I can't keep secrets in there. But if I have a commercial-grade software product, then all of that is closed off, and so things are not visible to the hacker.

By its very nature, something that is freeware or open source [is open]. There are vendors that take a freeware product and make a slight variant of it, but they are never going to be ever able to change every component to lock it down.

Nominum software was written 100 percent from the ground up, and by having software with source code that is not open for everybody to look at, it is inherently more secure.

First, I should say that I don't have any position on the relative security of Nominum's software versus the various open source DNS products. With that said, I'm not really that convinced. The conventional argument goes that it's harder for attackers to find vulnerabilities in closed source software because it's harder to work with the binaries than the source. This is a proposition which I've seen vigorously argued but for which there isn't much evidence. Now, it's certainly true that if nobody can get access to your program at all, then it's much harder to figure out how it works and how to attack it. However, Nominum does sell DNS software, so unless the stuff they're running on Skye is totally different, it's not clear how much of an advantage this is.

Shalowitz also argues that being closed source lets him hide "secret way[s] of blocking a hacker from attacking my software". This seems even less convincing, primarily because it's not really clear that such techniques exist; there's been a huge amount of work on software attack and defense in the public literature, so how likely is it that Nominum has really invented something fundamentally new? And if you did in fact have such a technique, but one that's only secure as long as it's secret, then it's far more vulnerable to reverse engineering than programs ordinarily are, since the attacker just needs to reverse engineer it once and it's insecure forever. By contrast, if they reverse engineer your program to find a vulnerability, you can close that vulnerability and then they need to find a new one.

Again, this isn't to say that Nominum's system is or isn't more secure than other DNS servers (though DJBDNS, for instance, has a very good reputation). I don't have any detailed information one way or the other. However, this particular argument doesn't seem to me to establish anything useful.

 

August 2, 2009

Mrs. Guesswork is flying in from Stockholm today, scheduled to arrive tonight around 10. You can't trust the schedules on transcon flights, so I check things out on the Delta site, which tells me it's an hour late, currently over Colorado and due in at 11:09. No problem, I'll watch Anthony Bourdain for a while and then head over. Around 9:15 I check again and (gulp!) it's now on time. Planes don't fly that fast, but it's not at all out of the question that Delta just screwed up here, so I'll just head over.

Right before I leave for SFO, I check again. The flight's still on time, but then I notice something screwy: the flight is dated August 3rd, not August 2nd. I go back to the main page where you enter the flight #, and here's what it offers me:

  • Yesterday Aug 02
  • Today Aug 03
  • Tomorrow Aug 04

At this point it should be obvious what happened: Delta is based in Georgia, and in Georgia it's tomorrow, so naturally the site decided that's what I was interested in, despite the fact that that flight takes off in something like 19 hours and today's flight is actually in the freaking air. Outstanding!

 

July 9, 2009

Andy Zmolek of Avaya reports on VoIP security research company VoIPshield's new policy requiring vendors to pay for full details of bugs in their products. He quotes from a letter VoIPshield sent him:
"I wanted to inform you that VoIPshield is making significant changes to its Vulnerabilities Disclosure Policy to VoIP products vendors. Effective immediately, we will no longer make voluntary disclosures of vulnerabilities to Avaya or any other vendor. Instead, the results of the vulnerability research performed by VoIPshield Labs, including technical descriptions, exploit code and other elements necessary to recreate and test the vulnerabilities in your lab, is available to be licensed from VoIPshield for use by Avaya on an annual subscription basis.

"It is VoIPshield's intention to continue to disclose all vulnerabilities to the public at a summary level, in a manner similar to what we've done in the past. We will also make more detailed vulnerability information available to enterprise security professionals, and even more detailed information available to security products companies, both for an annual subscription fee."

In comments, Rick Dalmazzi from VoIPshield responded at length. Quoting some of it:

VoIPshield has what I believe to be the most comprehensive database of VoIP application vulnerabilities in existence. It is the result of almost 5 years of dedicated research in this area. To date that vulnerability content has only been available to the industry through our products, VoIPaudit Vulnerability Assessment System and VoIPguard Intrusion Prevention System.

Later this month we plan to make this content available to the entire industry through an on-line subscription service, the working name of which is VoIPshield "V-Portal" Vulnerability Information Database. There will be four levels of access (casual observer; security professional; security products vendor; and VoIP products vendor), each with successively more detailed information about the vulnerabilities. The first level of access (summary vulnerability information, similar to what's on our website presently) will be free. The other levels will be available for an annual subscription fee. Access to each level of content will be to qualified users only, and requests for subscription will be rigorously screened.

So no, Mr. Zmolek, Avaya doesn't "have to" pay us for anything. We do not "require" payment from you. It's Avaya's choice if you want to acquire the results of years of work by VoIPshield. It's a business decision that your company will have to make. VoIPshield has made a business decision to not give away that work for free.

It turns out that the security industry "best practice" of researchers giving away their work to vendors seems to work "best" for the vendors and not so well for the research companies, especially the small ones who are trying to pioneer into new areas.

As a researcher myself—though in a different area—I can certainly understand Dalmazzi's desire to monetize the results of his company's research. One of my friends used to quote Danny DeVito from Heist on this point: "Everybody needs money. That's why they call it money." That said, I think his defense of this policy elides some important points.

First, security issues are different from ordinary research results. Suppose, for instance, that Researcher had discovered a way to significantly improve the performance of Vendor's product. They could tell Vendor and offer to sell it to them. At this point, Vendor's decision matrix would look like this:

Not Buy    Buy
0          V - C

Where V is the value of the performance improvement to them and C is the price they pay to Researcher for the information. Now, if Researcher is willing to charge a low enough price, they have a deal and it's a win-win. Otherwise, Vendor's payoff is zero. In no case is Vendor really worse off.

The situation with security issues is different, however. As I read this message, Researcher will continue to look for issues in Vendor's products regardless of whether Vendor pays them. They'll be disclosing these vulnerabilities in progressively more detail to people who pay them progressively more money. Regardless of what vetting procedure Researcher uses (and "qualified users" really doesn't tell us that much, especially as "security professional" seems like a pretty loose term), the probability that potential attackers will end up in possession of detailed vulnerability information seems pretty high. First, information like this tends to leak out. Second, even a loose description of where a vulnerability is in a piece of software really helps when you go to find it for yourself, so even summary information increases the chance that someone will exploit the vulnerability. We need to expand our payoff matrix as follows:

                Not Buy    Buy
Not Disclose    0          V - C
Disclose        -D         ?

The first line of the table, corresponding to a scenario in which Researcher doesn't disclose the vulnerability to anyone besides Vendor, looks the same as the previous payoff matrix: Vendor can decide whether or not to buy the information depending on whether it's worth it to them or not to fix the issue [and it's quite plausible that it's not worth it to them, as I'll discuss in a minute.] However, the bottom line on the table looks quite different: if Researcher discloses the issue, then this increases the chance that someone else will develop an exploit and attack Vendor's customers, thus costing Vendor D. This is true regardless of whether or not Vendor chooses to pay Researcher for more information on the issue. If Vendor chooses to pay Researcher, they get an opportunity to mitigate this damage to some extent by rolling out a fix, but their customers are still likely suffering some increased risk due to the disclosure. I've marked the lower right (Buy/Disclose) cell with a ? because the costs here are a bit hard to calculate. It's natural to think it's V - C - D but it's not clear that that's true, since presumably knowing the details of the vulnerability is of more value if you know it's going to be released—though by less than D, since you'd be better off if you knew the details but nobody else did. In any case, from Vendor's perspective the top row of the matrix dominates the bottom row.

The point of all this is that the situation with vulnerabilities is more complicated: Researcher is unilaterally imposing a cost on Vendor by choosing to disclose vulnerabilities in their system and they're leaving it up to Vendor whether they would like to minimize that cost by paying Researcher some money for details on the vulnerability. So it's rather less of a great opportunity to be allowed to pay for vulnerability details than it is to be offered a cool new optimization.

The second point I wanted to make is that Dalmazzi's suggestion that VoIPshield is just doing Avaya's QA for them and that they should have found this stuff through their own QA processes doesn't really seem right:

Final note to Mr. Zmolek. From my discussions with enterprise VoIP users, including your customers, what they want is bug-free products from their vendors. So now VoIP vendors have a choice: they can invest in their own QA group, or they can outsource that function to us. Because in the end, a security vulnerability is just an application bug that should have been caught prior to product release. If my small company can do it, surely a large, important company like Avaya can do it.

All software has bugs and there's no evidence that it's practical to purge your software of security vulnerabilities by any plausible QA program, whether that program consists of testing, code audits, or whatever. This isn't to express an opinion on the quality of Avaya's code, which I haven't seen; I'm just talking about what seems possible given the state of the art. With that in mind, we should expect that with enough effort researchers will be able to find vulnerabilities in any vendor's code base. Sure, the vendor could find some vulnerabilities too, but the question is whether they can find enough bugs that researchers can't find any. There's no evidence that that's the case.

Finally, I should note that from the perspective of general social welfare, disclosing vulnerabilities to a bunch of people who aren't the vendor, but not to the vendor itself, seems fairly suboptimal. The consequence is that there's a substantial risk of attack which the vendor can't mitigate. Of course, this isn't the researcher's preferred option (they would rather collect money from the vendor as well), but if they have to do it occasionally in order to maintain a credible negotiating position, that has some fairly high negative externalities. Obviously, this argument doesn't apply to researchers who always give the vendor full information. There's an active debate about the socially optimal terms of disclosure, but I think it's reasonably clear that a situation where vulnerabilities are frequently disclosed to a large group of people but not to the vendors isn't really optimal.

Acknowledgement: Thanks to Hovav Shacham for his comments on this post.

 

May 13, 2009

USENIX conferences use this tool called HotCRP to manage the conference review process. Like other systems, you rate papers on a numeric scale (1-5). When you ask for a summary of the papers, the system displays a cute little graphic of how many people have chosen each rating (and even a cute little mouseover that displays the mean and SD), but once you have more than a few papers to look at, it's a bit inconvenient to get a sense of the distribution. Maybe the PC chairs have a tool, but PC members don't. Luckily, it's easy to extract it from the HTML source. Here's a Perl script that will suck out the scores and compute the mean:
#!/usr/bin/perl
while(<>){
    next unless /GenChart\?v=([\d,]+)/;
    
    @scores=split(/,/,$1);
    
    $sum = 0;
    $ct = 0;

    for($i=0; $i<=$#scores; $i++){
	$sum += ($i + 1) * $scores[$i];
	$ct += $scores[$i];
    }

    $mean = $sum / $ct;
    
    print "$mean\n";
}

You just do "save as" (see note 1) and shove the file into the script on stdin. Extracting standard deviations and the name of the paper are left as exercises for the reader.

1. Note: you need to save as source not a complete Web page/directory. Otherwise the browser helpfully saves the images and rewrites the links to point to your local disk, which breaks everything. Took me a while to figure out what the heck was going on with that one.
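For the standard deviation exercise, here is one possible sketch in C++ rather than Perl. It assumes the same input the script's regex captures: a comma-separated list where the i-th entry is the number of reviewers who gave score i + 1. The names parse_counts, score_stats, and Stats are made up, and it computes the population standard deviation (dividing by n), which may or may not be what HotCRP's mouseover shows.

```cpp
#include <algorithm>
#include <cmath>
#include <sstream>
#include <string>
#include <vector>

// Parse a comma-separated count list such as "1,0,2,3" (the part captured
// after "GenChart?v="). counts[i] is how many reviewers gave score i + 1.
std::vector<int> parse_counts(const std::string& csv) {
  std::vector<int> counts;
  std::stringstream in(csv);
  std::string field;
  while (std::getline(in, field, ',')) {
    if (!field.empty()) counts.push_back(std::stoi(field));
  }
  return counts;
}

struct Stats { double mean; double sd; };

// Mean and population standard deviation of the scores.
// Assumption: population form (divide by n), not the sample form.
Stats score_stats(const std::vector<int>& counts) {
  double n = 0, sum = 0, sum_sq = 0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    double score = static_cast<double>(i + 1);
    n += counts[i];
    sum += score * counts[i];
    sum_sq += score * score * counts[i];
  }
  if (n == 0) return {0.0, 0.0};  // no reviews yet: avoid dividing by zero
  double mean = sum / n;
  // Clamp at zero so rounding error can never feed sqrt a negative value.
  double variance = std::max(0.0, sum_sq / n - mean * mean);
  return {mean, std::sqrt(variance)};
}
```

Unlike the Perl script, this version returns zeros instead of dividing by zero when a paper has no scores yet.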

 

May 12, 2009

Ed Felten posts about the Minnesota Breathalyzer case (I've written about it here):
The problem is illustrated nicely by a contradiction in the arguments that CMI and the state are making. On the one hand, they argue that the machine's source code contains valuable trade secrets -- I'll call them the "secret sauce" -- and that CMI's business would be substantially harmed if its competitors learned about the secret sauce. On the other hand, they argue that there is no need to examine the source code because it operates straightforwardly, just reading values from some sensors and doing simple calculations to derive a blood alcohol estimate.

It's hard to see how both arguments can be correct. If the software contains secret sauce, then by definition it has aspects that are neither obvious nor straightforward, and those aspects are important for the software's operation. In other words, the secret sauce -- whatever it is -- must relevant to the defendants' claims.

I'm not sure this argument is right in the general case. Ignoring the specific case of breathalyzers, if I want to develop a new piece of software, it's pretty helpful to have a worked example to rip off. To take a simple case, if I wanted to build a new NAT (a pretty well-understood technology) I'd rather start with some existing package than build everything myself. It's not that there is anything secret in one of these gizmos, just that it would give you something to imitate/test against, etc. This would be especially true if I could actually copy the source, not just mimic it. Conversely, if I were the vendor of an existing system, I wouldn't necessarily want to assist my competitors.

Three further observations. First, I expect it's a lot less of an advantage to have the source code for a device like a breathalyzer or a voting machine. For one thing, it's not a generic PC wired to a bunch of network ports: there's a bunch of sensors and stuff that can't be sourced from your average OEM network gear manufacturing plant (this is more true for breathalyzers than voting machines). For another, a lot of the business of selling something like this is engaging with law enforcement, voting officials, etc. There's more to it than just getting your boxes on the shelf at Fry's. Consequently, it's probably not as much of a competitive advantage to save on engineering costs as it might be in some other business.

Second, if every breathalyzer vendor is required to disclose their source code, it makes it a fair bit harder for your competitors to just steal your source code, since, at least potentially, you can see their source code and have an opportunity to demonstrate that it's a copy of yours. Of course, this doesn't rule out less blatant copying, using the original system as a template/regression test system, etc.

Third, we're kind of stretching the definition of "trade secret" here, at some abstract level. As Ed observes, if the system is straightforward, what's the secret? On the other hand, it's fairly consistent with the relatively expansive tech industry definition of trade secret.

 

May 3, 2009

The Minnesota Supreme Court has ruled that defendants in DUI cases can get discovery of breathalyzer source code. (Ruling here). Apparently this puts a pretty serious crimp in Minnesota DUI proceedings because the manufacturer won't provide the source code:
The state's highest court ruled that defendants in drunken-driving cases have the right to make prosecutors turn over the computer "source code" that runs the Intoxilyzer breath-testing device to determine whether the device's results are reliable.

But there's a problem: Prosecutors can't turn over the code because they don't have it.

The Kentucky company that makes the Intoxilyzer says the code is a trade secret and has refused to release it, thus complicating DWI prosecutions.

"There's going to be significant difficulty to prosecutors across the state to getting convictions when we can't utilize evidence to show the levels of the defendant's intoxication," said Dakota County Attorney James Backstrom.

"In the short term, it's going to cause significant problems with holding offenders accountable because of this problem of not being able to obtain this source code."

I can't find the original filings, which include an affidavit from David Wagner, so I'm not sure I'm seeing the best argument for this position. That said, however, I'm not sure that source code analysis is really the best way to determine whether breathalyzers are accurate.

At a high level, a breathalyzer is a sensor (apparently either an IR spectrometer or some sort of electrochemical fuel cell gizmo) attached to a microprocessor and a display. The microprocessor reads the output of the sensor, does some processing, and emits a reading. Obviously, there are a lot of things that can go wrong here, and this page describes a bunch of problems in the source code of another machine, mostly that there seems to be a bunch of ad hoccery in the way the measurements are handled. For instance:

3. Results Limited to Small, Discrete Values: The A/D converters measuring the IR readings and the fuel cell readings can produce values between 0 and 4095. However, the software divides the final average(s) by 256, meaning the final result can only have 16 values to represent the five-volt range (or less), or, represent the range of alcohol readings possible. This is a loss of precision in the data; of a possible twelve bits of information, only four bits are used. Further, because of an attribute in the IR calculations, the result value is further divided in half. This means that only 8 values are possible for the IR detection, and this is compared against the 16 values of the fuel cell.

So, maybe this is bad and maybe it isn't. But it's not clear that you can determine the answer by examining the source code. Rather, you want to ask what the probability is that a system constructed this way would produce an inaccurate reading. If, for instance, the A/D converters have an inherent error rate/variance that's large compared to the sensitivity that they read out in, then it's not crazy to divide down to some smaller number of significant digits—though I might be tempted to do it later in the process. More to the point, any piece of software you look at closely is going to be chock full of errors of various kinds, but it's pretty hard to tell whether they are going to actually impact performance without some careful analysis.
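To put numbers on the quoted finding, here is a small sketch of the arithmetic. It assumes plain integer division, which is how I read "divides the final average(s) by 256"; the names are made up and this is not the device's actual code.

```cpp
// Assumption: integer division, per the quoted finding; names are hypothetical.
constexpr int kAdcMax = 4095;   // largest 12-bit A/D reading (range 0..4095)
constexpr int kDivisor = 256;

// Fuel cell path: the averaged reading is divided by 256.
constexpr int quantize(int adc_average) { return adc_average / kDivisor; }

// IR path: per the quote, the result is further divided in half.
constexpr int quantize_ir(int adc_average) { return quantize(adc_average) / 2; }

// Number of distinct values each path can report.
constexpr int levels() { return quantize(kAdcMax) + 1; }
constexpr int ir_levels() { return quantize_ir(kAdcMax) + 1; }

// Readings in the same 256-wide bucket are indistinguishable afterwards.
constexpr bool same_bucket(int a, int b) { return quantize(a) == quantize(b); }
```

Under these assumptions readings of 256 and 511 come out identical while 255 and 256 differ, so whether the lost bits matter depends on how noisy the converter is to begin with.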

On the flip side, actually reading the source code is a pretty bad way of finding errors. First, it's not very efficient in terms of finding bugs. I've written and reviewed a lot of source code and it's just really hard to get any but the most egregious bugs out with that kind of technique. Second, even if we find things that could have gone wrong (missed interrupts, etc.) it's very hard to determine whether they caused problems in any particular case. [Note that you could improve your ability to recover from some kinds of computational error by logging the raw data as well as whatever readings the system produces.] Third, there are a lot of non-software things that can go wrong. In particular, you need to establish that what the sensors are reading actually corresponds to the alcohol level in the breath, that that actually corresponds to blood alcohol level, that the sensors are reading accurately, etc.

Stepping up a level, it's not clear what our policy should be about how to treat evidence from software-based systems; all software contains bugs of one kind or another (and we haven't even gotten to security vulnerabilities yet). If that's going to mean that all software-based systems are useless for evidentiary purposes, the world is going to get odd pretty fast.

 

April 18, 2009

It's conference submission time (EVT/WOTE 2009) and along with conference submission time comes its friend, fighting with LaTeX time. The big problems I usually have are avoiding bad breaks and convincing LaTeX's broken float algorithm to put my figures (I like figures) where I want them instead of three pages later. Anyway, I recently ran into a problem (on a friend's paper, not my own) with a long author list. What we wanted was to have an author list with a separate affiliation list and then a footnoted contact address, like so (click to see a PDF):

LaTeX's built-in \author mode is pretty lame, but Authblk lets you use "author block" mode, with separate author names and affiliations and footnote-style superscripted numbers to connect the two. The code you want is:

\author[1,2]{Charles Kinbote}
\author[1]{John Shade}
\author[1]{Charles Xavier Vseslav}
\author[3]{Humbert Humbert}
\author[4]{Clare Quilty}


\affil[1]{Kingdom of Zembla}
\affil[2]{Wordsmith College}
\affil[3]{Independent}
\affil[4]{Beardsley Women's College}

But this is only a partial solution because it doesn't give you the footnote with the author's address. If you're willing to have the author's address attached to the affiliation block, you can just do a separate affiliation that contains the email address of the author:

\author[1,2,*]{Charles Kinbote}
\author[1]{John Shade}
\author[1]{Charles Xavier Vseslav}
\author[3]{Humbert Humbert}
\author[4]{Clare Quilty}


\affil[1]{Kingdom of Zembla}
\affil[2]{Wordsmith College}
\affil[3]{Independent}
\affil[4]{Beardsley Women's College}
\affil[*]{To whom correspondence should be addressed. Email: \url{kinbote@example.com}}

This does work, but it looks pretty terrible. You can instead attach the address to the author's name as a footnote, but this isn't quite what you want either, for two reasons. First, the asterisk shows up after the name, before the superscripted affiliation numbers, when you really want it afterwards. Second, it's on the baseline of the affiliation numbers, when you really want it aligned with the top of the numbers.

What you need is a combination strategy: you use the fake affiliation with an asterisk, but don't provide a \affil block. This just creates a bare asterisk superscript, but no footnote. To create the footnote, you need to use \footnotetext. Unfortunately, if you just use \footnotetext, you end up with a numeric marker attached to the footnote text at the bottom of the page. What you want is an asterisk. To get this to work, you need to override the footnote style with \renewcommand{\thefootnote}{\fnsymbol{footnote}}, and then reset it so that you get numeric footnotes elsewhere:

\let\oldthefootnote\thefootnote
\renewcommand{\thefootnote}{\fnsymbol{footnote}}
\footnotetext[1]{To whom correspondence should be addressed. Email: \url{kinbote@example.com}}
\let\thefootnote\oldthefootnote

Putting it all together:


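\documentclass{article}
\usepackage{authblk}  % author/affiliation blocks
\usepackage{url}      % provides \url for the email address

% Placeholder title (not from the original paper); \maketitle needs one.
\title{Paper Title}

\author[1,2,*]{Charles Kinbote}
\author[1]{John Shade}
\author[1]{Charles Xavier Vseslav}
\author[3]{Humbert Humbert}
\author[4]{Clare Quilty}


\affil[1]{Kingdom of Zembla}
\affil[2]{Wordsmith College}
\affil[3]{Independent}
\affil[4]{Beardsley Women's College}

\pagestyle{empty}

\begin{document}
\maketitle
\thispagestyle{empty}

% Temporarily switch footnote marks to symbols so mark 1 prints as an asterisk.
\let\oldthefootnote\thefootnote
\renewcommand{\thefootnote}{\fnsymbol{footnote}}
\footnotetext[1]{To whom correspondence should be addressed. Email: \url{kinbote@example.com}}
\let\thefootnote\oldthefootnote

% Body text goes here.
\end{document}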

Have fun.

Acknowledgement: Body text from the Lorem Ipsum Generator.