Note that I'm not trying to make any claims about what the best set of venues is. It's obviously easy to figure out any statistic we want about each proposed venue, but how you map that data to "best" is a much more difficult problem. The space is full of Pareto optima, and even if we ignore the troubling philosophical question of interpersonal utility comparisons, there's some tradeoff between minimal total travel time and a "fair" distribution of travel times (or at least an even distribution).
METHODOLOGY
The data below is derived by treating both people and venues as
airport locations and using travel time as our primary instrument.
- For each responder to the current Doodle poll, assign a home airport based on their draft publication history. A few people are missing, but the list should be close to complete. Because these people responded before the venue was known, the sample is at least somewhat unbiased.
- For each home airport and each venue, take the shortest flight advertised on Kayak around one of the proposed interim dates (6/10 - 6/13), ignoring price but excluding "Hacker fares". [Thanks to Martin Thomson for helping me gather these.]
This lets us compute statistics for any venue and/or combination of venues, based on the candidate attendee list.
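
As a rough illustration of the per-venue computation described above, here is a minimal Python sketch. It is not the attached "pairings" software. The attendee names, the duration values, and the function names are all made up for the example, and it assumes that the round trip is twice the one-way duration and that SD means population standard deviation; neither assumption is confirmed by the data files.

```python
from statistics import mean, median, pstdev

# Hypothetical one-way durations in hours, keyed by (home airport, venue airport).
# These values are invented for illustration; they are not the attached data.
ONE_WAY_HOURS = {
    ("SFO", "SFO"): 0.0,
    ("SFO", "BOS"): 5.5,
    ("LHR", "SFO"): 11.0,
    ("LHR", "BOS"): 7.5,
    ("BOS", "SFO"): 6.5,
    ("BOS", "BOS"): 0.0,
}

# Hypothetical attendee -> home airport assignment.
HOME_AIRPORT = {"alice": "SFO", "bob": "LHR", "carol": "BOS"}


def round_trip_hours(home, venue):
    """Round-trip time, assuming the return leg takes as long as the outbound leg."""
    return 2 * ONE_WAY_HOURS[(home, venue)]


def venue_stats(venue):
    """Return (mean, median, population SD) of round-trip hours over all attendees."""
    times = [round_trip_hours(home, venue) for home in HOME_AIRPORT.values()]
    return mean(times), median(times), pstdev(times)


if __name__ == "__main__":
    for venue in ("SFO", "BOS"):
        m, med, sd = venue_stats(venue)
        print(f"{venue}: mean={m:.1f} median={med:.1f} sd={sd:.1f}")
```

Swapping pstdev for stdev would give the sample standard deviation instead; with an attendee list this small the two can differ noticeably.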
The three proposed venues:
- San Francisco (SFO)
- Boston (BOS)
- Stockholm (ARN)
Three hubs not too distant from the proposed venues:
- London (LHR)
- Frankfurt (FRA)
- New York (NYC) (treating all NYC airports as the same location)
RESULTS
Here are the results for each of the above venues, measured in total
hours of travel (i.e., round trip).
Venue          Mean    Median  SD
----------------------------------------------
SFO            13.5    11      12.2
BOS            12.3    11       7.5
ARN            17.0    21      10.7
FRA            14.8    17       7.3
LHR            13.3    14       7.5
NYC            11.5    11       5.8
YYC            14.9    13      10.2
SFO/BOS/ARN    14.3    13       3.6
SFO/NYC/LHR    12.7    11.3     3.7

XXX/YYY/ZZZ is a three-way rotation of XXX, YYY, and ZZZ. Obviously, mean and median are intended to be some sort of aggregate measure of travel time. I don't have any way to measure "fairness", but SD is intended as some metric of the variation in travel time between attendees.
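
One plausible way to compute the rotation rows is to average each attendee's round-trip time over the three venues and then take the statistics of those per-attendee averages. That is an assumption, not something the post confirms, although the reported rotation means are close to the average of the corresponding single-venue means. The sketch below uses made-up attendees and times, a hypothetical function name, and population standard deviation.

```python
from statistics import mean, median, pstdev

# Hypothetical round-trip hours per attendee and venue (invented values).
ROUND_TRIP_HOURS = {
    "alice": {"SFO": 0.0, "NYC": 12.0, "LHR": 22.0},
    "bob": {"SFO": 22.0, "NYC": 15.0, "LHR": 0.0},
    "carol": {"SFO": 12.0, "NYC": 0.0, "LHR": 15.0},
}


def rotation_stats(venues):
    """Return (mean, median, population SD) of per-attendee average travel time.

    Assumes each venue in the rotation is visited equally often, so an
    attendee's cost is the plain average of their times to those venues.
    """
    per_attendee = [
        mean(times[v] for v in venues) for times in ROUND_TRIP_HOURS.values()
    ]
    return mean(per_attendee), median(per_attendee), pstdev(per_attendee)


if __name__ == "__main__":
    for venues in (["SFO"], ["SFO", "NYC", "LHR"]):
        m, med, sd = rotation_stats(venues)
        print(f"{'/'.join(venues)}: mean={m:.1f} median={med:.1f} sd={sd:.1f}")
```

In this toy data the rotation has a much smaller SD than the single venue, because each attendee's long trips are averaged with short ones; that is the same flattening effect visible in the SD column above.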
The raw data and software are attached. The files are:
home-airports: the list of people's home airports
durations.txt: the list of airport-airport durations
doodle.txt: the attendees list
pairings: the software to compute travel times
doodle-out.txt: the computed travel times for each attendee
This was a quick hack, so there may be errors here, but nobody has pointed out any yet.
OBSERVATIONS
Obviously, it's hard to know what the optimal solution is without
some model for optimality, but we can still make some observations
based on this data:
Obviously, your mileage may vary based on your location and feelings about what's fair, but based on this data, it looks to me like a three-way rotation between West Coast, East Coast, and European hubs offers a good compromise between minimum cost and a flat distribution of travel times.