Extracting scores from HotCRP

USENIX conferences use a tool called HotCRP to manage the conference review process. As in other systems, you rate papers on a numeric scale (1-5). When you ask for a summary of the papers, the system displays a cute little graphic of how many people have chosen each rating (and even a cute little mouseover that displays the mean and SD), but once you have more than a few papers to look at, it's a bit inconvenient to get a sense of the distribution. Maybe the PC chairs have a tool, but PC members don't. Luckily, it's easy to extract the scores from the HTML source. Here's a Perl script that will suck out the scores and compute the mean:
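#!/usr/bin/perl
# Reads saved HotCRP page source on stdin (or from files named on the
# command line) and prints the mean score for each line with a score chart.
while(<>){
    # The chart URL carries the per-rating vote counts as a comma-separated list
    next unless /GenChart\?v=([\d,]+)/;

    @scores=split(/,/,$1);

    $sum = 0;
    $ct = 0;

    # $scores[$i] is the number of reviewers who chose rating $i+1
    for($i=0; $i<=$#scores; $i++){
        $sum += ($i + 1) * $scores[$i];
        $ct += $scores[$i];
    }

    next unless $ct;    # no scores yet; avoid dividing by zero

    $mean = $sum / $ct;

    print "$mean\n";
}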

Just do a Save As [1] and shove the file into the script on stdin. Extracting the standard deviations and the names of the papers is left as an exercise for the reader.
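
For what it's worth, here is a rough sketch of how the standard deviation part of that exercise might go. It assumes the same layout the script above relies on: the list after GenChart?v= holds the number of votes for rating 1, then rating 2, and so on. The sub name score_stats and the sample string are made up for illustration, and this computes the population SD, which may or may not be what the HotCRP mouseover reports.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# score_stats is a hypothetical helper, not part of HotCRP.
# Input: per-rating vote counts (index 0 = rating 1).
# Output: (mean, population SD), or an empty list if there are no votes.
sub score_stats {
    my @counts = @_;
    my ($n, $sum, $sumsq) = (0, 0, 0);
    for my $i (0 .. $#counts) {
        my $rating = $i + 1;
        $n     += $counts[$i];
        $sum   += $rating * $counts[$i];
        $sumsq += $rating * $rating * $counts[$i];
    }
    return unless $n;                        # no votes, nothing to compute
    my $mean = $sum / $n;
    my $var  = $sumsq / $n - $mean * $mean;  # E[x^2] - (E[x])^2
    $var = 0 if $var < 0;                    # guard against rounding error
    return ($mean, sqrt($var));
}

# Made-up sample fragment: one vote for 2, three for 3, two for 4.
my $sample = 'GenChart?v=0,1,3,2,0';
if ($sample =~ /GenChart\?v=([\d,]+)/) {
    my ($mean, $sd) = score_stats(split /,/, $1);
    printf "%.2f %.2f\n", $mean, $sd if defined $mean;   # prints "3.17 0.69"
}
```

To use it on a real saved page, call score_stats(split /,/, $1) inside the while loop of the script above in place of the hand-rolled mean. Pulling out the paper names depends on the surrounding HTML, so that part stays an exercise.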

1. Note: you need to save the page as source, not as a complete Web page/directory. Otherwise the browser helpfully saves the images and rewrites the links to point to your local disk, which breaks everything. Took me a while to figure out what the heck was going on with that one.
