Over the past several years, a number of large computer security vendors and not-for-profit organizations have developed, promoted, and implemented procedures to rank information system vulnerabilities. Unfortunately, there has been no cohesion or interoperability among these systems. Also, existing systems tend to be limited in scope as to what they cover. Finally, all of these systems tend to be Internet-centric; that is, they tend to be concerned only with vulnerabilities affecting computers connected to the worldwide Internet. The NIAC commissioned this project to propose an open and universal vulnerability scoring system to address and solve these shortcomings, with the ultimate goal of promoting a common understanding of vulnerabilities and their impact.
To get the CVSS score for a given vulnerability, you give it individual scores along a number of axes, e.g.:
- Access vector (local, remote, ...)
- Access complexity (high, low)
- Integrity impact (none, partial, complete).
CVSS then specifies an algorithm to aggregate all of these individual scores into a single linear score, which presumably gives you some impression of the severity of the vulnerability.
I certainly agree that it's useful to have a common nomenclature and system for describing the characteristics of any individual vulnerability, but I'm fairly skeptical of the value of the CVSS aggregation formula. In general, it's pretty straightforward to determine linear values for each individual axis, and all other things being equal, if you have a vulnerability A which is worse on axis X than vulnerability B, then A is worse than B. However, this only gives you a partial ordering of vulnerability severity. In order to get a complete ordering, you need some kind of model for overall severity. Building this kind of model requires some pretty serious econometrics.
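To illustrate the partial ordering point, here's a minimal sketch. The axis names follow the list above, but the integer ranks (higher means worse) and the two example vulnerabilities are my own illustrative assumptions, not values from the CVSS paper.

```python
from typing import Dict

# A vulnerability is a mapping from axis name to an ordinal rank.
# Higher rank = worse. The ranks are hypothetical, not CVSS values.
Vuln = Dict[str, int]


def dominates(a: Vuln, b: Vuln) -> bool:
    """True if a is at least as bad as b on every axis and strictly worse on one."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def compare(a: Vuln, b: Vuln) -> str:
    """Compare two vulnerabilities using only per-axis information."""
    if a == b:
        return "equal"
    if dominates(a, b):
        return "first is worse"
    if dominates(b, a):
        return "second is worse"
    return "incomparable"


# Hypothetical examples: remote access with partial integrity impact
# versus local access with complete integrity impact.
remote_partial = {"access_vector": 1, "access_complexity": 1, "integrity_impact": 1}
local_complete = {"access_vector": 0, "access_complexity": 1, "integrity_impact": 2}
local_partial = {"access_vector": 0, "access_complexity": 1, "integrity_impact": 1}

print(compare(remote_partial, local_partial))   # remote is worse on one axis, equal elsewhere
print(compare(remote_partial, local_complete))  # each is worse on a different axis
```

The second comparison comes out "incomparable": per-axis rankings alone can't say whether remote-but-partial beats local-but-complete. Settling that requires a severity model of some kind.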
CVSS does have a formula which gives you a complete ordering but the paper doesn't contain any real explanation for where that formula comes from. The weighting factors are pretty obviously anchor points (.25, .333, .5) so I'm guessing they were chosen by hand rather than by some kind of regression model. It's not clear, at least to me, why one would want this particular formula and weighting factors rather than some other ad hoc aggregation function or just someone's subjective assessment.
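To see why the choice of weights matters, consider this sketch. It is not the CVSS formula: it uses a plain weighted sum, and both weight sets and both example vulnerabilities are hypothetical values I picked by hand for illustration.

```python
from typing import Dict

# Hypothetical ordinal ranks per axis (higher = worse); not CVSS values.
remote_partial = {"access_vector": 1, "access_complexity": 1, "integrity_impact": 1}
local_complete = {"access_vector": 0, "access_complexity": 1, "integrity_impact": 2}

# Two equally "reasonable-looking" hand-picked weightings (assumptions).
weights_access_heavy = {"access_vector": 0.5, "access_complexity": 0.25, "integrity_impact": 0.25}
weights_impact_heavy = {"access_vector": 0.25, "access_complexity": 0.25, "integrity_impact": 0.5}


def weighted_score(vuln: Dict[str, int], weights: Dict[str, float]) -> float:
    """Collapse per-axis ranks into one number with a weighted sum."""
    return sum(weights[axis] * vuln[axis] for axis in weights)


def worse_of(a_name: str, a: Dict[str, int], b_name: str, b: Dict[str, int],
             weights: Dict[str, float]) -> str:
    """Name of the vulnerability with the higher aggregate score, or 'tie'."""
    sa, sb = weighted_score(a, weights), weighted_score(b, weights)
    if sa == sb:
        return "tie"
    return a_name if sa > sb else b_name


for label, w in [("access-heavy", weights_access_heavy),
                 ("impact-heavy", weights_impact_heavy)]:
    verdict = worse_of("remote_partial", remote_partial,
                       "local_complete", local_complete, w)
    print(f"{label}: {verdict} ranks as more severe")
```

The two weightings disagree about which vulnerability is more severe. Without data or a model to justify one set of weights over the other, the resulting complete ordering reflects the weights as much as the vulnerabilities.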

I think the situation is actually worse than the ad hoc aggregation function. Without an econometric analysis, it's not even clear that their scores along individual axes offer any predictive power for the expected disutility of potential compromises.
What they really need is an economic model of the monetary losses from different types of compromises, a predictive model of the conditional probability of each type of compromise given a bundle of vulnerabilities, and data from companies on incidences of compromises and damages according to the model. Each of these would involve a lot of work.
Eric and Kevin: Obviously, a more nuanced model is needed. However, when data on the dependent variable aren't available, it is a bit tough to do much empirical work.
As far as I know (which might not be very far...) only Campbell, Gordon, Loeb, and Zhou have worked on this, and that was a univariate model (implicitly bivariate, I guess one could say). Due to data limitations, it is simply not possible at this time to specify (and test!) a model of the complexity Kevin Dick outlines. Perhaps when mandatory reporting allows some of these data gaps to be filled we will have better models. Perhaps not. In any case, until a theoretical basis for a purported explanation can be proffered ex ante, it'd be easy to argue that models which "account for" a large amount of the variability in the dependent variable were the result of data mining. I'm not sure we want to go there.
Not sure of your point, Chris. Mine was that we should be expending our efforts trying to get the data rather than throwing up our hands and just making something up. I don't even think it's clear that CVSS is better than nothing.
As for your data mining comment, that's a pretty unsophisticated view. There are a bunch of well-defined statistical learning techniques for pruning a large set of candidate predictors into a regression model. There's no need for a well-established theoretical basis, only a plausible rationale. As long as the resulting models have predictive value for subsequent occurrences of the response variable, the accusation of data mining has little practical force.
Kevin: I wholeheartedly agree about attempting to get the data. I don't think the information will become available, however, without legislation.
"Well-established theoretical basis" vs. "plausible rationale" boils down to a question of the philosophy of explanation in the social sciences.
Don't know if legislation is required. Econometricians often come up with clever ways to measure things. Seems like hiring one as a consultant would be a good use of this group's resources.
As for philosophy, if I can make predictions with high fidelity, the philosophical questions are moot.