Code analysis and safe languages

Amit Yoran says that we need to have better tools for finding bugs in code:
About 95% of software bugs come from 19 "common, well-understood" programming mistakes, Yoran said, and his division pushed for automation tools that comb software code for those mistakes.

"Today's developers ... often times don't have the academic discipline of software engineering and software development and training around what characteristics would create flaws in the program or lead to bugs," Yoran said.

Government research into some such tools is in its infancy, however, he added. "This cycle will take years if not decades to complete," he said. "We're realistically a decade or longer away from the fruits of these efforts in software assurance."

There are already a number of such tools available, including MOPS, Splint, and SWAT. These tools aren't perfect and it certainly would be nice to have better tooling, but it's worth noting that a lot of the bugs they find are the kind of thing that could be entirely eliminated if people would just program in safer languages. For instance, the buffer overflow vulnerabilities which have been so troublesome to eliminate in C/C++ code are basically a non-problem with Java.

8 Comments

And, by forcing code to run in a virtual machine, Java ensures that any such bugs are less frequent by simple virtue of the whole system running slower.

More seriously: there are applications for which Java is a perfectly acceptable tool, such as client applications. For large, scalable servers, it is a rather inappropriate choice. I've been through two organizations that learned this lesson the hard way; in both cases, the only fix was a massive and expensive rewrite of gobs of server code.

Finally, I would argue that you could eliminate most C and C++ buffer overflow errors by ripping the inherently unsafe unbounded functions out of libc -- functions like sprintf, strcpy, strcat, etc. Sure, it would be a real pain in the ass to go through existing code and replace them with bounded variants (like snprintf, strncpy, strncat) -- largely because you'd have to figure out how large your destination buffer is to do so. But that's kind of the point in the first place.
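To make the suggestion above concrete, here is a minimal sketch of what bounded replacements might look like. The helper names copy_bounded and append_bounded are hypothetical (they are not part of any standard library); the sketch assumes only the standard snprintf and memchr. It builds on snprintf rather than strncpy because strncpy does not NUL-terminate the destination when the source is too long, so it is not a drop-in "safe" variant on its own.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for dst_size bytes.
 * The result is always NUL-terminated when dst_size > 0.
 * Returns 0 on success, -1 on bad arguments or if src did not fit. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* snprintf writes at most dst_size bytes (including the NUL) and
     * returns the length the full output would have had. */
    int n = snprintf(dst, dst_size, "%s", src);
    if (n < 0 || (size_t)n >= dst_size)
        return -1; /* output error or truncation */
    return 0;
}

/* Hypothetical helper: append src to the string already in dst.
 * Returns 0 on success, -1 if dst is not terminated within dst_size
 * bytes or if the appended text did not fit. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    /* Locate the existing terminator without reading past the buffer. */
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return -1;

    size_t used = (size_t)(end - dst);
    return copy_bounded(dst + used, dst_size - used, src);
}
```

A caller would pass sizeof buf for a local array, which is exactly the "figure out how large your destination buffer is" step, and would have to decide what to do when either helper reports truncation.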

Eric's point is exactly correct: such tools are motivated by crappy programming languages (Java's non-applicability to such-and-such notwithstanding). I don't know exactly which 19 error types Yoran refers to, but similar lists I've seen are composed primarily of issues which are eliminated in languages with a safe memory model. Many of the remaining issues could be swept under the rug of "better type checking" except that most programming languages are so inexpressive (e.g., lacking support for covariant parameters) that they require loopholes (i.e., casts or supertype genericity) that undermine type checking.

On the other hand, if a certain computer security researcher's study is anything to go by, then eliminating even the 95 percent of bugs that fall into the most common error types may not actually end up succeeding in making software less vulnerable to attack.

You forgot to include a URL in that hyperlink, Dan. I'd like to see that study.

I suspect he's referring to my "Is finding security holes a good idea?" That said, I'm not entirely sure that a reduction as large as 95% wouldn't make a significant dent in the overall rate of bug finding.

If we all moved over to "safe" languages, it would have almost no effect on hackability. Bad programmers write bad code. Bad code has bugs. Bugs can be exploited.

Yes, it is an outrage that buffer overflows and the like are a major source of current bugs. OTOH, do you really think that the idiots that create such code today wouldn't create other bugs if they were in a "safe" language?

Buffer overflows are common exploits because crackers aren't motivated to create tools to go after more interesting exploits. You might want to think carefully about what would happen if they did.

Well, there are two issues here:
1. Would it be more effective to write in "safe" languages than to have code analysis tools (the subject of my post)?
2. Would "safe" languages lead to an overall increase in computer security?

As I argued in the original post, I believe that the answer to (1) is Yes. The kinds of bugs that we are currently able to detect with analytical tools are often forbidden "by construction" in safer languages. I don't know the answer to (2). I think it's an interesting research question with arguments in both directions.

Perhaps it would be necessary to reduce the number of exploits by a factor of, say, 1000 to make a practical difference in the cost of an intrusion. But it's impossible to get to a factor of 1000 without passing a factor of 20 first, so passing up on a factor of 20 is folly.

Having written reasonably secure, widely deployed code in a safe language, I can tell you that there's a single thing which causes most of my security heebie-jeebies, and that's file system calls. File system calls take file identifiers which are opaque strings whose semantics are not cross-platform and are poorly documented, assuming they work right at all. The result is that opening any file whose name includes something received over the network is potentially dangerous.

And I don't care what ridiculous claims Schneier makes: Unicode has made the problem ten times worse.
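One conservative way to reduce that exposure is to refuse to pass network-supplied text to the file system unless it matches a narrow allowlist. The sketch below is only an illustration of that idea, not a complete defense: the function name is_safe_name_component, the 64-byte limit, and the permitted character set are all assumptions chosen for the example, and it assumes an ASCII-compatible execution character set. It takes an explicit length so that embedded NUL bytes are seen and rejected, and it rejects every non-ASCII byte, which sidesteps the Unicode question rather than solving it. It does not handle platform-specific reserved names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allowlist check for ONE file name component received
 * from an untrusted source. name need not be NUL-terminated; len is
 * the number of bytes received. */
bool is_safe_name_component(const char *name, size_t len)
{
    /* Assumed policy: non-empty and at most 64 bytes. */
    if (name == NULL || len == 0 || len > 64)
        return false;

    /* A leading dot covers ".", "..", and hidden files. */
    if (name[0] == '.')
        return false;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        bool ok = (c >= 'a' && c <= 'z') ||
                  (c >= 'A' && c <= 'Z') ||
                  (c >= '0' && c <= '9') ||
                  c == '_' || c == '-' || c == '.';
        /* Anything else is refused, including '/', '\\', ':',
         * embedded NUL bytes, and all bytes above 0x7F. */
        if (!ok)
            return false;
    }
    return true;
}
```

The design choice here is to enumerate what is allowed rather than what is forbidden, since the set of dangerous identifiers differs between platforms and is, as noted above, poorly documented.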
