Code analysis and safe languages
Ekr writes:
These tools aren’t perfect and it certainly would be nice to have better tooling, but it’s worth noting that a lot of the bugs they find are the kind of thing that could be entirely eliminated if people would just program in safer languages. For instance, the buffer overflow vulnerabilities which have been so troublesome to eliminate in C/C++ code are basically a non-problem with Java.
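To make the quoted point concrete, here's a toy example (mine, not ekr's): the same off-by-one loop that corrupts memory in C simply throws in Java, because every array access is bounds-checked at runtime.

    // Toy illustration: an off-by-one write in Java.
    // Where C would silently scribble past the end of the buffer,
    // the JVM checks the index and throws instead.
    public class BoundsDemo {
        public static void main(String[] args) {
            byte[] buffer = new byte[16];
            try {
                for (int i = 0; i <= buffer.length; i++) { // off-by-one: should be <
                    buffer[i] = 0x41;
                }
            } catch (ArrayIndexOutOfBoundsException e) {
                // Still a bug, but it fails loudly here instead of corrupting
                // adjacent memory or handing control to an attacker.
                System.out.println("Caught: " + e);
            }
        }
    }

The bug doesn't go away, but its worst consequence does.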
Type safety is useful, but it's not a complete fix. A type-safe language doesn't help you avoid the abuse of tainted variables, and it may or may not help you deal with integer underflows. It does let you spend more time fixing those issues, by reducing contention for scarce bug-fixing time.
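Here's a sketch of what I mean about integers (again, my illustration, not anything from ekr's post): Java's int arithmetic silently wraps, so a sanity check on an attacker-supplied size can still be defeated even though the language is type-safe.

    // Toy illustration: type safety doesn't catch integer wraparound.
    public class OverflowDemo {
        // Intended check: header plus payload must fit in the buffer.
        static boolean fitsInBuffer(int untrustedLength, int headerSize, int bufferSize) {
            // If untrustedLength is near Integer.MAX_VALUE, the sum wraps
            // negative and the check passes anyway.
            return untrustedLength + headerSize <= bufferSize;
        }

        public static void main(String[] args) {
            System.out.println(fitsInBuffer(100, 16, 1024));               // true, as intended
            System.out.println(fitsInBuffer(Integer.MAX_VALUE, 16, 1024)); // also true: it wrapped
        }
    }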
But the biggest issue with software security is that it's hard to measure, even when you're the one producing the software. It's very hard for customers even to assess, and turning that assessment into a quantified measurement is rocket science. (Really, rockets are easier: everyone sees when they blow up, and customers are rarely blamed for it.) Doug Barnes, in his paper on Deworming the Internet, analyzes the issue of customers ignoring security until they're locked into a standard. There are also signaling issues, but there are no good signals in security.
So assessments of security are anecdotal and emotional. Microsoft sent a message by stopping development for a month to fix bugs. Few others have the economic muscle to do the same. (I've been told Oracle did, but haven't been able to find the details. Also, Oracle seems focused on process-oriented certifications, rather than bug-stomping.)
Given the non-quantified nature of security, an easy choice for a manager is to delay finding or fixing security bugs. If Alice is spending an hour a day running Splint and MOPS, then when we need more productivity (read: features) from Alice, we abandon the tools. So again, a good language choice up front helps. Which language that should be is project-dependent.
On a broader scale, we need to move away from process (“Did you write a functional spec?”) toward evidence that your process results in code with relatively few security bugs. We need customers to demand that evidence, rather than accepting a salesman's promises.
So do you think this can be modeled using a version of the El Farol Bar problem, in a future post? Maybe we can optimize the acceptable number of bugs… How do/should the policies of Microsoft and Oracle affect this model?
What are some other metrics that are meaningful? I don't know of any literature out there that differentiates between security and non-security-related bugs.
There is someone doing some work on predicting defect density by injecting defects into the code, identifying how many are found by the QA group and extrapolating across all bugs.
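For what it's worth, the arithmetic behind that kind of seeding estimate is simple. Here's a sketch, under the usual (and debatable) assumption that seeded defects are as easy to find as natural ones; the names and numbers are mine, purely for illustration.

    // Defect-seeding (capture-recapture style) estimate.
    public class DefectSeeding {
        // seeded: defects deliberately injected before the QA pass
        // seededFound: how many of those QA found
        // realFound: genuine defects QA found in the same pass
        static double estimatedRealDefects(int seeded, int seededFound, int realFound) {
            double detectionRate = (double) seededFound / seeded; // measured on seeded bugs
            return realFound / detectionRate;                     // extrapolate to the real ones
        }

        public static void main(String[] args) {
            // Example: seed 50 defects; QA finds 40 of them plus 120 real ones.
            double total = estimatedRealDefects(50, 40, 120);
            System.out.printf("Estimated real defects: %.0f (about %.0f still latent)%n",
                    total, total - 120);
        }
    }

Whether that tells you anything about security bugs in particular is, of course, exactly the question above.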