Vulnerability Game Theory
So a few days ago, I attended the Vista RTM party. I spent time hanging out with some of the pen testers, and they were surprised that no one had dropped 0day on us yet. These folks did a great job, but we all know that software is never perfect, and that there are things we missed. I hope that the defense-in-depth tools (/GS, SafeSEH, ASLR, UAC) help limit the customer impact.
So, that said, I’d like to think about this from the researcher’s point of view. If you’re a clever researcher who’s finding Vista issues, what do you do with them? I think there are three different answers, depending on how many you have.
First, if you have one, you publish it immediately. Ideally, you do that in a responsible way, but you don’t want to risk your one vuln being found independently and fixed.
Next, if you have a few vulns, you sit on them all, and try to measure the independent find rate, so you know how long they last. When you have that estimate, you decide what to do with what’s left.
Finally, if you have a lot of vulns, and are hoping to sell them, you drop 0day on us as a marketing and advertising ploy. Whoever releases the first working exploit against Vista is going to earn themselves a lot of notoriety, and bring our customers a lot of pain. It’s sorta cool that no one’s done this yet. Maybe they’re waiting on the release to business or consumers? That’s an interesting gamble: you’ll get more attention, but you’re also betting that no one will take the “first vuln” credit between now and then. So the longer it takes, the larger the compliment implied by the wait: “It’s hard to find vulns, and I expect to be able to wait.”
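To make the "sit on them and measure the independent find rate" idea a bit more concrete, here is a rough sketch. It assumes that independent discovery behaves like a constant-rate (exponential) process, which is a simplification introduced here for illustration and not something anyone has measured. The function names and the sample numbers are hypothetical.

```python
import math


def estimate_find_rate(days_held, rediscovered):
    """Estimate a per-vuln independent discovery rate (events per day).

    ASSUMPTION: rediscovery is a constant-rate process, so the estimate is
    simply (number of rediscoveries) / (total days of exposure). Vulns that
    nobody else has found yet still contribute their days of exposure.

    days_held    -- days each vuln was held (until rediscovered, or until now)
    rediscovered -- parallel list of booleans, True if someone else found it
    """
    if len(days_held) != len(rediscovered):
        raise ValueError("days_held and rediscovered must be the same length")
    total_exposure = sum(days_held)
    if total_exposure <= 0:
        raise ValueError("need a positive total number of days held")
    events = sum(1 for found in rediscovered if found)
    return events / total_exposure


def expected_lifetime(rate):
    """Mean number of days a vuln stays private under the assumed model."""
    return math.inf if rate == 0 else 1.0 / rate


def survival_probability(rate, days):
    """Chance that a single vuln is still private after `days` more days."""
    return math.exp(-rate * days)


if __name__ == "__main__":
    # Hypothetical data: five vulns, two of which were found by someone else.
    held = [30, 60, 90, 90, 90]
    found = [True, True, False, False, False]

    rate = estimate_find_rate(held, found)
    print("estimated rate per day:", rate)
    print("expected lifetime in days:", expected_lifetime(rate))
    print("chance of surviving 60 more days:", survival_probability(rate, 60))
```

The same survival number is what the "wait for the consumer release" gamble rests on: the longer the wait, the smaller the chance that nobody else claims the first-vuln credit in the meantime. With only a handful of vulns the estimate is very noisy, so treat it as a way to think about the trade-off, not a forecast.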
Implied compliments aren’t all that interesting. Someone will have the first issue.
What matters isn’t the first day, it’s the first year. I think we’re pleased with the work done, know that it’s never-ending, and are optimistic that Vista’s first year is going to look substantially better than XP’s first year. That’s the first real test: do we see fewer vulns, and are the vulns of lower average severity? The second real test is what happens to real customer impact. That’s the test that matters most, and it is far harder to measure.