The DREAD Pirates

[no description provided]

Then he explained the name was important for inspiring the necessary fear. You see, no one would surrender to the Dread Pirate Westley.

The DREAD approach was created early in the security pushes at Microsoft as a way to prioritize issues. It's not a very good way, you see no one would surrender to the Bug Bar Pirate, Roberts. And so the approach keeps going, despite its many problems.

There are many properties one might want in a bug ranking system for internally found bugs. They include:

A cool name
A useful mnemonic
A reduction in argument about bugs
Consistency between raters
Alignment with intuition
Immutability of ratings: the bug is rated once, and then is unlikely to change
Alignment with post-launch/integration/ship rules

DREAD certainly meets the first of these, and perhaps the second two. And it was an early attempt at a multi-factor rating of bugs. But there are many problems which DREAD brings that newer approaches deal with.

The most problematic aspect of DREAD is that there's little consistency, especially in the middle. What counts as a 6 damage versus 7, or 6 versus 7 exploitability? Without calibration, different raters will not be consistent. Each of the scores can be mis-estimated, and there's a tendency to underestimate things like discoverability of bugs in your own product.

The second problem is that you set an arbitrary bar for fixes, for example, everything above a 6.5 gets fixed. That makes the distinction between a 6 and a 7 sometimes matters a lot. The score does not relate to what needs to get fixed when found externally.

This illustrates why Discoverability is an odd things to bring into the risk equation. You may have a discoverability of "1" on Monday, and 10 on Tuesday. ("Thanks, Full-Disclosure!") So something could have a 5.5 DREAD score because of low discoverability but require a critical update. Suddenly the DREAD score of the issue is mutable. So it's hard to use DREAD on an externally discovered bug, or one delivered via a bug bounty. So now you have two bug-ranking systems, and what do you do when they disagree? This happened to Microsoft repeatedly, and led to the switch to a bug bar approach.

Affected users is also odd: does an RCE in Flight Simulator matter less than one in Word? Perhaps in the grand scheme of things, but I hope the Flight Simulator team is fixing their RCEs.

Stepping beyond the problems internal to DREAD to DREAD within a software organization, it only measures half of what you need to measure. You need to measure both the security severity and the fix cost. Otherwise, you run the risk of finding something with a DREAD of 10, but it's a key feature (Office Macros), and so it escalates, and you don't fix it. There are other issues which are easy to fix (S3 bucket permissions), and so it doesn't matter if you thought discoverability was low. This is shared by other systems, but the focus on a crisp line in DREAD, everything above a 6.5 gets fixed, exacerbates the issue.

For all these reasons, with regards to DREAD? Fully skeptical, and I have been for over a decade. If you want to fix these things, the thing to do is not create confusion by saying "DREAD can also be a 1-3 system!", but to version and revise DREAD, for example, by releasing DREAD 2. I'm exploring a similar approach with DFDs.

I'm hopeful that this post can serve as a collection of reasons to not use DREAD v1, or properties that a new system should have. What'd I miss?

Originally published by Adam on 31 May 2018
Last updated on 01 Jun 2018
Categories: measurement metrics product management risk software engineering threat model thursday threat modeling