Shostack + Friends Blog Archive

Estimating breach size by fraud volume

Much is being made of a press release from ID Analytics. Based on results from that firm’s fraud detection products, a conservative estimate is that one of every 1000 pieces of PII lost in a data breach results in an actual fraud. An additional finding is that the likelihood of a fraud being committed using a given piece of revealed PII is inversely proportional to the size of the breach.
These results are being spun as suggesting that large breaches are not so bad, and that the “real risk” of ID theft is low.
Well, I won’t comment on that, but the credence afforded the ID Analytics numbers cuts both ways. For example, if they are right, then the Sam’s Club breach, given the 600 frauds known to have resulted from it, exposed the information of about 600,000 people.

Originally published by cwalsh on 9 Dec 2005
Last modified on 9 Dec 2005
Categories: breaches Uncategorized

3 comments on "Estimating breach size by fraud volume"

Adam says:

9 Dec 2005 at 3:10 pm

The spin around this one is awful. Thanks for clarifying what’s actually being said; it now makes a lot more sense.
Pete says:

10 Dec 2005 at 10:20 pm

Doesn’t your Sam’s Club math fly in the face of the inverse proportionality claim? Also, they are pretty clear about the high variance among incident types.
(Btw, even if your math were accurate, the other 599,400 would be safe – by definition).
Chris Walsh says:

11 Dec 2005 at 11:12 am

@Pete:
Quoting the press release:
“[T]he calculated fraudulent misuse rate for consumer victims of the analyzed breach with the highest rate of misuse was 0.098 percent—less than one in 1,000 identities.”
So, the 1 in 1K figure is what I use since I want to minimize my estimate of the size of the Sam’s Club breach. Since we know that 600 frauds were committed, a conservative estimate of the number of payment card numbers stolen is 600K.
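For anyone who wants to check the arithmetic, here is a minimal sketch of the estimate. It assumes the press release's highest misuse rate is rounded up to exactly 1 in 1,000; the function name and its parameters are hypothetical, chosen only for illustration.

```python
def min_records_exposed(observed_frauds: int, one_in: int = 1000) -> int:
    """Lower bound on breach size, given an upper bound on the misuse rate.

    `one_in` expresses the assumed maximum misuse rate as "1 in N"
    (1 in 1,000 here, rounded up from the quoted 0.098 percent).
    If at most 1 record in N is misused, then seeing `observed_frauds`
    frauds implies at least observed_frauds * N records were exposed.
    """
    if observed_frauds < 0 or one_in <= 0:
        raise ValueError("observed_frauds must be >= 0 and one_in must be > 0")
    return observed_frauds * one_in


estimate = min_records_exposed(600)   # 600 known frauds
untouched = estimate - 600            # exposed records with no fraud (yet)
print(f"{estimate:,} exposed, {untouched:,} not misused")
```

A lower misuse rate (a larger `one_in`) only pushes the estimate higher, which is why the highest reported rate gives the conservative figure.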
You are right that fraud likelihood was found to vary according to the type of data stolen, and that this is emphasized. I didn’t give that much attention, since it is no surprise to me that my name+SSN are a more useful (to criminals!) combo than my name and CC#. This fact seems to be more or less accurately depicted in the media, whereas what is happening with the “1 in 1000” thing is that it is being spun as a “Nothing to worry about, move along” finding. The ID Analytics people, for example, put the following in bold face in the headline on the PR page:
“Rate of Misuse of Breached Identities May be Lower than Anticipated”.
The inverse proportionality claim I make also comes straight from the release, which says in italics “Evidence Suggests that the Smaller the Data Breach, the Higher a Consumer’s Risk”. This is also common sense: if a crook is picking victims’ names from a bag, it’s better, from the potential victim’s point of view, to be in a really big bag.
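The bag picture can be sketched as a toy model. It assumes, purely for illustration, that a crook can only exploit a fixed number of identities no matter how many were stolen, and picks them uniformly at random; the figure of 500 identities and the function name are hypothetical and do not come from the release.

```python
from fractions import Fraction


def per_person_risk(breach_size: int, identities_used: int) -> Fraction:
    """Chance that one given record is among those the crook picks.

    Toy model (an assumption, not the ID Analytics method): the crook
    draws `identities_used` names uniformly at random from a bag of
    `breach_size` names, and can never use more names than the bag holds.
    """
    if breach_size <= 0 or identities_used < 0:
        raise ValueError("breach_size must be > 0 and identities_used >= 0")
    used = min(identities_used, breach_size)
    return Fraction(used, breach_size)


# Same crook capacity, bigger and bigger bags: individual risk shrinks.
for size in (1_000, 100_000, 10_000_000):
    risk = per_person_risk(size, identities_used=500)
    print(f"breach of {size:>10,}: risk per person = {float(risk):.5%}")
```

Under that assumption the risk to any one person falls as 1 over the breach size, which is the inverse proportionality in question.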
In short, my math is fine, as is my deductive logic. :^)
Plenty could be said about the ID Analytics study, but it is unfair to hold a vendor’s marketing department to the same standards to which one would hold a real researcher.
