Breach Tidbit
One of the things people would like to find out is how likely it is that improperly-revealed personal information will be used to commit real fraud.
ID Analytics has done some research which they interpret as suggesting that even with focused attacks, where the bad guy is going after SSN and account information, the probability of illicitly-gained PII being used for actual fraud is less than 1 in 1000.
In looking over some information I received from New York, I noticed a case in which branded credit card applications (including the assigned CC#) were targeted, and 150 stolen. Now, I don’t know if the case I’m talking about is like those in the “targeted” group studied by ID Analytics, but if it is, I’d expect maybe one fraud attempt, and that’s being extremely generous.
The number actually observed: 11, all fraudulent purchases.
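To put rough numbers on that gap, here is a minimal sketch. It assumes that each of the 150 stolen applications is misused independently at the same 1-in-1000 rate. That is my simplification for illustration, not something ID Analytics claims, and the function name `binomial_tail` is made up for this example.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    The terms are summed directly from k to n rather than computed as
    1 - P(X < k), because the tail here is far smaller than floating-point
    rounding error around 1.0.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


stolen = 150      # applications stolen in the New York case
rate = 1 / 1000   # upper bound on the misuse rate quoted from ID Analytics
observed = 11     # fraudulent purchases actually observed

# Expected misuse count under the assumed rate (assumption: independent misuse).
expected = stolen * rate
# Probability of seeing at least `observed` cases if that rate applied here.
tail = binomial_tail(stolen, observed, rate)

print(f"expected cases: {expected:.2f}")
print(f"P(at least {observed} cases): {tail:.3g}")
```

Under those assumptions the expected count is well below one, and the chance of seeing 11 or more is vanishingly small. So either the rate does not apply to a case like this one, or the independence assumption is badly wrong.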
There is a lot of work left to be done on this topic, that’s all I’m saying.
Updated: Link to press release, and characterization of observed fraud.
Do you have a link for that ID analytics work?
Also, what sort of fraud took place? Account abuse, or new account openings?
This sounds like a “regression to the mean” issue. It may also support the notion that you are better off being a part of a larger group than a smaller one, under the assumption that there is some limited number of cards that can be used by a few individuals over a short time span.
Pete– ain’t it great to have data?
Vijayan’s discussion of various analytic studies [http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9003343] includes more recent studies that agree that less than 1% of data loss may result in ID theft. But pose the question somewhat differently and you get “data breaches were responsible for just 6% of all known cases of identity theft” and “.. 18% of identity theft victims attributed the cause to computer breaches.” Or you find statements like, “Other causes include data stolen by friends, acquaintances, relatives or corrupt employees.” — as if theft by friends, families, or employees isn’t still data theft.
The sampling issues are just huge. Being somewhat cynical, I would guess that asking businesses if they’ve been breached and whether it resulted in financial fraud may not be the best way to sample or to get an accurate picture of the true cost of breaches or data loss. Financial fraud may worry businesses, but the cost to individuals is far greater, albeit somewhat difficult to calculate. Saying that it’s not that many individuals affected is tantamount to saying “Well, we screwed up and exposed 3,000,000 people to anthrax, but only 30,000 got ill and nobody died.” Uh huh.
Dissent–are these the same studies that say 50% of people don’t know how the fraudster got their data?
The 50% estimate was from the FTC study several years ago, if I remember correctly. Have you seen that estimate in any recent report? Being somewhat loath to take food out of my kids’ mouths to buy the full research reports from Javelin and others, I’m stuck just trying to piece things together from press releases and analyses by others.
But even if it were true that less than 1% of large corporate breaches or losses result in ID theft or financial fraud — and I don’t know if that’s even accurate because so many breaches still don’t get reported — that doesn’t make them any less important for other reasons. I’m more oriented to the political/psychological aspects of privacy, and to me, these monster databases and breaches are very worrying even for nonfinancial reasons.
As an fyi, I’m told that the FTC is planning to release an updated version of their 2003 incidence of identity theft study sometime in early November to coincide with their “Protecting Consumers in the Next Tech-ade” hearings at George Washington University (November 6-9).
I have reservations about the reliability of the ID Analytics study. My question is how long after the breach incidents the study was done. No criminal with any sense would try to use personally identifiable information gathered through a breach to commit identity theft right away. You would be much better off ‘banking’ the information for at least a year or two to avoid credit monitoring (sometimes offered to victims by the entity suffering the breach) and increased vigilance by the consumer. You would need to do the same study at least 18 months out from the time of the breach.
I’m glad to see that good questions are being asked about the analysis. Unfortunately, while the NY Times article was very accurate, it was too short to answer many of the questions posted in your blog. Let me try to provide some more background that may answer your questions:
1) We did the study over a long period of time. The breaches we analyzed all happened 18-24 months ago and covered various breach types. We define breach types as either “Identity” (identity elements taken, including SSN) or “Account” level (account numbers, CVV, expiration dates, etc., but no SSN). We further define them by intent, meaning how the data was taken. Those types are: 1) Accidental, 2) Incidental (the information was on a laptop or storage device that was likely stolen for the device and not the information), and 3) Targeted (the worst kind). We looked at over 500,000 breached consumer identities and compared them to over 500 million risk events that we store in our ID Network. Those risk events can be changes of address, applications for credit or wireless service, or payments.
2) We analyze for ID Theft related to opening of new credit or services in another person’s name. We do not look at account fraud.
3) The 0.098% number (less than 1 in 1,000) ties to a targeted identity breach of approximately 200,000 identities over a one-year period. Over time, that number would likely increase (unless the identities were recovered before they hit the Internet). We did not see any misuse in any of the other breaches we have analyzed.
4) The information posted about the 11 frauds out of 150 does not conflict with the number above, and I’m not surprised by it (actually, I thought that was low). Pete’s post is right: the smaller the breach, the higher the likelihood that a consumer’s name will be misused. For instance, I’d prefer to be a part of the VA breach than to have a thief steal my mail out of my post box. In that “breach of one”, I would assume that the likelihood of my identity being misused would be very close to 100%. (Actually, I’d prefer not to be part of any breach, but for the sake of argument …)
5) Data stolen by friends and family is generally called “familiar” or “family fraud”. It seems like it would be considered a data breach, but historically it has not been. I was at a conference last week, and a colleague of mine suggested that social networking sites like MySpace and Facebook might actually create additional “familiar fraud” because social networks extend and publicize your network of family and friends. I thought that was an interesting comment, and it will likely prove true.
6) Adam’s right. It is great to have real data to analyze real problems.
We will continue to do analysis as we have the ability to work with more breached data, and will continue to update our findings. The main point we want everyone to understand is that all breaches are bad, BUT they are all very different, and their levels of risk to a specific consumer differ depending on 1) what was taken, 2) how the data was taken, and 3) how much data was taken.
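The arithmetic behind points 3) and 4) can be sketched roughly as follows. The helper names `implied_misused` and `per_person_risk` are hypothetical. The fixed-capacity model (thieves can only work through so many identities in a given period) is an illustrative assumption and not ID Analytics’ methodology. The capacity of 11 is borrowed from the New York case purely as an example value.

```python
def implied_misused(breach_size: int, misuse_rate: float) -> float:
    """Number of misused identities implied by a breach size and a misuse rate."""
    return breach_size * misuse_rate


def per_person_risk(breach_size: int, capacity: int) -> float:
    """Chance that any one identity is misused under a toy capacity model.

    Assumption (hypothetical): thieves can misuse at most `capacity`
    identities, chosen uniformly at random from the breached set.
    """
    if breach_size <= 0:
        raise ValueError("breach_size must be positive")
    return min(1.0, capacity / breach_size)


# 0.098% of roughly 200,000 identities comes to roughly 196 misused identities.
print(f"implied misused identities: {implied_misused(200_000, 0.00098):.0f}")

# Same hypothetical capacity, very different individual risk by breach size.
for size in (1, 150, 200_000):
    risk = per_person_risk(size, capacity=11)
    print(f"breach of {size:>7,}: per-person risk {risk:.4%}")
```

With capacity held fixed, individual risk falls as the breach grows, which is the “breach of one” intuition in numeric form. Whether real thieves behave as if they had a fixed capacity is exactly what the data would need to show.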
Thanks for that, Mike.
I had a response to Pete’s comment prepared, and also used the example of a “breach of one”. Intuitively, it makes sense — if somebody has a bag of identities to steal from, it’s better (if your name is in it) for it to be a biiiig bag. However, having an intuition about something and showing that that intuition withstands empirical testing are two different things. I think we need to do more research and analysis, so I am very eager to see how these priors many of us hold stand up when facts are available with which to test them.
It may be that there is a richer picture to be drawn. For example, perhaps the profit motive in the bigger targeted breaches is to have a large group from which to cherry-pick. As long as the value of the relatively rare (say, 1/1000) nuggets is high enough, people will be willing to do dangerous things that impose externalities on the rest of us to get to them. No offense meant to the ore extraction biz, but I think the mining industry illustrates my metaphor pretty well.
Perhaps it’d be possible for you to make the full report available to Emergent Chaos readers? I’m sure many are interested in your analysis.