Reports on Reporting, Compliance
University of Washington researchers Kris Erickson and Philip Howard have an interesting new paper out, “A Case of Mistaken Identity? News Accounts of Hacker and Organizational Responsibility for Compromised Digital Records, 1980–2006.” It’s a great survey of the dramatic explosion in breach reporting. A couple of notable quotes:
One important outcome of the legislation is improved information about the types of security breaches. Many of the news stories between 1984 and 2004 report paltry details, with sources being off the record and vague estimates of the severity of the security breach. Since the passage of mandatory reporting legislation in many states, most news coverage provides more substantive details. In 2006, only 10 of the 257 news stories were unable to make some attribution of responsibility for a security breach. (Emphasis added.)
Even better, Erickson and Howard draw on the Attrition dataset (which I’ve been saying is important) and add to it with their own dataset (500 KB .xls).
In contrast stands the data from the Symantec-backed “ITpolicycompliance.com.” This is work by Jim Hurley, so I was expecting a lot, but the report, “Taking Action to Protect Sensitive Data” [link to http://www.itpolicycompliance.com/research_reports/data_protection/read.asp?ID=9 no longer works] makes claims that I find hard to believe. In particular, it claims that organizations suffering a publicly reported breach are losing 8% of their customers and 8% of their revenue. (Page marked “1,” Executive summary, under “financial impacts.”) Unfortunately, this number isn’t sourced or explained, and unlike the UW report, the underlying data isn’t being shared. Is that an 8% loss each? Is that a median? A mean? What’s the variance? Are there outliers?
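To make concrete what I’d want to see, here’s a minimal sketch (with loss figures I invented purely for illustration, not anyone’s actual breach data) of why a lone average can mislead:

```python
import statistics

# Hypothetical per-organization revenue losses (as fractions of revenue),
# invented purely for illustration.
losses = [0.01, 0.02, 0.02, 0.03, 0.05, 0.08, 0.08, 0.10, 0.12, 0.29]

print("mean:  ", statistics.mean(losses))    # 0.08
print("median:", statistics.median(losses))  # 0.065
print("stdev: ", statistics.stdev(losses))
# A single outlier (0.29 here) pulls the mean well above the median, which is
# why the shape of the distribution matters as much as the headline average.
```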
I’ve done some digging recently, in the hope of finding SEC filings that discuss these revenue losses. You’d figure 8% is, how do Messrs. Sarbanes and Oxley say it? Material! I think that’s the word. A material impact on revenues, one you’d think would justify some SEC filings. The thing is, I can’t find any. So I’m skeptical.
I’m optimistic that in the future, we will be over our strange fears of talking about breaches, and we’ll be able to talk about our data in a more mature way.
Adam, thanks for picking up on the research! With the collective support of CSI, IIA, Protiviti, and Symantec (along with others in the pipeline planning to join), we’re really encouraged by the positive industry response to the site. We’re spending a lot of time and effort to develop primary research on what’s working and what’s not in the context of how organizations can meet their policy and regulatory compliance goals.
Regarding your questions about some of the numbers: the figures for revenue losses and customer losses come from a large study of 254 organizations, completed at the end of 2006 (this was a sneak peek). The full report covering these findings will be released next quarter, after we collect enough data to obtain a larger sample.
The loss figure of 8% is the mean for the current sample involving 254 organizations. The standard deviation for losses across this sample is +/- 7%. This translates to losses ranging from 1% to as much as 15%. As the sample size increases, the standard deviation for losses will become smaller. However, based on past experience, we do not expect the mean for losses to differ substantially, which is why we included it in the report.
Stay tuned…more to come.
Jim,
Thanks for taking the time to comment!
On the deviation, I’m reading that to say that losses are reasonably evenly distributed, rather than clustered or on a bell curve? That sounds really strange to me, but, heck, data is allowed to be strange, and strange data leads to all sorts of good research questions. It would be great if your next report characterized what you’re seeing in more detail.
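Here’s the back-of-the-envelope arithmetic behind my puzzlement (my own toy numbers, not Jim’s data): an even spread of losses from 1% to 15% would only produce a standard deviation of about 4%, while a standard deviation nearly as large as the mean is what a skewed, long-tailed sample tends to produce:

```python
import math
import random
import statistics

random.seed(0)

# If losses were spread evenly between 1% and 15%, the standard deviation
# would be about 4%, not 7%.
uniform_sd = (0.15 - 0.01) / math.sqrt(12)
print(f"stdev of a uniform spread over [1%, 15%]: {uniform_sd:.3f}")

# A stdev roughly equal to the mean is characteristic of a long right tail:
# an exponential-like sample with mean 8% has a stdev near 8%.
skewed = [random.expovariate(1 / 0.08) for _ in range(254)]
print(f"skewed sample: mean={statistics.mean(skewed):.3f}, "
      f"stdev={statistics.stdev(skewed):.3f}, "
      f"median={statistics.median(skewed):.3f}")
```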
I’m strongly in favor of deep research on what works and what doesn’t. We’ve spent long enough letting the pundits win.
Jim:
The mean might be a decent measure of central tendency, but without any information on the shape of the distribution, it’s hard to say. More importantly, and reading between the lines, it looks as though your small N is due to a low response rate to a survey. If so, you may well wind up with a biased estimate. What can you tell us about how your sample was chosen?
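By way of illustration (a purely hypothetical simulation, not a claim about this particular survey): if organizations with bigger losses are even modestly more willing to answer, the respondents’ mean will overstate the population’s:

```python
import random
import statistics

random.seed(1)

# Hypothetical population of 5,000 breached organizations with skewed losses
# (mean ~5% of revenue); the numbers are pure invention for illustration.
population = [random.expovariate(1 / 0.05) for _ in range(5000)]

# Assume, hypothetically, that willingness to respond rises with loss size.
respondents = [x for x in population if random.random() < min(1.0, 0.02 + 4 * x)]

print(f"population mean: {statistics.mean(population):.3f}")
print(f"respondent mean: {statistics.mean(respondents):.3f} (n={len(respondents)})")
```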