DataLossDB announces awesome new feature
The Data Loss Database [http://datalossdb.org/ no longer works], run by the Open Security Foundation [http://opensecurityfoundation.org/ no longer works], now has a significant new feature: the inclusion of scanned primary source documents [link to http://datalossdb.org/primary_sources no longer works].
This means that in addition to being able to determine “the numbers” on an incident, one can also see the exact notification letter used, the reporting form submitted to state government, cover letters addressed to (for example) an attorney general, and the like. Importantly, all the documents have been OCRed, making it possible to search within them.
There are currently several hundred documents in the archive, most of which arrived in the last few days. To link the docs to existing breach records quickly, the folks at DataLossDB latched onto a key insight: this is an embarrassingly parallel problem, since each document can be matched independently of the others. They therefore provide a screen [link to http://datalossdb.org/index/random_ps no longer works] where visitors can do a bit of matching of scanned docs to existing breach entries. For those without research assistants, crowdsourced data entry is the way to go :^).
If you’re the type of person who is into the details of breaches — and who isn’t? — you should check this out.
Full disclosure: I contributed many of the documents in the archive, and I am extremely pleased with what has come of it. The DataLossDB interface is vastly superior to even the vaporware version of my site.
Awesomeness! Can we help get the XML feed going too?