Shostack + Friends Blog Archive

 

The "Privacy-Enhanced Data Mining" Trap

The Associated Press pushed a story to the wires about the Data Surveillance workshop [link to http://crcs.deas.harvard.edu/workshop/2006/ no longer works], which I’d mentioned a while back:

As new disclosures mount about government surveillance programs, computer science researchers hope to wade into the fray by enabling data mining that also protects individual privacy.

Largely by employing the head-spinning principles of cryptography, the researchers say they can ensure that law enforcement, intelligence agencies and private companies can sift through huge databases without seeing names and identifying details in the records.

So let’s talk about that. The argument can be re-stated as “we can take data, sift it, and then start an investigation based on the sifted data, and go through the warrants process.”

This requires both willful ignorance of the quality of the data being mined and a rose-tinted willingness to trust the justice system.

The quality of data in a privately mined system will be no better than in any other system, and will likely be worse, because inaccurate data will not be visible to anyone who could correct it. Fair information practices such as accuracy and access are deeply impeded.

Once the data mining system has come out and said “Alice is a suspect,” Alice will enter a Kafka-esque bureaucratic nightmare. The computer found something.

How “the computer found something” can translate into a warrant is system dependent. Some systems may unmask the “data.” In others, the result may be presented to a judge as “the computer thinks we need to investigate this person.” Either way, Alice’s innocence will be viewed with suspicion: either she’s really good at hiding her guilt, or we’ve caught a sleeper.

Research into how data mining can be done in ways that respect the fair information practices is useful and worthwhile. Today’s privacy-destroying impulses need to be held in check by a Congress and Judiciary balancing the executive. (Of course, the legislatures are contributing, as documented in stories like “Police to Get Access to Student Data.” Thanks, Alice!) Giving them a set of tools is worthwhile, but we should be aware of the limits of the tools we have today.

Photo, Implement of Destruction by Canardo.

2 comments on "The "Privacy-Enhanced Data Mining" Trap"

  • Chris Walsh says:

    “I’m so old, I can remember when privacy researchers cared about *avoiding* government surveillance”
    (I’m telling ya, I don’t get no respect)

  • an anon privacy engineer says:

    Hi Chris,
    Perhaps the readers should also refer to “Translucent Databases” by Peter Wayner, which goes into some detail about “translucent cryptography” and suggests some technical approaches to normalize input into translucent DB searches and ways to mitigate privacy damage.
    an anon privacy engineer
