Last.fm and the Diabolical Power of Data Mining | Electronic Frontier Foundation

Technical Analysis by Peter Eckersley

Recently, there was a minor scandal when TechCrunch accused Last.fm of turning over information — the identities of people listening to copies of a leaked U2 album — to the RIAA. Last.fm issued a scathing denial of these allegations, and it’s good to hear that the site hasn’t turned into a worldwide music surveillance system. Not on purpose, that is.

Last.fm’s avowed innocence isn’t quite the end of the story. The whole kerfuffle should remind us that websites that collect and republish seemingly innocuous facts about their users are often vulnerable to data mining. It doesn’t matter whether you keep the users’ names and addresses secret: the facts you publish about them may be sufficient to ensure that there is only one person on the whole wide web to whom those facts pertain.[1]
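To make the uniqueness point concrete, here is a minimal Python sketch that counts how many users share each exact set of published facts. Everything in it is an assumption made for illustration: the pseudonyms, the artist lists, and the helper names `anonymity_set_sizes` and `unique_users` are hypothetical and do not reflect Last.fm's actual data or API.

```python
from collections import Counter

# Hypothetical published profiles: pseudonym -> set of publicly listed artists.
# No names or addresses appear anywhere in this data.
profiles = {
    "user_a": frozenset({"U2", "Radiohead", "Bjork"}),
    "user_b": frozenset({"U2", "Radiohead"}),
    "user_c": frozenset({"U2", "Radiohead"}),
    "user_d": frozenset({"U2", "Obscure Local Band"}),
}


def anonymity_set_sizes(profiles):
    """Map each pseudonym to the number of users sharing its exact profile."""
    counts = Counter(profiles.values())
    return {user: counts[facts] for user, facts in profiles.items()}


def unique_users(profiles):
    """Return the pseudonyms whose published facts match no other user."""
    sizes = anonymity_set_sizes(profiles)
    return sorted(user for user, size in sizes.items() if size == 1)


sizes = anonymity_set_sizes(profiles)
print(sizes)
print(unique_users(profiles))
```

A user whose profile is shared by nobody else has an anonymity set of size one. Anyone who learns that same combination of facts from another source can link the pseudonym to a real person, even though the site never published an identity.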

This isn’t a problem that’s unique to Last.fm in any way. Networked computer systems often leak secrets in unexpected ways, but Last.fm serves as a particularly clear example of why anonymity is hard to achieve.

More on this risk, and what to do about it, after the jump.