Cory Doctorow: “The European Parliament is currently involved in a wrangle over the new General Data Protection Regulation. At stake are the future rules for online privacy, data mining, big data, governmental spying (by proxy), to name a few. Hundreds of amendments and proposals are on the table, including some that speak of relaxing the rules on sharing data that has been ‘anonymised’ (had identifying information removed) or ‘pseudonymised’ (had identifiers replaced with pseudonyms). This is, however, a very difficult business, with researchers showing how relatively simple techniques can be used to re-identify the data in large anonymised data sets, by picking out the elements of each record that make them unique. For example, a recent paper in Nature Scientific Reports showed how the ‘anonymised’ data from a European phone company could be re-identified with 95% accuracy, given only four points of data about each person. To those who say that privacy is dead anyway, I would point out that the reason anonymisation and pseudonymisation are being contemplated in the proposed Regulation is because its authors say doing this will protect privacy – and that means that they’re implying privacy is worth preserving. Indeed, the whole premise of ‘Big Data’ is at odds with the idea that data can be anonymised. After all, Big Data promises that with very large data sets, subtle relationships can be teased out.”
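To make the re-identification point concrete, here is a minimal sketch of the general idea of "picking out the elements of each record that make them unique". It is not the method of the paper mentioned in the quote. The toy `traces` data and the helper names `matching_pseudonyms` and `fraction_unique` are invented for illustration, and the measure used (the share of k-point subsets that match exactly one pseudonym) is an assumption chosen for simplicity.

```python
from itertools import combinations

# Hypothetical pseudonymised records: pseudonym -> set of (place, hour) points.
# The names are gone, but the points themselves are left intact.
traces = {
    "u1": {("A", 8), ("B", 12), ("C", 18)},
    "u2": {("A", 8), ("B", 12), ("D", 18)},
    "u3": {("A", 8), ("E", 12), ("C", 18)},
    "u4": {("F", 9), ("B", 12), ("C", 18)},
}


def matching_pseudonyms(traces, known_points):
    """Return the pseudonyms whose trace contains every known point."""
    known = set(known_points)
    return [p for p, points in traces.items() if known <= points]


def fraction_unique(traces, k):
    """Share of k-point subsets (taken from each trace) that match one trace only."""
    total = 0
    unique = 0
    for points in traces.values():
        # sorted() only makes the iteration order deterministic.
        for subset in combinations(sorted(points), k):
            total += 1
            if len(matching_pseudonyms(traces, subset)) == 1:
                unique += 1
    return unique / total if total else 0.0


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, fraction_unique(traces, k))
    # An observer who knows two points about someone can look them up:
    print(matching_pseudonyms(traces, [("A", 8), ("D", 18)]))
```

Even in this tiny made-up data set, each extra known point narrows the candidates quickly. An observer who knows a couple of facts about a person from elsewhere needs no name in the data to find that person's record.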