Wired: “In 2013, a young computational biologist named Yaniv Erlich shocked the research world by showing it was possible to unmask the identities of people listed in anonymous genetic databases using only an Internet connection. Policymakers responded by restricting access to pools of anonymized biomedical genetic data. An NIH official said at the time, ‘The chances of this happening for most people are small, but they’re not zero.’ Fast-forward five years and the amount of DNA information housed in digital data stores has exploded, with no signs of slowing down. Consumer companies like 23andMe and Ancestry have so far created genetic profiles for more than 12 million people, according to recent industry estimates. Customers who download their own information can then choose to add it to public genealogy websites like GEDmatch, which gained national notoriety earlier this year for its role in leading police to a suspect in the Golden State Killer case. Those interlocking family trees, connecting people through bits of DNA, have now grown so big that they can be used to find more than half the US population. In fact, according to new research led by Erlich, published today in Science, more than 60 percent of Americans with European ancestry can be identified through their DNA using open genetic genealogy databases, regardless of whether they’ve ever sent in a spit kit.”
See also these related articles:
- Heather Murphy, How an Unlikely Family History Website Transformed Cold Case Investigations, N.Y. Times (Oct. 15, 2018)
- Heather Murphy, Most White Americans’ DNA Can Be Identified Through Genealogy Databases, N.Y. Times (Oct. 11, 2018)
- Yaniv Erlich et al., Identity inference of genomic data using long-range familial searches, Science (Oct. 11, 2018)