Researchers Tap Online Genetic Databases To Uncover Identities

Publicly available genetic information found online can be used to identify individuals, according to a study published Thursday in the journal Science, the New York Times reports.

The findings have prompted concerns about the privacy of genome databases.

How Individuals Were Identified

For the study, Yaniv Erlich and colleagues at the Whitehead Institute for Biomedical Research examined data from the 1000 Genomes Project, an international study that is collecting genetic information and posting it online for research purposes.

The researchers used the genome project's data to examine short tandem repeats, which are short DNA sequences repeated back to back, on the Y chromosomes of male participants.

Such repeated DNA patterns are inherited and can be used to determine ancestry. Because the Y chromosome passes from father to son, as surnames typically do, many genealogy websites link the patterns to surnames, making it possible to find the surnames of men who share a given DNA pattern.

Once the researchers pinpointed the short repeated DNA sequences, they searched through online genealogy websites for a match.

By combining the surnames found through the genealogy websites with the ages and locations provided by the genome project, they were able to narrow their results and identify not only the individual, but many members of the individual's extended family (Kolata, New York Times, 1/17).
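The narrowing step described above can be illustrated with a minimal sketch. It is not the researchers' method, which the study does not fully disclose. The record types, marker names (`STR_A` and so on), repeat counts, surnames, ages and location code are all invented placeholders, and the exact-match rule is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenealogyRecord:
    """Hypothetical genealogy-site entry: a surname plus a repeat-count profile."""
    surname: str
    markers: dict  # marker name -> number of repeats


@dataclass(frozen=True)
class Person:
    """Hypothetical public-record entry used for the final narrowing step."""
    name: str
    surname: str
    age: int
    location: str


def candidate_surnames(profile, records, max_mismatches=0):
    """Return surnames whose recorded profile matches on the shared markers."""
    surnames = set()
    for rec in records:
        shared = profile.keys() & rec.markers.keys()
        if not shared:
            continue  # nothing to compare against
        mismatches = sum(profile[m] != rec.markers[m] for m in shared)
        if mismatches <= max_mismatches:
            surnames.add(rec.surname)
    return surnames


def narrow(people, surnames, age, location):
    """Keep only people with a candidate surname and the given age and location."""
    return [
        p for p in people
        if p.surname in surnames and p.age == age and p.location == location
    ]


# Invented placeholder data, for illustration only.
records = [
    GenealogyRecord("Alder", {"STR_A": 14, "STR_B": 24, "STR_C": 10}),
    GenealogyRecord("Birch", {"STR_A": 15, "STR_B": 22, "STR_C": 11}),
]
people = [
    Person("J. Alder", "Alder", 52, "XX"),
    Person("K. Alder", "Alder", 31, "XX"),
    Person("L. Birch", "Birch", 52, "XX"),
]
profile = {"STR_A": 14, "STR_B": 24, "STR_C": 10}

surnames = candidate_surnames(profile, records)
matches = narrow(people, surnames, age=52, location="XX")
```

The sketch shows why the combination is what matters: the surname alone leaves several candidates, and the age and location are what reduce the list to a single person.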

Each complete identification took three to seven hours, according to the study (Gever, MedPage Today, 1/17).

The researchers predicted they could use the genetic and census data to correctly identify the surnames of white, middle- and higher-income U.S. males 12% of the time.

According to the Los Angeles Times, the study does not outline a "complete recipe" that others could follow to identify individuals using genomic data, and it does not reveal the names of the individuals identified (Brown, Los Angeles Times, 1/18).

Privacy Concerns Raised

The researchers shared their findings with NIH prior to the study's publication. After seeing the findings, NIH's National Institute of General Medical Sciences removed age information from its publicly available genomic databases and moved the data to a controlled site.

In an editorial accompanying the Science study, Eric Green, director of the NIH's National Human Genome Research Institute, and colleagues wrote that the study "highlights vulnerabilities in efforts to protect the privacy of participants in genomics ... research" (MedPage Today, 1/17).

Amy McGuire -- a study author and an attorney and ethicist at Baylor College of Medicine -- said, "To have the illusion you can fully protect privacy or make data anonymous is no longer a sustainable position" (New York Times, 1/17).

