Amazon Web Services, Amazon's cloud-computing arm, has created a database to help the public access data from NIH's 1000 Genomes Project, the New York Times' "Bits" reports.
Although the NIH data already are accessible to the public, Adam Selipsky -- vice president of AWS -- said, "Downloading this to your own servers could take weeks to a month, assuming you had the data storage."
To make the information more accessible, AWS is covering the cost of storing the 200 terabytes of data. Researchers can pay AWS to conduct further processing and analysis of the information.
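As a rough illustration of why such a download is slow, the back-of-the-envelope sketch below estimates the transfer time for 200 terabytes. The function name, the decimal definition of a terabyte and the link speeds are assumptions chosen for illustration, not figures from AWS or NIH. It also assumes the link is fully used for the entire transfer.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_terabytes: float, link_megabits_per_s: float) -> float:
    """Estimate days needed to move a data set over a fully used link.

    Assumes decimal units (1 TB = 10**12 bytes, 1 Mbit = 10**6 bits)
    and no protocol overhead, so real transfers would take longer.
    """
    total_bits = size_terabytes * 1e12 * 8
    seconds = total_bits / (link_megabits_per_s * 1e6)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical link speeds, in megabits per second.
    for speed in (100, 500, 1000):
        print(f"{speed} Mbit/s: {transfer_days(200, speed):.1f} days")
```

Under these assumptions, a sustained 1 gigabit-per-second connection would need roughly 18.5 days, and slower links scale proportionally.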
What the Database Contains
The database contains the entirety of NIH's genome survey, which includes genetic information from 1,700 individuals.
Participants in NIH's genome project consented to having their data made available to the public. The genetic data do not include personal information, such as disease histories.
Lisa Brooks -- program director for the Genetic Variation Program of NIH's National Human Genome Research Institute -- said NIH's genome project is "the only public data set like this" and contains "an almost complete set of human genetic variants" (Hardy, "Bits," New York Times, 3/29).