Geneticists Begin Tests of an Internet for DNA

Posted: December 18, 2014 at 3:44 pm

Scientists are starting to open their DNA databases online, creating a network that could pave the way for gene analysis at a new scale.

A coalition of geneticists and computer programmers calling itself the Global Alliance for Genomics and Health is developing protocols for exchanging DNA information across the Internet. The researchers hope their work could be as important to medical science as HTTP, the protocol created by Tim Berners-Lee in 1989, was to the Web.

One of the group's first demonstration projects is a simple search engine that combs through the DNA letters of thousands of human genomes stored at nine locations, including Google's server farms and the University of Leicester, in the U.K. According to the group, which includes key players in the Human Genome Project, the search engine is the start of a kind of Internet of DNA that may eventually link millions of genomes together.

The technologies being developed are application program interfaces, or APIs, that let different gene databases communicate. Pooling information could speed discoveries about what genes do and help doctors diagnose rare birth defects by matching children with suspected gene mutations to others who are known to have them.

The alliance was conceived two years ago at a meeting in New York of 50 scientists who were concerned that genome data was trapped in private databases, tied down by legal consent agreements with patients, limited by privacy rules, or jealously controlled by scientists to further their own scientific work. It styles itself after the World Wide Web Consortium, or W3C, a body that oversees standards for the Web.

"It's creating the Internet language to exchange genetic information," says David Haussler, scientific director of the genome institute at the University of California, Santa Cruz, who is one of the group's leaders.

The group began releasing software this year. Its hope, as yet largely unrealized, is that any scientist will be able to ask questions about genome data possessed by other laboratories, without running afoul of technical barriers or privacy rules.

The researchers felt they had to act because the falling cost of decoding a genome (then about $10,000, and now already closer to $2,000) was producing a flood of data they were not prepared for. They feared ending up like U.S. hospitals, with electronic systems that are mostly balkanized and unable to communicate.

The way genomic data is siloed is becoming a problem because geneticists need access to ever larger populations. They use DNA information from as many as 100,000 volunteers to search for genes related to schizophrenia, diabetes, and other common diseases. Yet even these quantities of data are no longer seen as large enough to drive discovery. "You are going to need millions of genomes," says David Altshuler, deputy director of the Broad Institute in Cambridge and chairman of the new organization. And no single database is that big.

The Global Alliance thinks the answer is a network that would open the various databases to limited digital searches by other scientists. Using that concept, says Heidi Rehm, a Harvard Medical School geneticist, the alliance is already working on linking together some of the world's largest databases of information about the breast cancer genes BRCA1 and BRCA2, as well as nine currently isolated databases containing data about genes that cause rare childhood diseases.
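
To make the idea of a "limited digital search" concrete, here is a minimal sketch of how a query might be sent to several independent databases, each of which answers only yes or no and never hands over a genome. It is an illustration under stated assumptions, not the alliance's actual software: the names `Variant`, `Site`, and `federated_search` are hypothetical, the sites are in-memory stand-ins for remote databases, and the chromosome positions are made up.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variant:
    """A single DNA letter at a position (hypothetical, simplified model)."""
    chromosome: str
    position: int
    allele: str


@dataclass
class Site:
    """Stand-in for one remote database; a real one would sit behind an API."""
    name: str
    variants: set = field(default_factory=set)

    def has_variant(self, query: Variant) -> bool:
        # The site reveals only whether the variant is present,
        # not which genomes carry it or any other patient data.
        return query in self.variants


def federated_search(sites, query):
    """Ask every site the same question and collect the yes/no answers."""
    return {site.name: site.has_variant(query) for site in sites}


# Made-up data for illustration only.
sites = [
    Site("site_a", {Variant("13", 1000, "T")}),
    Site("site_b", {Variant("17", 2000, "A")}),
    Site("site_c", set()),
]

answers = federated_search(sites, Variant("17", 2000, "A"))
print(answers)
```

In this sketch the data never leaves its home database. A scientist learns only which sites hold a matching variant and would then have to approach that site's owners for anything more, which is one way a network could be searchable while still respecting consent agreements and privacy rules.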
