UC Santa Cruz to lead effort to build a new map of human genetic variation

Posted: January 13, 2015 at 4:47 pm

Simons Foundation awards up to $1 million to UC Santa Cruz Genomics Institute to develop a comprehensive Human Genome Variation Map for scientific and medical research

VIDEO: Benedict Paten discusses work on the Human Genome Variation Map at the University of California Santa Cruz Genomics Institute.

Researchers at the UC Santa Cruz Genomics Institute have received a grant for up to $1 million from the Simons Foundation to develop a comprehensive map of human genetic variation. The Human Genome Variation Map will be a critical new resource for both medical research and basic research in the life sciences.

The one-year pilot project aims to overcome the limitations of the current model for analyzing human genome data, which is based on the use of a single reference sequence for the human genome. Essentially, all novel sequencing data is analyzed by mapping new genome sequences to this one reference set of 24 human chromosomes to identify variants. But this approach leads to biases and mapping ambiguities, and some variants simply cannot be described with respect to the reference genome, according to David Haussler, professor of biomolecular engineering and director of the Genomics Institute at UC Santa Cruz.

"One exemplary human genome cannot represent humanity as a whole, and the scientific community has not been able to agree on a single precise method to refer to and represent human genome variants. There is a great deal we still don't know about human genetic variation because of these problems," said Haussler, who will lead the project with co-investigator Benedict Paten, a research scientist at the Genomics Institute.

According to Paten, the proliferation of different genomic databases has resulted in hundreds of specialized coordinate systems and nomenclatures for describing human genetic variation. UC Santa Cruz genomics researchers are intimately familiar with this "Tower of Babel" of databases through their work to display data from all these sources on the widely used UCSC Genome Browser. Launched in July 2000 shortly after UC Santa Cruz posted the first working draft of the human genome sequence on the Internet, the browser now serves 130,000 researchers around the world and gets more than 1 million web page requests per day.

"For now, all our browser staff can do is to serve the data from these disparate sources in their native, mutually incompatible formats," Paten said. "This lack of comprehensive integration, coupled with the over-simplicity of the reference model, seriously impedes progress in the science of genomics and its use in medicine."

Recently, with funding from the Simons Foundation, researchers David Reich and Nick Patterson at the Broad Institute of MIT and Harvard have amassed more than 300 complete human genome sequences representing a range of ethnicities. Haussler and Paten plan to use this set of human genomes, which they say is deeper and more completely organized than any prior human data set, to build a new graph-based human reference genome structure.

"This unique data set of genome diversity gives us an opportunity to define a comprehensive reference genome structure that can be truly representative of human variation. Eventually, we will want to expand it to include many more genomes, but this pilot project will focus on building a map structure based on the Reich-Patterson data set," Paten said.

The new Human Genome Variation Map will replace the current snarl of isolated, incompatible databases of human genetic variation with a single, fundamental representation formalized as a very large mathematical graph. The clean mathematical formulation is a major strength of this new approach, Paten said.
