Whole Genome Analysis, STAT

Newswise: Although the time and cost of sequencing an entire human genome have plummeted, analyzing the resulting three billion base pairs of genetic information from a single genome can take many months.

In the journal Bioinformatics, however, a University of Chicago-based team (working with Beagle, one of the world's fastest supercomputers devoted to life sciences) reports that genome analysis can be radically accelerated. This computer, based at Argonne National Laboratory, is able to analyze 240 full genomes in about two days.

"This is a resource that can change patient management and, over time, add depth to our understanding of the genetic causes of risk and disease," said study author Elizabeth McNally, MD, PhD, the A. J. Carlson Professor of Medicine and Human Genetics and director of the Cardiovascular Genetics Clinic at the University of Chicago Medicine.

"The supercomputer can process many genomes simultaneously rather than one at a time," said first author Megan Puckelwartz, a graduate student in McNally's laboratory. "It converts whole genome sequencing, which has primarily been used as a research tool, into something that is immediately valuable for patient care."

Because the genome is so vast, those involved in clinical genetics have turned to exome sequencing, which focuses on the two percent or less of the genome that codes for proteins. This approach is often useful. An estimated 85 percent of disease-causing mutations are located in coding regions. But the rest, about 15 percent of clinically significant mutations, come from non-coding regions, once referred to as "junk DNA" but now known to serve important functions. If not for the tremendous data-processing challenges of analysis, whole genome sequencing would be the method of choice.

To test the system, McNally's team used raw sequencing data from 61 human genomes and analyzed that data on Beagle. They used publicly available software packages and one quarter of the computer's total capacity. They found that shifting to the supercomputer environment improved accuracy and dramatically accelerated speed.

"Improving analysis through both speed and accuracy reduces the price per genome," McNally said. "With this approach, the price for analyzing an entire genome is less than the cost of looking at just a fraction of the genome. New technology promises to bring the costs of sequencing down to around $1,000 per genome. Our goal is to get the cost of analysis down into that range."

"This work vividly demonstrates the benefits of dedicating a powerful supercomputer resource to biomedical research," said co-author Ian Foster, director of the Computation Institute and Arthur Holly Compton Distinguished Service Professor of Computer Science. "The methods developed here will be instrumental in relieving the data analysis bottleneck that researchers face as genetic sequencing grows cheaper and faster."
