New Software Analyzes Human Genomes Faster than Other Available Technologies, Empowering Population Scale Genomic …

Posted: January 31, 2015 at 4:43 am

Newswise: Investigators at Nationwide Children's Hospital have developed an analysis pipeline that cuts the time it takes to search a person's genome for disease-causing variations from weeks to hours. An article describing the ultra-fast, highly scalable software was published in the latest issue of Genome Biology (http://genomebiology.com/2015/16/1/6/abstract).

"It took around 13 years and $3 billion to sequence the first human genome," says Peter White, PhD, principal investigator and director of the Biomedical Genomics Core at Nationwide Children's and the study's senior author. "Now, even the smallest research groups can complete genomic sequencing in a matter of days. However, once you've generated all that data, that's the point where many groups hit a wall." After a genome is sequenced, scientists are left with billions of data points to analyze before any truly useful information can be gleaned for use in research and clinical settings.

To overcome the challenges of analyzing that volume of data, Dr. White and his team developed a computational pipeline called Churchill. By using novel computational techniques, Churchill allows efficient analysis of a whole-genome sample in as little as 90 minutes.

"Churchill fully automates the analytical process required to take raw sequence data through a series of complex and computationally intensive processes, ultimately producing a list of genetic variants ready for clinical interpretation and tertiary analysis," Dr. White explains. "Each step in the process was optimized to significantly reduce analysis time, without sacrificing data integrity, resulting in an analysis method that is 100 percent reproducible."

The output of Churchill was validated using National Institute of Standards and Technology (NIST) benchmarks. In comparison with other computational pipelines, Churchill was shown to have the highest sensitivity at 99.7 percent, the highest accuracy at 99.99 percent, and the highest overall diagnostic effectiveness at 99.66 percent.
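
For readers unfamiliar with these metrics, the sketch below shows the standard textbook definitions of sensitivity and accuracy computed from a comparison against a benchmark call set. It is an illustration only: the function names and the counts are hypothetical, not taken from the study, and the study's exact formula for "diagnostic effectiveness" is not given here, so it is left out.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of benchmark variants that the pipeline also called."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all assessed positions that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for illustration only; these are NOT study data.
tp, tn, fp, fn = 90, 880, 20, 10

print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.3f}")
```

Because most positions in a genome match the reference, true negatives dominate the accuracy figure, which is one reason accuracy values for variant callers tend to sit much closer to 100 percent than sensitivity values do.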

"At Nationwide Children's, we have a strategic goal to introduce genomic medicine into multiple domains of pediatric research and healthcare. Rapid diagnosis of monogenic disease can be critical in newborns, so our initial focus was to create an analysis pipeline that was extremely fast, but didn't sacrifice clinical diagnostic standards of reproducibility and accuracy," says Dr. White. "Having achieved that, we discovered that a secondary benefit of Churchill was that it could be adapted for population-scale genomic analysis."

By examining computational resource use during the data analysis process, Dr. White's team was able to demonstrate that Churchill was both highly efficient (>90 percent resource utilization) and scaled very effectively across many servers. Alternative approaches limit analysis to a single server and have resource utilization as low as 30 percent. This efficiency and ability to scale make population-scale genomic analysis feasible.

To demonstrate Churchill's capability to perform population-scale analysis, Dr. White and his team received an award from the Amazon Web Services (AWS) in Education Research Grants program that enabled them to successfully analyze phase 1 of the raw data generated by the 1000 Genomes Project, an international collaboration to produce an extensive public catalog of human genetic variation, representing multiple populations from around the globe. Using cloud-computing resources from AWS, Churchill was able to complete analysis of 1,088 whole-genome samples in seven days and identified millions of new genetic variants.
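
As a rough back-of-the-envelope check on what that run implies, the sketch below converts the two reported figures (1,088 genomes, seven days) into aggregate throughput. The function names are illustrative, and the result is an average across all cloud servers working in parallel, not the time to analyze any single genome.

```python
GENOMES = 1088  # whole-genome samples analyzed, as reported above
DAYS = 7        # wall-clock duration of the run, as reported above


def genomes_per_day(genomes: int, days: int) -> float:
    """Aggregate throughput of the whole run."""
    return genomes / days


def average_minutes_per_genome(genomes: int, days: int) -> float:
    """Wall-clock minutes elapsed per completed genome, averaged over the run."""
    return days * 24 * 60 / genomes


print(f"{genomes_per_day(GENOMES, DAYS):.1f} genomes per day")
print(f"{average_minutes_per_genome(GENOMES, DAYS):.1f} minutes per genome on average")
```

The average works out to under ten minutes per genome, well below the 90-minute single-sample figure quoted earlier, which is consistent with many samples being processed concurrently.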
