June 30, 2017
Uncovering rare susceptibility variants that contribute to complex diseases requires large sample sizes and massively parallel sequencing technologies. The resulting data sets, often made up of exome and genome data from tens to hundreds of thousands of individuals, are frequently too large for current analytical tools to process. A team at Baylor College of Medicine, led by Dr. Suzanne Leal, professor of molecular and human genetics, has developed new software called SEQSpark to overcome this processing obstacle. A study on the new technology appears in The American Journal of Human Genetics.
"To handle these large data sets, we built the SEQSpark tool based on the commonly used Spark program, which allows SEQSpark to utilize multiple processing platforms to increase the speed and efficiency of performing data quality control, annotation and rare variant association analysis," Leal said.
To test and validate the versatility and speed of SEQSpark, Leal and her team benchmarked it on whole-genome sequence data from the UK10K project, testing specifically for associations with waist-to-hip ratio.
"The analysis and related tasks took about one and a half hours to complete, in total. This includes loading the data, annotation, principal components analysis and single and rare variant aggregate association analysis for the more than 9 million variants present in this sample set," explained Di Zhang, a postdoctoral associate in the Leal lab at Baylor and first author on the paper.
To evaluate SEQSpark's performance on a larger data set, Leal and the research team generated 50,000 simulated exomes. SEQSpark ran the analysis for a quantitative trait using several variant aggregate association methods in an hour and forty-five minutes.
When compared to other variant association tools, SEQSpark was consistently faster, reducing computation to a hundredth of the time in some cases.
"What is unique about SEQSpark is that it is scalable, and smaller labs can run it without super specific hardware, and it can also be run in a multi-server environment to increase its speed and capacity for large genetic data sets," Zhang said. "It is ideal for large-scale genetic epidemiological studies and is highly efficient from a computational standpoint."
"We see this software as being very useful as the demand for the analysis of massively parallel sequence data grows. SEQSpark is highly versatile, and as we analyze increasingly large sets of rare variant data, it has the potential to play a key role in furthering personalized medicine," Leal said.
In the future, Leal and her team will continue to test and expand SEQSpark's capabilities and will soon be analyzing data sets that have 500,000 samples or more.
More information: Di Zhang et al. SEQSpark: A Complete Analysis Tool for Large-Scale Rare Variant Association Studies using Whole-Genome and Exome Sequence Data, The American Journal of Human Genetics (2017). DOI: 10.1016/j.ajhg.2017.05.017
Read more:
Researchers build SEQSpark to analyze massive genetic data sets - Medical Xpress