'Deep learning' finds autism, cancer mutations in unexplored regions of the genome

Posted: December 18, 2014 at 3:44 pm

PUBLIC RELEASE DATE:

18-Dec-2014

Contact: Lindsay Jolivet, lindsay.jolivet@cifar.ca, 416-971-4876, Canadian Institute for Advanced Research (@cifar_news)

Scientists and engineers have built a computer model that has uncovered disease-causing mutations in large regions of the genome that previously could not be explored. Their method seeks out mutations that cause changes in 'gene splicing,' and has revealed unexpected genetic determinants of autism, colon cancer and spinal muscular atrophy.

CIFAR Senior Fellow Brendan Frey (University of Toronto) is the lead author of a paper describing this work, which appears in the Dec. 18 edition of Science Express. The paper was co-authored by CIFAR Senior Fellows Timothy Hughes (University of Toronto) and Stephen Scherer (The Hospital for Sick Children and the University of Toronto) of the Genetic Networks program. Frey is appointed to both the Genetic Networks program and the Neural Computation & Adaptive Perception program. The research combines the latter group's pioneering work on deep learning with novel techniques in genetics.

Most existing methods examine mutations in segments of DNA that encode protein, which Frey refers to as low-hanging fruit. To find mutations outside of those segments, typical approaches such as genome-wide association studies take disease data and compare the mutations of sick patients to those of healthy patients, seeking out patterns. Frey compares that approach to lining up all the books your child likes to read and checking whether a particular letter occurs more frequently than in other books.

"It doesn't work, because it doesn't tell you why your kid likes the book," he says. "Similarly, genome-wide association studies can't tell you why a mutation is problematic."

But looking at splicing can. Splicing is important for the vast majority of genes in the human body. When mutations alter splicing, genes may produce no protein or the wrong protein, or malfunction in some other way, which could lead to disease.

Frey's team, which includes researchers from engineering, biology and medicine, developed a computer model that mimics how the cell directs splicing by detecting patterns within DNA sequences, known as the 'splicing code'. They then used their system to examine mutated DNA sequences and determine what effects the mutations would have, effectively scoring each mutation. Unlike existing methods, their technique provides an explanation for the effect of a mutation, and it can be used to find mutations outside of segments that code for protein.

To develop the computer model, Frey's team fed experimental data into machine learning algorithms, so as to teach the computer how to examine a DNA sequence and output the splicing pattern.
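The idea of scoring a mutation by how much it changes a model's predicted splicing output can be illustrated with a small sketch. This is a toy illustration only, not the team's model: the motif weights, the function names (toy_splicing_predictor, apply_substitution, score_mutation) and the example sequence are all hypothetical, and a simple hand-written rule stands in for a model trained on experimental data.

```python
import math

# Hypothetical motif weights. A real model would learn its patterns
# from experimental data rather than use hand-picked values.
TOY_MOTIF_WEIGHTS = {"GGG": 0.8, "TTT": -0.6}


def toy_splicing_predictor(seq: str) -> float:
    """Stand-in for a trained model: map a DNA sequence to a value in (0, 1)."""
    total = 0.0
    for motif, weight in TOY_MOTIF_WEIGHTS.items():
        # Count (possibly overlapping) occurrences of the motif.
        count = sum(
            1
            for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif
        )
        total += weight * count
    # Squash the weighted count with a logistic function.
    return 1.0 / (1.0 + math.exp(-total))


def apply_substitution(seq: str, position: int, new_base: str) -> str:
    """Return a copy of seq with the base at a 0-based position replaced."""
    if not 0 <= position < len(seq):
        raise ValueError("position is outside the sequence")
    return seq[:position] + new_base + seq[position + 1:]


def score_mutation(reference: str, position: int, new_base: str,
                   predictor=toy_splicing_predictor) -> float:
    """Score a mutation as predicted output for mutant minus reference."""
    mutant = apply_substitution(reference, position, new_base)
    return predictor(mutant) - predictor(reference)


if __name__ == "__main__":
    reference = "ACGGGTAC"  # made-up sequence, for illustration only
    # Replacing the middle G breaks the only "GGG" motif.
    print(score_mutation(reference, 3, "A"))
```

In this sketch, a score far from zero means the substitution changes the predicted output a lot, while a score near zero means the toy predictor sees little difference. Because the score is computed from the sequence alone, the same function applies to any position, including positions outside protein-coding segments.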
