Deep Learning Shows How Genetic Motifs Conduct the Music of Life – Technology Networks

Posted: January 29, 2021 at 11:38 am

Our genetic codes control not only which proteins our cells produce, but also to a great extent in what quantity. This ground-breaking discovery, applicable to all biological life, was recently made by systems biologists at Chalmers University of Technology, Sweden, using supercomputers and artificial intelligence. Their research, which could also shed new light on the mysteries of cancer, was recently published in the scientific journal Nature Communications.

DNA molecules contain instructions for cells for producing various proteins. This has been known since the middle of the last century when the double helix was identified as the information carrier of life.

But until now, the factor which determines what quantity of a certain protein will be produced has been unclear. Measurements have shown that a single cell can contain anything from a few molecules of a given protein, up to tens of thousands.

With this new research, our understanding of the mechanisms behind this process, known as gene expression, has taken a big step forward. The group of Chalmers scientists have shown that most of the information for quantity regulation is also embedded in the DNA code itself. They have demonstrated that this information can be read with the help of supercomputers and AI.

"You could compare this to an orchestral score. The notes describe which pitches the different instruments should play. But the notes alone do not say much about how the music will sound," explains Aleksej Zelezniak.

"Information for the tempo and dynamics of the music is also required, for example. But instead of written instructions such as allegro or forte in connection with the notation, the language of genetics spreads this information over large areas of the DNA molecule. Previously, we could read the notes, but not how the music should be played. Now we can do both," he says.

"Another comparison could be that now we have found the grammar rules for the genetic language, where perhaps before we only knew the vocabulary."

What then is this grammar, which determines the quantity of gene expression? According to Aleksej Zelezniak, it takes the form of recurring patterns and combinations of the four "notes" of genetics: the molecular building blocks designated A, C, G and T. These patterns and combinations are known as motifs.

The crucial factors are the relationships between these motifs: how often they repeat and at exactly which positions in the DNA code they appear.

"We discovered that this information is distributed over both the coding and non-coding parts of DNA, meaning it is also present in the areas that used to be referred to as junk DNA."
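The two quantities mentioned above, how often a motif repeats and where it appears, can be pictured with a toy example. The sketch below is only an illustration: the sequence, the motif and the helper name `motif_positions` are made up here and are not taken from the study, which learned its motifs from data rather than searching for a fixed string.

```python
def motif_positions(sequence, motif):
    """Return the 0-based start index of every match, overlapping ones included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical sequence and motif, for illustration only.
sequence = "GGTATAAAGCCTATAAATTG"
motif = "TATAAA"

hits = motif_positions(sequence, motif)
print(f"{motif} occurs {len(hits)} times, at positions {hits}")
```

Counting exact matches like this is far simpler than what the researchers describe, but it shows the kind of information, repeat count and position, that the article refers to.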

The researchers tested the method in seven different model organisms, from yeast and bacteria to fruit flies, mice, and humans, and found that the mechanism is the same. The discovery they have made is universal, valid for all biological life.

According to Aleksej Zelezniak, the discovery would not have been possible without access to state-of-the-art supercomputers and AI. The research group conducted huge computer simulations both at Chalmers University of Technology and at other facilities in Sweden.

"This tool allows us to look at thousands of positions at the same time, creating a kind of automated examination of DNA. This is essential for being able to identify patterns from such huge amounts of data."

Jan Zrimec, postdoctoral researcher in the Chalmers group and first author of the study, agrees, saying:

"With previous technologies, researchers had to tell the system which motifs in the DNA code to search for. But thanks to AI, the system can now learn on its own, identifying different motifs and motif combinations relevant to gene expression."

He adds that the discovery is also due to the fact that they examined a much larger part of the DNA in a single sweep than had previously been done.
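The article does not describe the model's architecture. One common way for deep learning systems to read DNA, assumed here purely for illustration, is to encode each base as a four-element vector and slide small weight filters along the sequence, so that each filter acts as a motif detector. The sketch below shows that scanning step with a hand-written filter; the names `one_hot` and `scan` and the example sequence are hypothetical, and in a trained network the filter weights would be learned from data rather than typed in.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode each base as a 4-element 0/1 vector in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence]


def scan(sequence, filt):
    """Slide a filter (one row of 4 weights per position) along the sequence.

    Returns one score per window; a higher score means the window is a
    closer match to the pattern the filter encodes.
    """
    encoded = one_hot(sequence)
    width = len(filt)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * filt[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return scores


# Hand-written filter that prefers "TATA" (an assumption for this example).
tata_filter = one_hot("TATA")
scores = scan("GGTATAAC", tata_filter)
best = max(range(len(scores)), key=scores.__getitem__)
print(f"best window starts at position {best} with score {scores[best]}")
```

Because every window is scored in the same pass, many filters can be applied across a long stretch of DNA at once, which is in the spirit of the "thousands of positions at the same time" described above.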

The new knowledge could also make it possible to better understand how mutations can affect gene expression in the cell and therefore, eventually, how cancers arise and function. The applications which could most rapidly be significant for the wider public are in the pharmaceutical industry.

"It is conceivable that this method could help improve the genetic modification of the microorganisms already used today as biological factories, leading to faster and cheaper development and production of new drugs," he speculates.

Reference: Zrimec J, Börlin CS, Buric F, et al. Deep learning suggests that gene expression is encoded in all parts of a co-evolving interacting gene regulatory structure. Nat Commun. 2020;11(1):6141. doi:10.1038/s41467-020-19921-4.

This article has been republished from the following materials. Note: material may have been edited for length and content. For further information, please contact the cited source.
