DNA is the future for data storage. That future is coming very soon. – SynBioBeta

Posted: October 16, 2019 at 5:01 pm

We have read how DNA data storage is about to go viral. At SynBioBeta 2019, this became even clearer when a panel of leaders in the field forecast that the cost of storing information in DNA could drop to $100 per terabyte in as little as five years, given the right investment. While challenges remain in automating the DNA reading and writing process, experts are increasingly leaning towards DNA as a long-term information storage solution, particularly for archiving culturally significant data.

"A lot of our interactions every day are mired in data transfer," said Henry Lee, co-founder of Kern Systems.

From cat memes to satellite photos, the amount of data we're generating worldwide is growing exponentially. The technologies for storing that data are not advancing as quickly. Fortunately, nature has evolved its own elegant solution for information storage: DNA.

DNA stores all of the information required to make a human or a plant in an incredibly tiny package. A small but growing group of scientists is now working to replicate that storage strategy to preserve digital data.

"It's all based on translating bits into bases," said Karin Strauss, Principal Research Manager at Microsoft. Every two bits of information translate into one of the four DNA nucleotides. Once the sequences are mapped out in software, the DNA is synthesized.

The other half of the DNA storage equation is recovering the bits (or reading the DNA) via sequencing.
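As a rough illustration of the bits-to-bases idea Strauss describes, the Python sketch below turns every two bits into one nucleotide and then reverses the process. The specific assignment (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for this example; the panel did not specify an encoding, and real systems typically layer further constraints and error correction on top.

```python
# Hypothetical two-bit mapping; the article only says that every two bits
# become one of the four nucleotides, not which bits map to which base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of nucleotides, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

Each byte becomes four bases, so under this simple scheme a terabyte of data would correspond to roughly four trillion nucleotides before any redundancy is added.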

"Now that we know how to read DNA, we'll always be able to read it, so it's an eternally relevant means of data storage," said Strauss.

DNA synthesis and DNA sequencing technologies were not designed for writing and reading digital information. "A lot of energy has gone into making perfect DNA," said Bill Peck, Chief Technology Officer at Twist Bioscience. "But we might be able to resolve error-ridden sequences using good software."

"Essentially when we're making DNA, we're actually making millions and millions of the same molecules at the same time," said Lee. "In a data storage system, you can use that as redundancy." Data scientists use algorithms to encode redundancy in digital media storage devices like DVDs. That redundancy can be used to correct errors.
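One simple way to picture how many copies of the same molecule can act as redundancy is a majority vote across sequencing reads. The sketch below is only an illustration under strong assumptions: every read has the same length and errors are substitutions only. The `consensus` function name and the sample reads are made up for the example, and this is a plain repetition scheme rather than the error-correcting codes used on DVDs or in actual DNA storage pipelines.

```python
from collections import Counter
from typing import Sequence


def consensus(reads: Sequence[str]) -> str:
    """Return the most common base at each position across all reads.

    Assumes equal-length reads with substitution errors only
    (no inserted or deleted bases).
    """
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) walks the reads column by column, one position at a time.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )


# Five noisy copies of the same strand; three contain a single wrong base.
reads = ["CAGACGGC", "CAGTCGGC", "CAGACGGC", "GAGACGGC", "CAGACGCC"]
print(consensus(reads))  # CAGACGGC
```

Because each error lands at a different position in a different copy, the correct base still wins the vote everywhere, which is the intuition behind tolerating imperfect synthesis and sequencing.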

"We very much can tolerate errors in the DNA and we are willing to give up on some quality for other benefits," said Strauss. "The beauty of computer science is that we can still recover the data bit by bit."

So far, the process isn't cheap.

When DNA is synthesized, it's essentially printed onto silicon chips, and silicon is expensive. Twist is pushing the limits on how much DNA you can print on a single chip, said Peck, but that innovation is also expensive. The panelists almost unanimously agreed that significant investments are required to make DNA-based data storage a practical reality.

Another significant cost involved in the writing-storage-reading workflow is labor.

"There are writers and readers that are fully automated today," said Strauss, "but the entire process is not automated." Everything between DNA synthesis and sequencing, such as preparing sequencing libraries, is still done by hand. Liquid handling robots can help, but Strauss's team is trying to find ways to automate more affordably, so that the entire process is scalable.

Kern Systems and Molecular Assemblies are working to make synthesis more scalable by innovating the manufacturing process. They're focused on enzyme-based synthesis, a change in paradigm from the chemical-based methods we've been using for the last 30 years.

Investment is an issue here too.

"We're trying to come up with the ink that will drive the printer to write DNA," said Bill Efcavitch, cofounder of Molecular Assemblies, "but we're going to need partnerships to engineer those enzymes at scale."

While increased investments are needed to make DNA-based data storage practical at scale, Lee predicts people will start using the technology within the next 2-3 years.

Government agencies could be early customers, said Efcavitch, because they need to store massive amounts of data for long periods of time.

Peck and Strauss agreed that the first use of the technology will likely be archival. "There is a lot of intrinsic value in figuring out how to store culturally significant information like music for millennia," said Peck.

Down the line, Lee hopes to see the technology in many more hands. "We're interested in how we can miniaturize this," he said. If the technology isn't siloed, then he expects that biohackers will help build additional apps.

Fundamentally, storing digital information in DNA is a very simple idea. When you begin to imagine how the technology might be used in the real world, it gets a lot more complicated.

For instance, when it comes to actually retrieving information that is stored in DNA, you probably don't want to have to sequence an entire library. We need to develop the DNA equivalent of a digital search function. Strauss's team is using machine learning to develop search capabilities within molecules.

So far, the focus has been on cold data: data that doesn't need to be accessed very often. "DNA sequencers until recently were based on batch processes," said Strauss. "But new sequencing technologies such as Oxford Nanopore's are more real-time." Real-time sequencing is a step in the hot data storage direction, but we still have a long way to go.

"The digital storage world is so new, we really don't know what it's going to look like in 5 years," said Peck. "Ironically, digital storage is also relatively new, but now things aren't considered archived until they're digitized, so the technology might move faster than we think."

When it comes to hot storage (the kind of instantaneous, on-demand access to data that flash drives provide), "the best way to make things happen is to tell a bunch of scientists and engineers that it's impossible," said Peck. "So, it's impossible."

Here is the original post:
DNA is the future for data storage. That future is coming very soon. - SynBioBeta
