AUSTIN (KXAN) – A team at the University of Texas at Austin that is researching how to store data on DNA has uncovered a new method for recovering damaged data. They are among the researchers around the world, including teams working for tech giants Microsoft and Google, who are developing new methods to store data on DNA.

How does DNA data storage work?

Data is stored on computers by converting an item, say a picture, into a sequence of ones and zeroes. Computers are able to read that sequence and recreate the picture. Conventional data storage, such as the magnetic hard drives and flash memory in our computers and phones, holds that sequence.

DNA data storage works in much the same way. Instead of converting a picture into ones and zeroes, it is converted into a sequence of the letters C, A, T and G. These letters represent the nucleotides that make up the rungs of the ladder in a DNA strand. Using this sequence, scientists are able to make a DNA strand in a lab.
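To make the idea concrete, here is a minimal sketch in Python of how ones and zeroes could be turned into DNA letters and back. The mapping of two bits to each letter (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are assumptions chosen for illustration; the article does not say which encoding the researchers use, and real schemes are more elaborate.

```python
# Hypothetical mapping of two bits per DNA letter, chosen for illustration only.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of the letters A, C, G and T."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of DNA letters back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Oz")
print(strand)          # CATTCTGG
print(decode(strand))  # b'Oz'
```

Each byte becomes four letters under this mapping, so the two-character message "Oz" turns into an eight-letter sequence that a lab could, in principle, synthesize.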

“There are now companies where you can type in all four letters of the genetic code into a little text window and they will ship you a little envelope with your DNA,” says Ilya Finkelstein, an associate professor of molecular biology at the University of Texas.

The advantages of DNA storage

Why are researchers around the world working on DNA storage? First, it’s durable. DNA can last thousands of years. Second, it’s universal. The DNA we have in the United States is the same DNA around the world. Third, it’s already encrypted. You need a sequencer to read it.

Finally, it’s compact. You can store two billion gigabytes of data in a gram of DNA. “As long as it’s in a cold airtight dark room, you don’t need a massive server farm with the cooling and carbon footprint associated with it,” says Professor Finkelstein.

In June 2019, scientists reported they had successfully stored all of Wikipedia on a single strand of DNA. As part of UT’s research, the team stored a copy of The Wizard of Oz, translated into the universal language Esperanto, onto a strand of DNA.

University of Texas’ research in DNA data storage

The team of researchers at UT, consisting of Finkelstein, William Press, Stephen Jones and John Hawkins, explored making DNA data storage more viable. DNA code is prone to errors. In conventional data storage, errors usually occur when a one or zero is flipped, which can be corrected easily. In DNA data storage, an error usually means a letter is missing. This can cause the rest of the code to shift and mean something completely different.
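The difference between the two kinds of error can be seen in a small Python sketch. It assumes a hypothetical mapping of two bits to each DNA letter (A=00, C=01, G=10, T=11), which is not taken from the UT research, and compares what happens when one letter is swapped with what happens when one letter is lost.

```python
BASES = "ACGT"  # hypothetical mapping: A=00, C=01, G=10, T=11


def encode(text: str) -> str:
    """Turn ASCII text into a string of DNA letters, two bits per letter."""
    bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
    return "".join(BASES[int(bits[i:i + 2], 2)] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Turn DNA letters back into text, dropping any incomplete final byte."""
    bits = "".join(f"{BASES.index(base):02b}" for base in strand)
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("latin-1")


original = encode("DOROTHY")

# Swap one letter (the "A" at position 5 becomes "C"): like a flipped bit.
substituted = original[:5] + "C" + original[6:]

# Lose one letter (the same position): everything after it shifts.
deleted = original[:5] + original[6:]

print(decode(original))     # DOROTHY
print(decode(substituted))  # D_ROTHY
print(decode(deleted))      # D}I=Q!
```

The swapped letter damages a single character, while the missing letter scrambles every character that follows it, which is why missing or added letters are so much harder to recover from.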

The UT team developed an algorithm that can detect when a piece of DNA code is missing or when one has been added. With this algorithm, less redundancy is needed within the code, meaning more data can be stored on a strand of DNA and it takes less time for a sequencer to read the strand.

DNA data storage is years away from becoming common. Currently, it is expensive to manufacture and sequence DNA code. Additionally, software that can read the code is still being developed. According to Professor Finkelstein, DNA data storage is at least 20 years away from appearing in consumer products.