
DNA storage to tackle massive archives

DNA storage could provide dense, flexible, more sustainable form factors for archival use cases. One company plans to commercialize its offering by 2026.

As storage demands increase and as companies look for ways to reduce their carbon footprint, new products that offer an alternative to traditional storage systems might find enterprise interest.

One such alternative is DNA storage, which reduces data to binary code and then translates that code into a sequence that can be written to synthetic DNA. This results in a dense form factor that is highly durable and persistent.


Startup Iridia Inc., founded in 2016 and headquartered in Carlsbad, Calif., is planning to commercialize its DNA storage-as-a-service offering for archives and cold data storage in 2026. Murali Prahalad, president and CEO of Iridia, said the compact storage service his company is developing will offer a lower total cost of ownership and smaller carbon footprint compared with tape and hard drives.

In this Q&A, Prahalad talks about what DNA storage is, how it stacks up against other enterprise storage media and why Iridia is focusing on selling a service rather than selling the media outright.

Editor's note: The following interview was edited for length and clarity.

Define DNA storage.

Murali Prahalad: The ability to write any file, not using magnetic particles on disk or tape. The information is now embedded in the structure of a synthetic DNA molecule itself. In both instances, we are storing information on a physical thing. With DNA, we think it is more dense, durable, cost-effective and has many other advantages.

How does it work?

Prahalad: This began with a paper written by George Church, [Yuan Gao and Sriram Kosuri] in 2012. The basic idea is that you can take any file -- whether it's structured or unstructured data -- reduce it to its binary code, and then translate those zeros and ones to the four chemical letters of DNA: A, C, G and T. Synthetic DNA has been made for over three decades for use in molecular biology applications. ...

You can essentially break apart your file into a series of binary [code], then chemically synthesize that binary [code] in the form of DNA. Once done, those strands of DNA can be stored in a range of conditions. ... When you want to read any subset of that information, you can retrieve the files of interest, sequence the contents and reconstruct the binary code in your file.
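To make the write and read path concrete, here is a minimal sketch of the round trip Prahalad describes: file to binary, binary to bases, bases back to the file. The two-bits-per-base mapping and the function names are assumptions chosen for illustration; they are not Iridia's scheme, and production encodings typically add error correction and avoid problematic sequences.

```python
# Hypothetical mapping: two bits per base. Real DNA storage schemes differ.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Reduce bytes to a bit string, then map each bit pair to a base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Map each base back to its bit pair, then regroup the bits into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

In this toy mapping, each byte becomes four bases, and decoding simply reverses the lookup. Synthesis and sequencing, the chemical steps in between, are outside what the sketch models.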

Does this make density the main advantage for DNA storage or are there other advantages?

Prahalad: Density is one advantage, but let's look at energy. You can't only think of operating power -- you also have to think about embedded carbon in the manufacture of the media. ... If you go by operating carbon alone, you get a misinformed view of what the footprint of these technologies really is. Even relative to quote-unquote lower operating energy systems, DNA wins. [Synthesizing DNA storage] is part of a natural process that doesn't require the kind of energy or rare metals that are needed in magnetic media.

The second piece builds into manufacturing and carbon footprint. The half-life of DNA in nature is estimated to be 521 years. ... The average half-life of a hard drive is five years. Part of the problem is that if you put data on a hard drive, you have to remaster it every five years ... and the manufacturers only guarantee the drive for that long. With tape, you have to remaster every 10 years. This constant remaster cycle involves more production and more energy.

For DNA, if you store it in a space devoid of water, oxygen and ozone, it could [last] millions of years, or the life span of the archive, with no need to remaster.
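The remastering argument can be put into rough numbers. The sketch below uses the figures quoted in the interview (a five-year cycle for hard drives, 10 years for tape, a 521-year half-life for DNA in nature). The 100-year retention period, the function names and the simple exponential-decay model are assumptions for illustration only.

```python
import math


def remaster_count(retention_years: float, refresh_interval_years: float) -> int:
    """Copies to fresh media needed after the initial write, within the retention period."""
    return max(0, math.ceil(retention_years / refresh_interval_years) - 1)


def fraction_intact(years: float, half_life_years: float) -> float:
    """Fraction of material remaining under a simple exponential-decay model."""
    return 0.5 ** (years / half_life_years)


RETENTION_YEARS = 100  # assumed archive lifetime, not a figure from the interview

print(remaster_count(RETENTION_YEARS, 5))    # hard drive: 19 remasters
print(remaster_count(RETENTION_YEARS, 10))   # tape: 9 remasters
print(fraction_intact(RETENTION_YEARS, 521)) # DNA at its quoted natural half-life
```

Under these assumptions, a century-long archive on hard drives is rewritten 19 times and on tape nine times, while DNA at its quoted natural half-life would retain most of its material without any rewrite. Controlled storage conditions, as Prahalad notes, would be expected to extend that further.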

A third main advantage is the ease of making copies. It's a very simple process called polymerase chain reaction -- PCR -- to amplify DNA. Today, if I want to replicate an exabyte, I have to get a bunch of 18-wheelers of tape that I have to copy and move. [With PCR,] you can put something like that in a shoebox and create multiple redundant copies anywhere in the world.

How would DNA storage be used in the enterprise?

Prahalad: Archival storage. About 80% of the world's data is stored in tier 2 or tier 3 levels of storage that are currently hard drives and tape. These are write once, read many -- so [it would be used for] archival primarily, and data sets that have to be stored for long periods of time.

Would this be an alternative to or in addition to tape storage for the enterprise?

Prahalad: The market is so large, I think it'd be silly to say that any single technology is going to cover everything. I have never seen markets of this scale where only one solution does everything. Instead, we need to look at where the secular trend will push us. The fundamental movement toward DNA will [happen] not only because it is scalable, if done properly, but [because] there will be more attention on data centers [and their] carbon footprint.

While optical disk has broken its layer limitation, it doesn't have the infrastructure support that tape does with libraries and so has not yet become a popular alternative for enterprise archiving. Are there specific infrastructure or interface connections needed for DNA storage?

Prahalad: If I look at the world of optical disk, magnetic hard drive or magnetic tape, there is a fundamental separation between the media, the device and the data. In the world of DNA, the data and the media are fundamentally the same thing. DNA doesn't require a particular shape in order to maintain its memory holding capacity. Once you synthesize that DNA, you can store it in this hypercompact form that takes almost any shape.

Talk about DNA's performance. For reference, what product on the market now could you compare it to?

Prahalad: Take Amazon Glacier, where you have valuable data that you want to store for a long time. If I store in a service like Glacier, depending on the service level, it takes 24 to 48 hours to recover my data. Customers don't care if the data is on tape or hard drive. If I can guarantee the same service-level agreement with DNA, no one is going to [be bothered], as long as there is fidelity.

What [Iridia] wants to build is an enterprise-level, viable, profitable storage-as-a-service business that can meet or beat existing service-level agreements today. If I'm storing in Glacier, I'm not asking what my read or write time on a drive is. I just want to know what's there. When I want to access the data, within 24 to 48 hours, it is there at my fingertips.

Some people assume that just because you have a certain device standard in hard drives and tape, those performance metrics must be met by DNA in exactly the same way. I think that's a bit of a [misconception].

Tape and optical disk have a low price per terabyte. How does DNA storage compare?

Prahalad: I think the place for DNA is as a service. If a customer wants to store data for a certain period of time, we look at total cost of ownership for that time. ... Looking at the way storage systems work, the current [as-a-service] incumbents charge to write data to their systems, they charge exorbitantly to store over time, and then if you want to remove your data, they charge an egress fee. We will charge you a little bit more to write, but it will still be five to 10 times less than your current total cost of ownership for that period. If you ever want to remove your data, we'll charge the same fees as the other guys and give you the same access in 24 to 48 hours.

Adam Armstrong is a TechTarget Editorial news writer covering file and block storage hardware and private clouds. He previously worked at StorageReview.com.
