The exploratory PEPR program MoleculArXiv, managed by the CNRS, is developing new DNA data storage devices. For Patrick Dufour, Director of Strategy and Alliances at Stordata, the outcome of this program promises exciting innovation, addressing several already pressing bottlenecks. A priority research project which, in his view, is one of the most important to support today.
No one has missed the scientific research into the possibilities synthetic DNA offers for information storage. The CNRS regularly publishes its progress, and there is reason to crow, judging by the apparent advances of researchers in the field. It will be interesting to follow the results of the first concrete applications. Partnerships with French institutions such as the INA and the BnF, or with European bodies, will make it possible to move beyond laboratory tests and work with the material synthetic DNA was designed for: real, massive volumes of data.
Promising as it is, DNA storage deserves this enthusiasm on its own merits. It should help solve a number of problems that have already arisen: cloud saturation, excessive land consumption, particularly by urban data centers, and their contribution to the depletion of energy resources, but also the limited shelf life of storage media.
Even if major cloud service providers pursue some form of carbon neutrality, in particular by purchasing enough green energy, and even if companies favor resource pooling, the ecological balance sheet will remain in the red given the perpetual growth of stored data.
Synthetic DNA promises power consumption up to 2,500 times lower than that of an SSD, down to consuming nothing at all in cold-archive mode when no access is needed. It requires four times less space than disks or tapes, and its durability is, on a human scale, unlimited. Enough to preserve the entire memory of humanity to come.
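To give a sense of scale, a back-of-envelope calculation can be sketched from the article's 2,500x figure. Note that the SSD baseline below is a purely hypothetical assumption chosen for illustration, not a measured value:

```python
# Illustrative energy comparison for a powered archive.
# The 2,500x factor comes from the article; the SSD baseline
# (kWh per TB per year) is a hypothetical placeholder.
ssd_kwh_per_tb_year = 25.0
dna_kwh_per_tb_year = ssd_kwh_per_tb_year / 2500  # 2,500x lower

archive_tb = 1000  # a 1 PB archive
ssd_total = ssd_kwh_per_tb_year * archive_tb
dna_total = dna_kwh_per_tb_year * archive_tb

print(f"SSD archive: {ssd_total:,.0f} kWh/year")
print(f"DNA archive: {dna_total:,.2f} kWh/year")
```

Whatever baseline one picks, the ratio stays constant: the savings scale linearly with the size of the archive, and drop to zero consumption once the archive is truly cold.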
Covering the need for long-term cold storage
However, we are not there yet. A strand of DNA is not yet within reach of businesses and archivists, as scientists are still working to speed up write rates and to obtain encodings robust enough to withstand time and the damaging vagaries that alter data.
In the current state of the technology, data integrity is not guaranteed: there is no assurance that data is read exactly as it was written, or written exactly in the form intended. Storage arrays have always had mechanisms for this, and DNA encoding algorithms must provide them too in order to be truly usable.
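The kind of mechanism meant here can be illustrated with a minimal sketch: map bytes onto the four nucleotides and append a checksum so that any alteration of the strand is detected on read-back. This is a toy scheme for illustration only, not the encoding used by MoleculArXiv or any real DNA storage system:

```python
import zlib

BASES = "ACGT"  # 2 bits of information per nucleotide

def encode(data: bytes) -> str:
    """Append a CRC32 integrity check, then map each 2-bit pair to a base."""
    payload = data + zlib.crc32(data).to_bytes(4, "big")
    return "".join(BASES[(b >> s) & 0b11] for b in payload for s in (6, 4, 2, 0))

def decode(strand: str) -> bytes:
    """Map bases back to bytes and verify the trailing CRC32."""
    bits = [BASES.index(c) for c in strand]
    payload = bytes(
        (bits[i] << 6) | (bits[i + 1] << 4) | (bits[i + 2] << 2) | bits[i + 3]
        for i in range(0, len(bits), 4)
    )
    data, crc = payload[:-4], payload[-4:]
    if zlib.crc32(data) != int.from_bytes(crc, "big"):
        raise ValueError("integrity check failed: strand was altered")
    return data

strand = encode(b"hello archive")
assert decode(strand) == b"hello archive"
```

A single corrupted base flips two bits of the payload, which the checksum catches; real systems go further and add error-correcting codes so that such damage can be repaired, not just detected.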
Moreover, DNA storage is not designed for production data, and certainly not for transactional data (access is by no means instantaneous). The long shelf life it promises makes it suitable for cold storage. The lifespan of a flash-based SSD is estimated at around ten years under ideal storage conditions, bearing in mind that flash memory must also be refreshed regularly. The same "punishment" applies to magnetic tapes, which tape libraries must re-read regularly (every two or three years) to keep them readable. It is a process that remains costly in time and energy.
The promise of DNA is precisely to overcome this risk of data loss over very long periods. Although this prospect has not yet been fully confirmed, all current work builds on the initial success of sequencing the genome of mammoths discovered in the Siberian permafrost, preserved for 4,000 years.
Waiting for DNA storage
Pressed on all sides to accelerate environmental initiatives while being encouraged to collect more data, use it, share it better, and work out how to monetize it, companies are caught between contradictory injunctions.
Today, only two out of ten companies can be considered data-centric, meaning they have truly transformed the way they operate and innovate through the prism of live data flowing across their entire ecosystem. For the vast majority who have not yet reached that stage, the challenge is to retain data while learning how to process it. Yet hoarding almost always ends up being a heavy financial burden, leading to trade-offs that can prove unfortunate. This is why hard disks (via object storage) and tape have never said their last word: these two media still carry the lowest bill, despite everything.
Flash memory, though more expensive to purchase, allows a clear reduction in power requirements, takes up less physical space, and gives off less heat. Given its price, every storage array vendor worthy of the name has also introduced data-reduction features (compression, deduplication, similarity detection, etc.) that shrink the footprint of stored data, cutting the need for storage space and power, all boosted by the performance of new processors.
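The principle behind these data-reduction features can be sketched in a few lines: store each distinct block only once (deduplication), and compress what remains. This is a simplified illustration of the idea, not the pipeline of any particular vendor:

```python
import hashlib
import zlib

def reduced_size(blocks: list[bytes]) -> int:
    """Bytes needed after deduplication (by content hash) plus compression."""
    unique: dict[bytes, bytes] = {}
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest not in unique:           # dedup: keep one copy per distinct block
            unique[digest] = zlib.compress(block)
    return sum(len(c) for c in unique.values())

# Highly redundant data: 10 blocks, only 2 distinct contents.
blocks = [b"A" * 4096] * 8 + [b"B" * 4096] * 2
raw = sum(len(b) for b in blocks)          # 40,960 bytes before reduction

print(f"raw: {raw} bytes, reduced: {reduced_size(blocks)} bytes")
```

On redundant workloads such as virtual machine images or backups, this combination routinely shrinks the stored footprint by an order of magnitude, which is how flash vendors offset the higher cost per raw terabyte.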
By contrast, other storage technologies, long under study, have met with less success. One thinks in particular of MRAM (based on magnetic rather than electric charges), which has never really taken off.
Like it or not, capacity and performance remain the watchwords of storage today. Companies face a kind of urgency that exploratory research does not share. We can only support the work under way and call for more coordination and structuring around this project at the European level. Remember that other alliances, particularly in North America, are also quite advanced with other DNA technologies and could outpace French research, even if the two approaches are complementary.