2013 February 7 | Nick Goldman, Paul Bertone, Siyuan Chen, Christophe Dessimoz, Emily M. LeProust, Botond Sipos, Ewan Birney
This study presents a scalable method for storing digital information in synthetic DNA, achieving high capacity with low maintenance. The researchers encoded 739 kB of digital data, including text, scientific papers, photographs, and audio files, into DNA. The encoded DNA was synthesized and sequenced, and the original files were reconstructed with 100% accuracy. The DNA storage system uses error-correcting codes and redundancy to ensure data integrity, and the synthetic DNA is designed to avoid homopolymers (runs of the same base), making it easily identifiable as non-biological.
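To make the homopolymer constraint concrete, here is a minimal Python sketch of one way to write data as DNA so that no base is ever repeated consecutively. It is an illustration under stated assumptions, not the study's exact encoding: the fixed six-trit expansion of each byte, the starting base `A`, and the function names `bytes_to_trits`, `trits_to_bytes`, `encode`, and `decode` are all hypothetical choices made for this example. The idea is that each base-3 digit (trit) selects one of the three bases that differ from the previously written base.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # assumption for this sketch: 3**6 = 729 >= 256


def bytes_to_trits(data: bytes) -> list[int]:
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, remainder = divmod(byte, 3)
            digits.append(remainder)
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def encode(data: bytes, start: str = "A") -> str:
    """Map trits to bases; each trit picks a base different from the last one."""
    prev = start  # hypothetical virtual base preceding the sequence
    sequence = []
    for trit in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]  # always exactly three bases
        prev = choices[trit]
        sequence.append(prev)
    return "".join(sequence)


def decode(dna: str, start: str = "A") -> bytes:
    """Recover the bytes; raises ValueError if a base repeats its predecessor."""
    prev = start
    trits = []
    for base in dna:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits_to_bytes(trits)
```

Because every base is chosen from the three that differ from its predecessor, a repeated base can never appear in the output, and a repeated base in a sequencing read signals an error during decoding. The price is that each base carries at most log2(3), roughly 1.58 bits, rather than 2 bits.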
The DNA storage method is highly efficient, with an encoding efficiency of 88% for megabyte-scale data. Theoretical analysis shows that the system can scale beyond current global data volumes, and the cost per unit of information stored is expected to decrease significantly with technological advances. The study also demonstrates that DNA storage is robust, with error rates increasing only slowly as data volumes grow. The DNA storage medium requires no active maintenance beyond a cold, dry, and dark environment, making it suitable for long-term archival.
The study highlights the potential of DNA storage for long-term, low-access digital archives, such as government and historical records, and scientific data storage systems like CERN’s CASTOR. The researchers estimate that DNA storage could become cost-effective for archives with a 600–5,000-year horizon. With further technological advancements, DNA storage may become practical for sub-50-year archives. The study also notes that DNA storage is a viable solution for digital archiving due to its high storage density, long-term stability, and potential for efficient data copying.