bioRxiv | 2019

Towards Practical and Robust DNA-based Data Archiving by Codec System Named ‘Yin-Yang’


Abstract


Motivation: DNA has been reported as a promising data storage medium because of its remarkable durability and space-efficient storage capacity. Here, we propose a robust DNA-based data storage method built on a new codec algorithm named ‘Yin-Yang’.

Results: Using this strategy, we successfully stored files of different formats in a single synthetic DNA oligonucleotide pool. Compared with most DNA-based data storage coding schemes presented to date, this codec system can efficiently satisfy a variety of user goals (e.g. limiting homopolymer length to at most 3 or 4, keeping GC content balanced between 40% and 60%, and keeping secondary structure simple, with a Gibbs free energy above −30 kcal/mol). We tested the codec in an end-to-end experiment comprising encoding, DNA synthesis, sequencing and decoding, and fully retrieved the original 2.02 megabits (3 files) after sequencing and decoding. Compared with previously reported methods, our strategy shows great potential for achieving high storage capacity per nucleotide (230 PB/gram) and high-fidelity data recovery.
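To make the sequence constraints named in the abstract concrete, the following is a minimal sketch of a screening check for a candidate oligonucleotide. It is not the Yin-Yang codec itself: the function names and default thresholds are assumptions chosen for illustration, and the Gibbs free energy criterion is omitted because it would require an external folding tool.

```python
from itertools import groupby


def max_homopolymer_run(seq: str) -> int:
    """Length of the longest run of identical consecutive bases (0 if empty)."""
    return max((sum(1 for _ in group) for _, group in groupby(seq)), default=0)


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def passes_screen(seq: str, max_run: int = 4,
                  gc_min: float = 0.40, gc_max: float = 0.60) -> bool:
    """Hypothetical screen: homopolymer and GC-content limits only.

    The defaults mirror the ranges quoted in the abstract; the free-energy
    constraint is deliberately not checked here.
    """
    seq = seq.upper()
    return (max_homopolymer_run(seq) <= max_run
            and gc_min <= gc_content(seq) <= gc_max)
```

For example, `passes_screen("ACGTACGTAC")` accepts a sequence with 50% GC and no repeated bases, while a sequence containing a run of five identical bases is rejected under the default `max_run=4`.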

DOI 10.1101/829721
Language English
Journal bioRxiv

Full Text