Overview

Challenge

The demand for data-driven decision making, coupled with the need to retain data to meet regulatory compliance requirements, has resulted in a rapid increase in the amount of archival data stored by enterprises. As the rate of data generation far outpaces the rate of improvement in the storage density of media like HDD and tape, researchers have started investigating new architectures and media types that can store such “cold”, infrequently accessed data at very low cost.

Synthetic DNA

Synthetic DNA is one such storage medium that has received attention recently due to its high density and durability. DNA possesses three key properties that make it relevant for archival storage. First, it is an extremely dense three-dimensional storage medium that has the theoretical ability to store 455 Exabytes in 1 gram; in contrast, a 3.5” hard disk drive today can store 10 Terabytes and weighs 600 grams. Second, DNA can last several centuries even in harsh storage environments; hard disk drives and tape have lifetimes of five and thirty years, respectively. Third, it is very easy, quick, and cheap to perform in-vitro replication of DNA; tape and hard disk drives have bandwidth limitations that result in hours or days for copying large Exabyte-sized archives.
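
The density comparison above can be made concrete with a few lines of arithmetic. The sketch below uses only the figures quoted in the text and assumes decimal units (1 Terabyte = 10^12 bytes, 1 Exabyte = 10^18 bytes); the variable names are illustrative.

```python
# Decimal storage units (an assumption; the text does not say decimal or binary).
TB = 10**12
EB = 10**18

# Figures quoted in the text above.
dna_bytes_per_gram = 455 * EB   # theoretical DNA capacity per gram
hdd_capacity_bytes = 10 * TB    # 3.5-inch hard disk drive capacity
hdd_mass_grams = 600            # 3.5-inch hard disk drive mass

# Storage density of the hard disk drive, in bytes per gram.
hdd_bytes_per_gram = hdd_capacity_bytes / hdd_mass_grams

# How many times denser DNA is than the hard disk drive, by mass.
density_ratio = dna_bytes_per_gram / hdd_bytes_per_gram

print(f"HDD density: {hdd_bytes_per_gram / 1e9:.1f} GB per gram")
print(f"DNA is about {density_ratio:.1e} times denser by mass")
```

Under these assumptions the hard disk drive stores roughly 17 Gigabytes per gram, so the theoretical density of DNA is about ten orders of magnitude higher.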

Proof of concept experiments and results

In this three-year project (€3M funded by the EU), we will research all relevant aspects of DNA storage in a consortium of six partners across three countries (UK, France, Ireland), bringing together all necessary expertise. We will research all relevant technologies, such as encoding different types of data in DNA, scalable DNA synthesis to store data, experimental techniques to manipulate the data, and efficient sequencing and decoding approaches to read back the data, as well as automation of all aspects. The final result will be an end-to-end prototype for storing data in DNA and for reading it back.

In initial work, we have developed OligoArchive, an architecture for using a DNA-based storage system as the archival tier of a relational database. We demonstrate that OligoArchive can be realized in practice by building archiving and recovery tools (pg_oligo_dump and pg_oligo_restore) for PostgreSQL that perform schema-aware encoding and decoding of relational data on DNA, and by using these tools to archive a 12 KB TPC-H database to DNA, perform in-vitro computation, and restore it back again.
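
To give a feel for what encoding digital data in DNA means, the following is a minimal sketch of the simplest possible mapping: two bits per nucleotide. It is not the schema-aware encoding used by pg_oligo_dump and pg_oligo_restore; the function names and the bit-to-base assignment are assumptions made for illustration. Practical encoders must also respect biological constraints (for example avoiding long runs of the same base), split data into short addressed oligos, and add error correction, none of which is shown here.

```python
# Hypothetical bit-pair to nucleotide mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Map every byte to four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): every four nucleotides become one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"TPC-H")
print(strand)
print(decode(strand))
```

Each byte becomes four nucleotides, so under this naive scheme a 12 KB database would need on the order of 48,000 nucleotides before any addressing or redundancy is added.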

Our initial results are summarised in the paper available here.

A factsheet/summary of the project can be found here.

People

Thomas Heinis

Imperial College

PI & Coordinator

Raja Appuswamy

Eurecom

PI

James MacDonald

Imperial College

PI

Paul Freemont

Imperial College

PI

Pascal Barbry

Universite Nice & CNRS

PI

Marc Antonini

Universite Nice & CNRS

PI

Sachin Chalapati

Helixworks

PI

Nimesh Pinnamaneni

Helixworks

PI

CONSORTIUM PARTNERS

Publications

2023

Yan, Yiqing; Pinnamaneni, Nimesh; Chalapati, Sachin; Crosbie, Conor; Appuswamy, Raja

Scaling Logical Density of DNA storage with Enzymatically-Ligated Composite Motifs Journal Article

In: bioRxiv, 2023.

2022

Pic, Xavier; Antonini, Marc

A constrained Shannon-Fano entropy coder for image storage in synthetic DNA Proceedings Article

In: European Signal Processing Conference (EUSIPCO 2022), IEEE, 2022.

Yan, Yiqing; Chaturvedi, Nimisha; Appuswamy, Raja

Optimizing the Accuracy of Randomized Embedding for Sequence Alignment Proceedings Article

In: International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2022.

Appuswamy, Raja

Towards Passive, Migration-Free, Standardized, Long-Term Database Archival Journal Article

In: SIGMOD Rec., vol. 51, no. 2, 2022.

Marinelli, Eugenio; Yan, Yiqing; Magnone, Virginie; Dumargne, Marie-Charlotte; Barbry, Pascal; Heinis, Thomas; Appuswamy, Raja

OligoArchive-DSM: Columnar Design for Error-Tolerant Database Archival using Synthetic DNA Journal Article

In: bioRxiv, 2022.

2021

Sella, Omer; Apelbaum, Amir; Heinis, Thomas; Quah, Jasmine; Moore, Andrew S W

DNA archival storage, a bottom up approach Conference

ACM Workshop on Hot Topics in Storage and File Systems, 2021.

Yan, Yiqing; Chaturvedi, Nimisha; Appuswamy, Raja

Accel-Align: a fast sequence mapper and aligner based on the seed-embed-extend method Journal Article

In: BMC Bioinformatics, vol. 22, no. 1, pp. 257, 2021, ISSN: 1471-2105.

Antonio, Eva Gil San; Heinis, Thomas; Carteron, Louis; Dimopoulou, Melpomeni; Antonini, Marc

Nanopore Sequencing Simulator for DNA Data Storage Journal Article

In: Visual Communications and Image Processing (VCIP 2021), 2021.

Marinelli, Eugenio; Ghabach, Eddy; Bolbroe, Thomas; Sella, Omer; Heinis, Thomas; Appuswamy, Raja

DNA4DNA: Preserving Culturally Significant Digital Data with Synthetic DNA Journal Article

In: 17th International Conference on Digital Preservation (iPRES 2021), 2021.

Marinelli, Eugenio; Ghabach, Eddy; Bolbroe, Thomas; Sella, Omer; Heinis, Thomas; Appuswamy, Raja

Digital Preservation with Synthetic DNA Journal Article

In: 37eme Conference sur la Gestion de Donnees – Principes, Technologies et Applications (BDA 2021), 2021.

Marinelli, Eugenio; Appuswamy, Raja

XJoin: Portable, Parallel Hash Join across Diverse XPU Architectures with OneAPI Proceedings Article

In: International Workshop on Data Management on New Hardware (DaMoN 2021), 2021.

Marinelli, Eugenio; Appuswamy, Raja

OneJoin: Cross-architecture, scalable edit similarity join for DNA data storage using oneAPI Conference

International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, 2021.

Chalapati, Sachin; Crosbie, Conor; Limbachiya, Dixita; Pinnamaneni, Nimesh

Direct oligonucleotide sequencing with nanopores Journal Article

In: Open Research Europe, vol. 1, pp. 47, 2021.

Antonio, Eva Gil San; Dimopoulou, Melpomeni; Antonini, Marc; Barbry, Pascal; Appuswamy, Raja

Decoding Of Nanopore-Sequenced Synthetic DNA Storing Digital Images Proceedings Article

In: 2021 IEEE International Conference on Image Processing (ICIP), 2021.

Dimopoulou, Melpomeni; Antonini, Marc; Barbry, Pascal; Appuswamy, Raja

Image storage onto synthetic DNA Journal Article

In: Signal Processing: Image Communication, 2021.

Dimopoulou, Melpomeni; Antonio, Eva Gil San; Antonini, Marc

A JPEG-based image coding solution for data storage on DNA Miscellaneous

2021.

Franzese, Giulio; Yan, Yiqing; Serra, Giuseppe; D'Onofrio, Ivan; Appuswamy, Raja; Michiardi, Pietro

Generative DNA: Representation Learning for DNA-based Approximate Image Storage Proceedings Article

In: International Conference on Visual Communications and Image Processing (VCIP), pp. 1-5, 2021.

2020

Dimopoulou, Melpomeni; Antonini, Marc

Efficient Storage of Images onto DNA Using Vector Quantization Journal Article

In: 2020.

Dimopoulou, Melpomeni; Antonini, Marc; Barbry, Pascal; Appuswamy, Raja

Storing Digital Data into DNA: A Comparative Study of Quaternary Code Construction Proceedings Article

In: ICASSP, Barcelona, Spain, 2020.

2019

Appuswamy, Raja; Brigand, Kevin Le; Barbry, Pascal; Antonini, Marc; Madderson, Olivier; Freemont, Paul; McDonald, James; Heinis, Thomas

OligoArchive: Using DNA in the DBMS Storage Hierarchy Proceedings Article

In: CIDR 2019, 9th Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, 2019.

Buterez, David; Heinis, Thomas

Efficient Approximation of Sequence Hybridization Proceedings Article

In: DNA Computing and Molecular Programming, 2019, ISBN: 978-3-030-26807-7.

Ling, Jeremy; Heinis, Thomas

Encoding Information in Primers Proceedings Article

In: DNA Computing and Molecular Programming, 2019, ISBN: 978-3-030-26807-7.

Dimopoulou, Melpomeni; Antonini, Marc; Barbry, Pascal; Appuswamy, Raja

A Biologically Constrained Encoding Solution for Long-term Storage of Images onto Synthetic DNA Proceedings Article

In: EUSIPCO 2019, 27th European Signal Processing Conference, Coruna, Spain, 2019.

Memishi, Bunjamin; Appuswamy, Raja; Paradies, Marcus

Cold Storage Data Archives: More Than Just A Bunch of Tapes Proceedings Article

In: DaMoN 2019, 15th International Workshop on Data Management on New Hardware, Held with ACM SIGMOD/PODS, Amsterdam, Netherlands, 2019.

2018

Dimopoulou, Melpomeni; Antonini, Marc; Barbry, Pascal; Appuswamy, Raja

DNA Coding for Image Storage Using Image Compression Techniques Proceedings Article

In: CORESA 2018, 20emes journées d'étude et d'échange sur la COmpression et la REprésentation des Signaux Audiovisuels, Poitiers, France, 2018.
