DNA For Data Storage

Two researchers, Yaniv Erlich and Dina Zielinski, successfully encoded 2MB of data into strands of DNA. Erlich, a computer science professor at Columbia Engineering, and Zielinski, an associate scientist at the New York Genome Center (NYGC), used an algorithm originally designed for streaming video to store 6 files in strands of DNA. The 6 files selected were: an entire operating system (KolibriOS); an 1895 French film, “Arrival of a Train at La Ciotat”; a computer virus; a $50 Amazon gift card; Claude Shannon’s 1948 paper, “A Mathematical Theory of Communication”; and a Pioneer plaque (the plaques launched into space in 1972 with various universal measurements).

DNA was chosen as the storage medium because it is ultra-compact and doesn’t degrade over time the way CDs or DVDs do. DNA can last hundreds or even thousands of years if the conditions are favorable (i.e. cool and dry). They combined all the data into a master file and split it into short strings of binary code. The streaming video algorithm (called fountain codes) then packaged those strings into smaller packets called droplets, and the ones and zeros in each droplet were mapped onto the 4 bases of DNA (A, C, G and T). In the end, they generated 72,000 DNA strands, each 200 bases in length.
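To make the droplet idea a little more concrete, here is a minimal Python sketch of the two steps described above: combining data segments into a droplet, and writing binary data as DNA bases. It is only an illustration, not the researchers' actual code. The two-bits-per-base mapping, the function names (`bytes_to_dna`, `dna_to_bytes`, `make_droplet`) and the uniform random choice of segments are all assumptions made for this example; real fountain codes pick segments from a carefully designed distribution, and a real pipeline would also discard sequences that are hard to synthesize or read.

```python
import random

# Assumed mapping for illustration: each pair of bits becomes one base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Write each byte as 8 bits, then map every 2 bits to one base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Reverse of bytes_to_dna: bases back to bits, bits back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def make_droplet(segments, seed):
    """XOR a seeded random subset of equal-length segments into one droplet.

    Simplification: the subset size is drawn uniformly here, whereas real
    fountain codes use a special degree distribution.
    """
    rng = random.Random(seed)
    count = rng.randint(1, len(segments))
    chosen = rng.sample(range(len(segments)), count)
    payload = bytes(len(segments[0]))  # start from all zero bytes
    for index in chosen:
        payload = bytes(a ^ b for a, b in zip(payload, segments[index]))
    return sorted(chosen), payload


print(bytes_to_dna(b"Hi"))        # CAGACGGC
print(dna_to_bytes("CAGACGGC"))   # b'Hi'
```

Because the same seed always selects the same segments, a decoder that knows the seed can work out which segments were mixed into a droplet, which is the property that lets fountain codes rebuild a file from any large enough set of droplets.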

The information was sent as a text file to a startup company called Twist Bioscience, which turned the file into DNA molecules that could then be read by a DNA sequencer. The test was so successful that the files were recovered without any errors, even from copies of copies of the DNA. Erlich believes DNA is the highest-density data storage mankind has created, with at least 215 petabytes (or 215,000,000 gigabytes) per gram of DNA. The only downside is cost: creating and reading the DNA file came to a total of $9,000.

  1. This is definitely GSN. But we won’t be having DNA hard drives in our computers anytime soon, and not just because of the cost factor. DNA isn’t fast enough for real-time modifications, and PCR (polymerase chain reaction), which they use to duplicate the DNA, takes cycles of heating and cooling. Also, don’t forget that since we will need to use proteins to read and write the data, it will need to be kept within a very specific temperature range, a much more limited range than silicon’s. But as a long-term storage mechanism I can see it happening, in cases where spending tens of thousands of dollars on a backup of data is worth it.
