Cambridge University Science Magazine
The research, published in Nature this week, describes how the team used DNA to encode computer files, including Shakespeare’s sonnets, an audio clip from Martin Luther King’s “I have a dream” speech, and a photo of the European Bioinformatics Institute (EBI), where they did the groundbreaking work.



The media we use to store data, such as floppy discs and CDs, become obsolete as technology develops. This means we regularly have to transfer data from one medium to the next to ensure that it is not lost. In nature, by contrast, the four bases of DNA – A, C, T and G – have been used to store and pass on complex information in all living organisms for millions of years.

The synthetic DNA was produced in California and shipped back to the UK, where it was sequenced. Remarkably, Goldman’s team was able to read the DNA sequence and reproduce the files with 100% accuracy.

Although scientists have been able to edit, copy and store DNA for several years, producing large amounts of synthetic DNA precisely had not been done before. Every part of the digital file was encoded in four overlapping DNA sequences, so that any mistakes in reading the code would be easily spotted.
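To give a feel for why fourfold overlap makes reading mistakes easy to spot, here is a minimal Python sketch. It is an illustration only and not the team's actual encoding scheme: the two-bits-per-base mapping, the fragment length and step, and all function names are assumptions made for this example. Each fragment starts a quarter of a fragment after the previous one, so most positions are read four times and a single bad read is outvoted by the other three.

```python
from collections import Counter

BASES = "ACGT"  # assumed mapping: two bits per base (00=A, 01=C, 10=G, 11=T)


def bytes_to_dna(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(BASES[(b >> shift) & 3] for b in data for shift in (6, 4, 2, 0))


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna: every four bases become one byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def make_fragments(seq: str, length: int = 8, step: int = 2):
    """Cut seq into overlapping (start, fragment) pairs.

    With length == 4 * step, every position except those near the
    very start is covered by four different fragments.
    """
    return [(start, seq[start:start + length]) for start in range(0, len(seq), step)]


def reassemble(fragments, total_length: int) -> str:
    """Rebuild the sequence by majority vote at each position."""
    votes = [Counter() for _ in range(total_length)]
    for start, frag in fragments:
        for offset, base in enumerate(frag):
            votes[start + offset][base] += 1
    return "".join(v.most_common(1)[0][0] for v in votes)


if __name__ == "__main__":
    message = b"To be"
    seq = bytes_to_dna(message)
    frags = make_fragments(seq)
    recovered = dna_to_bytes(reassemble(frags, len(seq)))
    print(recovered == message)
```

In this toy version the first few bases are covered by fewer than four fragments, so a real scheme would need to handle the ends of the sequence separately.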

The technology used to read DNA sequences is constantly improving, but the DNA sequence itself does not change in stable conditions. This means that information stored in DNA now could still be read in thousands of years, in the same way that the genome of the prehistoric woolly mammoth was sequenced and mapped in the last decade.

Storing data in DNA is expensive today, but the Hinxton scientists predict that if the cost of DNA sequencing continues to drop as it has to date, their method will be cost-effective in less than a decade.

doi:10.1038/nature11875

Written by Hinal Tanna.