Data storage using peptide sequences

Nat Commun. 2021 Jul 13;12(1):4242. doi: 10.1038/s41467-021-24496-9.

Abstract

Humankind is generating digital data at an exponential rate. These data are typically stored using electronic, magnetic or optical devices, which require large physical spaces and cannot last for a very long time. Here we report the use of peptide sequences for data storage, which can be durable and of high storage density. With the selection of suitable constitutive amino acids, designs of address codes and error-correction schemes to protect the order and integrity of the stored data, optimization of the analytical protocol and development of a software to effectively recover peptide sequences from the tandem mass spectra, we demonstrated the feasibility of this method by successfully storing and retrieving a text file and the music file Silent Night with 40 and 511 18-mer peptides respectively. This method for the first time links data storage with the peptide synthesis industry and proteomics techniques, and is expected to stimulate the development of relevant fields.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Databases, Protein*
  • Humans
  • Proteomics / methods
  • Software*