DNA encryption is a method for embedding information in the form of nucleic acids. DNA is a remarkably secure medium for information storage, as it is “unplugged,” invisible to the naked eye, and requires molecular biology skills to decipher. As such, DNA encryption is ideally suited for highly secure communication in which protection takes precedence over speed.
The amount of digital information produced is growing continually, while digital storage is increasingly vulnerable to unauthorized access. DNA has emerged as a potential medium for information storage due to its high density, security (e.g., it cannot be accessed remotely and requires technical skill to manipulate), stability, and reproducibility. Information can be stored in both the direct sequence and the 3D architecture of the assembled DNA molecules. The DNA encryption platform described here further secures the information encoded in DNA for communication that requires maximal security.
The transfer of encrypted information via nucleic acids is achieved using a platform with two components: 1) an individualized keyboard (iKey) that translates plain text into DNA and 2) Multiplexed Sequence Encryption (MuSE), in which multiple strands are analyzed together by nucleic acid sequencing with common primers. The iKey is a modified QWERTY keyboard with 64 keys, each of which corresponds to one of the 64 codons. The key assignments can be randomized for additional security. The iKey platform is used to generate multiple DNA strands, each of which encodes a fragment of the message.
The nucleic acid strands are co-sequenced using two common primers (forward and reverse), the sequences of which can also be encrypted. Sequencing produces a visual representation of the DNA sample known as a chromatogram: a series of peaks, each of which corresponds to one nucleotide. The DNA strands are aligned such that their corresponding peaks overlap. Where the overlapping nucleotides are identical, they produce a large peak; where they do not match, they produce a small peak. This “chromatogram patterning” is unique and can be deciphered only by an authorized user.
Multiplexed sequencing of different strand combinations produces different readouts; analyzing an incorrect combination of strands, or a single strand, yields nonsense or a decoy message. The retrieval process takes minutes to hours to complete, depending on the length of the nucleic acid strands and the method of sequencing.
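The two ideas above, a randomized key-to-codon table and a large/small peak pattern from overlaid strands, can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the platform's actual implementation: the 64-character key set `KEYS`, the seed-based shuffle, and the function names `make_ikey`, `encode`, `decode`, and `overlay` are all hypothetical, and the actual iKey layout is not specified here. The `overlay` function reduces a chromatogram to one symbol per position (`H` for a large peak where all strands agree, `l` for a small peak otherwise). Python's `random` module is used only for illustration and is not a cryptographically secure source of randomness.

```python
import itertools
import random

BASES = "ACGT"
# All 64 codons in a fixed order: AAA, AAC, ..., TTT.
CODONS = ["".join(p) for p in itertools.product(BASES, repeat=3)]

# Hypothetical 64-character key set (assumption: the real iKey layout may differ).
KEYS = (
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    " ."
)
assert len(KEYS) == len(CODONS) == 64


def make_ikey(seed):
    """Build a randomized key-to-codon table; the seed stands in for the shared secret."""
    rng = random.Random(seed)
    codons = CODONS[:]
    rng.shuffle(codons)
    return dict(zip(KEYS, codons))


def encode(text, ikey):
    """Translate plain text into a DNA sequence, one codon per character."""
    return "".join(ikey[ch] for ch in text)


def decode(dna, ikey):
    """Translate a DNA sequence back into text using the same table."""
    reverse = {codon: ch for ch, codon in ikey.items()}
    return "".join(reverse[dna[i:i + 3]] for i in range(0, len(dna), 3))


def overlay(strands):
    """Toy chromatogram pattern: 'H' where all aligned strands share a base, 'l' otherwise."""
    length = min(len(s) for s in strands)
    return "".join(
        "H" if len({s[i] for s in strands}) == 1 else "l"
        for i in range(length)
    )


if __name__ == "__main__":
    ikey = make_ikey(seed=2024)
    dna = encode("Meet at noon.", ikey)
    print(dna)
    print(decode(dna, ikey))
    print(overlay(["ACGT", "ACCT"]))  # prints HHlH
```

In this sketch, decoding with a table built from a different seed would yield different characters, which loosely mirrors the idea that only a user holding the correct key assignments and the correct strand combination can recover the message.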
Unplugged, encrypted information
Can fragment a message across
Information extracted in one step
Incorporates the entire English