Design and analysis of DNA Binary Cryptography Algorithm for Plaintext

Abstract: Due to the rapid growth of computer networks, more sensitive information is being exchanged over such networks. Securing information from unauthorized parties has become a critical issue in the field of information technology security. DNA cryptography is a promising technology in the cryptographic field which enables users to encrypt data securely based on real biological DNA strands. Although there are many DNA encryption algorithms, there is still room for security improvement. In this paper, we introduce a symmetric DNA binary cryptography algorithm to encrypt and decrypt plaintext information. The contributions of this paper are twofold. First, we introduce a mathematical algorithm to generate a strong secret key from the DNA of multiple different living creatures. Second, the encryption process is implemented using another 16 keys which are randomly generated from the secret key. The efficiency and reliability of the proposed algorithm are examined in terms of encryption and decryption time, avalanche effect and the resistance of the secret key against attack.

In 1953, Watson and Crick presented a model of DNA consisting of two twisted strands in the form of a helical ladder, in which each nitrogenous base on one strand is connected to a nitrogenous base on the other strand by hydrogen bonds. Adenine on one strand always pairs with thymine on the other strand through two hydrogen bonds, and cytosine pairs with guanine through three hydrogen bonds [19].
Research on DNA computing is carried out either in test tubes (biologically) or by simulating the operations of DNA on computers (pseudo or virtual DNA computing). Adleman [20] was the first to use DNA for computation, in 1994, to solve an instance of the directed Hamiltonian path problem. DNA was found to have a parallel processing property that provides very high speed when used for computation. In 1995, Boneh et al. [21] showed how DNA computing could be used to break the Data Encryption Standard (DES). Many attempts [22] [23] [24] have been made to build DNA-based data encryption systems, in which a random sequence of nitrogenous bases is used as the key of a one-time pad system.
Gehani et al. presented the first trial of DNA-based cryptography, in which encryption uses a substitution method based on libraries of distinct one-time pads, each of which defines a specific, randomly generated, pair-wise mapping, together with an XOR scheme using molecular computation and indexed, random key strings [14]. Other approaches used an image encryption algorithm based on the DNA sequence addition operation, which relies on a DNA sequence matrix, DNA sequence addition, logistic maps and complementation. This idea was adopted again using DNA sequence matching, where the data are converted into pointers according to the DNA strand taken and the key is sent to the receiver over a secure channel [25].
The contributions of our paper are as follows. Most symmetric DNA algorithms introduced in the literature use a secret key taken from a single living creature, whereas in this paper we generate the secret key from the DNA of multiple living creatures. In addition, we introduce a mathematical method to derive the secret key, and the encryption process is conducted using another 16 keys generated from the secret key. The remainder of this paper is organised as follows. Section II introduces the proposed encryption algorithm in detail. The performance of the proposed algorithm is evaluated in Section III. Finally, conclusions are drawn in Section IV.
II. PROPOSED ALGORITHM
The proposed Symmetric DNA Binary encryption (SDB) algorithm uses both DNA (deoxyribonucleic acid) and the binary system to convert the plaintext into cipher format. It is a block cipher with a block size of 64 nucleotides, which represent 16 alphabet characters. DNA has two long strands of nucleotides, and each nucleotide is made of a deoxyribose sugar, a phosphate group and a nitrogenous base. The nitrogenous bases are A (adenine), G (guanine), C (cytosine) and T (thymine). These four nucleotides involved in DNA synthesis form the set $S = \{A, C, G, T\}$. The SDB algorithm consists of two sub-algorithms, SDB1 and SDB2, used for encrypting a plaintext message (pMsg) and decrypting a ciphertext message (cMsg), respectively, as shown in Figure 1, where privKey denotes the main private key. The privKey generates the sub private keys $K_i$, $i = 1, 2, 3, \ldots, 16$, which are used throughout the encryption and decryption processes. The values of these keys are described in Section B. In the proposed SDB algorithm, the privKey is the only information that is kept secret at both the sender and the receiver sides. In addition, the values of the sub keys are dynamically derived at the start of algorithms SDB1 and SDB2. The SDB algorithm encrypts a pMsg message in 16 stages using SDB1 and decrypts cMsg using the same stages in reverse order.
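The relation between block size and character count can be checked with a short sketch. It assumes a 2-bit encoding per nucleotide (00 for A, 01 for C, 10 for G, 11 for T); this particular mapping and the helper name `text_to_dna` are illustrative assumptions and are not specified in the paper.

```python
# Assumed 2-bit encoding of nucleotides; the paper does not state the mapping.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def text_to_dna(text: str) -> str:
    """Encode ASCII text as nucleotides, two bits per nucleotide."""
    # Each character becomes 8 bits, i.e. 4 nucleotides.
    bits = "".join(format(ord(ch), "08b") for ch in text)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


block = text_to_dna("Multiprogramming")  # 16 characters
print(len(block))  # 64
```

Under this assumption, 16 characters occupy 128 bits, which correspond to the 64 nucleotides of one block.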

A. Main Private Key
The SDB algorithm uses privKey as the main key to encrypt and decrypt the plaintext. The value of privKey is given by the DNA sequence $d_1 d_2 d_3 \ldots d_N$ with length $N \in \Omega$, where $d_i \in S$, $i = 1, 2, 3, \ldots, N$, and $\Omega = \{256, 512, 1024, \ldots\}$. The strength of this algorithm comes from the fact that privKey is not taken from the DNA of a single living creature. Instead, it consists of $k$ sequences (i.e. segments) chosen randomly from the DNA of different living creatures, where $k \in \{1, 2, 3, \ldots\}$. Note that the value of $k$ is also chosen randomly. If $s_j$ denotes the length of segment $j = 1, 2, 3, \ldots, k$, then

$$\sum_{j=1}^{k} s_j = N.$$

The length $s_j$ of each segment is calculated from the parameters $N$ and $k$ according to the following cases.

Case 1: If the number of chosen segments $k$ can be represented in the form $2^m$, where $m$ is an arbitrary natural number, then $s_j$ is given as

$$s_j = \frac{N}{k}, \quad j = 1, 2, \ldots, k.$$

Case 2: If the number of chosen segments $k$ cannot be represented in the form $2^m$, then the lengths of $k_1 - k_2$ segments out of $k$ are given as

$$s_j = \frac{N}{k_1}, \tag{2}$$

where $k_1$ is the nearest number below $k$ that can be represented in the form $2^m$ for some value of $m$, and $k_2 = k - k_1$. The lengths of the remaining $k - (k_1 - k_2)$ segments out of $k$ are given as

$$s_j = \frac{N}{k_3}, \tag{3}$$

where $k_3$ is the nearest number above $k$ that can be represented in the form $2^m$ for some value of $m$.

To clarify the process of selecting the lengths of the chosen segments, consider the following example:
1) Let $N = 128$ and $k = 7$.
2) The values of $k_1$, $k_2$ and $k_3$ are calculated as 4, 3 and 8, respectively.
3) Since $k = 7$, $k$ cannot be represented in the form $2^m$ for any value of $m$.
4) The length of the first $k_1 - k_2 = 4 - 3 = 1$ segment is calculated using (2), giving one segment of length $128/4 = 32$ nucleotides.
5) The lengths of the remaining $7 - (4 - 3) = 6$ segments are calculated using (3), giving six segments of length $128/8 = 16$ nucleotides.
6) The total length of the 7 segments is $32 + 6 \times 16 = 128$ nucleotides, which equals $N$.
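The two cases can be sketched in a few lines of Python. The function name `segment_lengths` is hypothetical, and the sketch assumes that $N$ is a power of two large enough to be divisible by the power of two above $k$, so that all divisions are exact.

```python
def segment_lengths(N: int, k: int) -> list[int]:
    """Return the k segment lengths for a main key of N nucleotides.

    Assumes N is a power of two that is divisible by the power of two above k.
    """
    if k < 1 or N < k:
        raise ValueError("need 1 <= k <= N")
    k1 = 1 << (k.bit_length() - 1)  # largest power of two not exceeding k
    if k1 == k:
        # Case 1: k is a power of two, all segments have equal length.
        return [N // k] * k
    # Case 2: k is not a power of two.
    k2 = k - k1
    k3 = 2 * k1  # smallest power of two above k
    longer = [N // k1] * (k1 - k2)
    shorter = [N // k3] * (k - (k1 - k2))
    return longer + shorter


print(segment_lengths(128, 7))  # [32, 16, 16, 16, 16, 16, 16]
```

For $N = 128$ and $k = 7$ this reproduces the worked example: one segment of 32 nucleotides and six segments of 16 nucleotides, summing to 128.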

B. Sub Private Keys
The sub private keys $K_i$, $i = 1, 2, 3, \ldots, 16$, are derived from privKey during the encryption and decryption processes as follows. Let $n_A$, $n_C$, $n_G$ and $n_T$ denote the numbers of occurrences of nucleotides A, C, G and T in the DNA sequence of privKey, respectively, where $n_A + n_C + n_G + n_T = N$ and the length $N \in \{256, 512, 1024, \ldots\}$. Let $P_A = \{a_1, a_2, a_3, \ldots, a_{n_A}\}$, $P_C = \{c_1, c_2, c_3, \ldots, c_{n_C}\}$, $P_G = \{g_1, g_2, g_3, \ldots, g_{n_G}\}$ and $P_T = \{t_1, t_2, t_3, \ldots, t_{n_T}\}$ be four sets which represent the position numbers of nucleotides A, C, G and T within privKey, respectively. Convert each element of $P_A$, $P_C$, $P_G$ and $P_T$ into its equivalent 8-bit binary form. Concatenate, from right to left, all the binary elements of $P_A$, $P_C$, $P_G$ and $P_T$ as one binary string $Q$. Divide $Q$ into blocks $Q_i$, $i = 1, 2, 3, \ldots, 16$, of equal length, where the length of each block is $|Q|/16$. Finally, take $K_i = Q_i$ for $i = 1, 2, 3, \ldots, 16$.
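The derivation can be illustrated with the following sketch, which rests on several assumptions not fixed by the text: positions are counted from 1, each position is reduced modulo 256 so that it fits in 8 bits (positions above 255 otherwise do not), and "from right to left" is read as placing each later element to the left of the earlier ones. The function name `sub_keys` is hypothetical.

```python
def sub_keys(priv_key: str) -> list[str]:
    """Derive 16 equal-length bit strings from a DNA main key (sketch)."""
    # Collect the 1-based positions of each nucleotide.
    positions = {base: [] for base in "ACGT"}
    for pos, base in enumerate(priv_key, start=1):
        positions[base].append(pos)

    # 8-bit form of every position; modulo 256 is an assumption of this sketch.
    chunks = [
        format(pos % 256, "08b")
        for base in "ACGT"
        for pos in positions[base]
    ]
    # Assumed reading of "right to left": later elements end up further left.
    bits = "".join(reversed(chunks))

    block_len = len(bits) // 16
    return [bits[i * block_len:(i + 1) * block_len] for i in range(16)]


keys = sub_keys("ACGT" * 64)      # toy main key of N = 256 nucleotides
print(len(keys), len(keys[0]))    # 16 128
```

With $N$ positions of 8 bits each, the concatenated string has $8N$ bits, so each of the 16 blocks in this sketch has $N/2$ bits.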

C. Data Dependency Segment
The concept of data dependency is used extensively in encryption and decryption algorithms to increase the unpredictability of the ciphertext. In this paper, we generate a Data Dependency Segment (DDS) of length 16 bits for the entire plaintext. We use the DDS in both the rotation and the XOR operations throughout the encryption process. The DDS is generated as follows:
1) Convert pMsg to its equivalent binary form, where the ASCII code of each character is converted to its equivalent 8 binary bits.
2) Divide the resulting binary string, of length $L$ bits, into $n = \lceil L/128 \rceil$ blocks $B_1, B_2, \ldots, B_n$ of length 128 bits. Note that the length of the last block is less than 128 if $L \bmod 128 \neq 0$.

4) For each segment of a block, convert the segment to its equivalent decimal value. Generate the numeric value $Z$ using (5), which combines these decimal values with the weight of each segment within its block.
5) Convert the numeric value $Z$ to its equivalent binary form of length 16 bits. If the length of $Z$ is less than 16 bits, it is padded with zeros from the left. The final form of $Z$ represents the DDS, $D_i$, for block $B_i$.
6) Repeat steps 3 to 5 to generate the DDS for all the remaining blocks.
7) The final DDS for the entire plaintext, $D$, is derived as follows.
i. Case 1: If $n = 1$, then $D = D_1$.
ii. Case 2: If $n > 1$, then $D$ is obtained recursively from $D_1, D_2, \ldots, D_n$, starting from an initial value.
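Steps 1, 2 and 5 above are mechanical and can be sketched as follows. The helper names `to_blocks` and `to_dds_bits` are hypothetical, and the computation of $Z$ itself is not reproduced here because it depends on (5).

```python
def to_blocks(p_msg: str, block_bits: int = 128) -> list[str]:
    """Steps 1-2: ASCII text to a bit string, split into 128-bit blocks."""
    bits = "".join(format(ord(ch), "08b") for ch in p_msg)
    # The last block is shorter when the bit length is not a multiple of 128.
    return [bits[i:i + block_bits] for i in range(0, len(bits), block_bits)]


def to_dds_bits(z: int) -> str:
    """Step 5: binary form of Z, left-padded with zeros to 16 bits."""
    return format(z, "016b")


blocks = to_blocks("Computer Organization")  # 21 characters, 168 bits
print([len(b) for b in blocks])              # [128, 40]
print(to_dds_bits(5))                        # 0000000000000101
```

The 21-character message gives one full block and a final block of 40 bits, which matches the remark that the last block is shorter whenever the bit length is not a multiple of 128.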

D. Encryption Process
The encryption procedure uses the SDB1 algorithm to convert the original information, or plaintext, into an encrypted form, or ciphertext. The SDB1 algorithm uses the following steps:
1) Generate the main key as described in Section A.
3) Convert pMsg to its equivalent binary form, where the ASCII code of each character is converted to its equivalent 8 binary bits.
4) Divide the resulting binary string into $n$ blocks, $B_1, B_2, B_3, \ldots, B_n$, as described in step 2 of Section C.
5) Generate the value of $D$ that represents the DDS for the entire pMsg.
6) Concatenate the value of $D$ to the right side of each block $B_i$, $i = 1, 2, 3, \ldots, n$ (i.e., $B_i \leftarrow B_i \,\|\, D$).
7) Take the 16 least significant bits (LSBs) of block $B_i$ as $U_i$ and the remaining bits as $V_i$.
8) Label the bits of $U_i$ from left to right as $u_j$, where $u_j \in \{0, 1\}$, $j = 1, 2, 3, \ldots, |U_i|$, and $|X|$ denotes the length of a string $X$.
9) Label the bits of $V_i$ from left to right as $v_j$, where $v_j \in \{0, 1\}$ and $j = 1, 2, 3, \ldots, |V_i|$.
10) Label the bits of each sub key $K_t$ from left to right as $K_{t,j}$, where $K_{t,j} \in \{0, 1\}$, $t = 1, 2, 3, \ldots, 16$ and $j = 1, 2, 3, \ldots, |K_t|$.
19) Take the last output of each block and convert it to hexadecimal form, which represents the cipher message, cMsg, of the plaintext pMsg.
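The framing steps around the cipher stages (steps 6, 7 and 19) can be sketched as below. The names `frame_block` and `bits_to_hex` are hypothetical, and the intermediate stages that transform the block are deliberately left out, so this is only an outline of the data layout and not the cipher itself.

```python
def frame_block(block_bits: str, dds: str) -> tuple[str, str]:
    """Steps 6-7: append the 16-bit DDS, then split off the 16 LSBs.

    Returns (remaining bits, 16 least significant bits).
    """
    if len(dds) != 16:
        raise ValueError("the DDS must be 16 bits long")
    extended = block_bits + dds
    return extended[:-16], extended[-16:]


def bits_to_hex(bits: str) -> str:
    """Step 19: convert a bit string to hexadecimal, keeping leading zeros."""
    width = (len(bits) + 3) // 4
    return format(int(bits, 2), "0{}x".format(width))


rest, lsb = frame_block("0" * 128, "1" * 16)
print(len(rest), lsb)           # 128 1111111111111111
print(bits_to_hex("00001111"))  # 0f
```

Before any stage has been applied, the 16 LSBs of the extended block are simply the appended DDS, and a full 144-bit block maps to 36 hexadecimal characters.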

E. Decryption Process
As mentioned above, the sub-algorithm SDB2 is used to restore the plaintext, pMsg, from the ciphertext, cMsg, as follows.
1) Load the main key.
3) Load the ciphertext cMsg.
4) Convert cMsg to its equivalent binary form, where each hexadecimal character is converted to its equivalent 4 binary bits.
5) Divide it into $n$ blocks, $B_1, B_2, B_3, \ldots, B_n$.
9) Reconstruct block $B_i$ as $V_i \,\|\, U_i$ again.
10) Repeat steps 6 and 7.
11) Take the 16 LSBs of block $B_i$ as $U_i$ and the remaining bits as $V_i$.
12) Perform an exclusive OR of $V_i$ with the sub private keys, based on the bit values of $U_i$, using Algorithm 8.
13) Reconstruct block $B_i$ as $V_i \,\|\, U_i$ again.
14) For the remaining blocks, repeat steps 6 to 13.
15) Repeat steps 6 to 14 $n$ times.
16) For each block, remove the DDS from the right, convert each group of 8 bits to its decimal value, and then substitute each decimal value by its corresponding character as defined in the ASCII table.
17) The final form represents the plaintext pMsg.
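Steps 4 and 16 of the decryption process can be sketched as follows, with the hypothetical helpers `hex_to_bits` and `block_to_text`. The sketch covers only the conversion and DDS removal; it assumes the block passed to `block_to_text` has already been restored by the reverse stages.

```python
def hex_to_bits(c_msg: str) -> str:
    """Step 4: each hexadecimal character becomes 4 binary bits."""
    return "".join(format(int(ch, 16), "04b") for ch in c_msg)


def block_to_text(block_bits: str, dds_len: int = 16) -> str:
    """Step 16: drop the DDS from the right, then map each 8 bits to ASCII."""
    payload = block_bits[:-dds_len]
    return "".join(
        chr(int(payload[i:i + 8], 2)) for i in range(0, len(payload), 8)
    )


restored = hex_to_bits("4869") + "0" * 16  # toy block: payload plus dummy DDS
print(block_to_text(restored))             # Hi
```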
III. PERFORMANCE ANALYSIS
In this section, we evaluate the performance of the proposed encryption algorithm in terms of resistance against attacks on both the main and sub keys, encryption and decryption time, and the avalanche test. The proposed SDB1 and SDB2 algorithms are implemented on the Java platform. The main key is formed from public DNA taken from the European Bioinformatics Institute's (EBI's) nucleotide archive. As mentioned above, the main key is a combination of multiple DNA segments from different living creatures, which yields a strong main key. Since the generation of the sub keys $K_i$, $i = 1, 2, 3, \ldots, 16$, depends on the main key, changing any segment in the main key creates entirely different sub keys.

A. Brute-Force Attack on the Keys
It is highly unlikely that a brute-force attack can recover the main key, for the following reasons:
1) The public DNA database has millions of nucleotide sequences.
2) The length of the main key is chosen randomly from the elements of the infinite set $\{256, 512, 1024, \ldots\}$.
3) The main key consists of multiple DNA segments taken randomly from the DNA of different living creatures. These segments are concatenated in a random order to form the main key. So, if the $k$ segments are taken from a set of living creatures, the number of possible permutations is factorial in these quantities.
As mentioned above, each sub key is derived from the main key, so it changes whenever the main key changes.

B. Encryption and Decryption Time
The time necessary to convert the plaintext into the ciphertext is called the encryption time. This time depends on the message block size and the main key size, and is expressed in milliseconds. The decryption time, on the other hand, is the time necessary to recover the plaintext from the ciphertext. It is desirable that the decryption time be smaller than or equal to the encryption time. In this algorithm, the encryption time, $T_{enc}$, consists of the following components:
1) The time, $T_{mk}$, required to generate the main key.
2) The time, $T_{sk}$, required to generate the sub private keys $K_i$, $i = 1, 2, 3, \ldots, 16$.
3) The time, $T_{p}$, required to encrypt the plaintext.
Hence, the total encryption time is given as $T_{enc} = T_{mk} + T_{sk} + T_{p}$.
On the other hand, the decryption process is implemented by executing the same steps as the encryption process in reverse order, except that the step of generating the main key is not executed. In this case, the decryption time, $T_{dec}$, is given as $T_{dec} = T_{sk} + T_{c}$, where $T_{c}$ is the time required to decrypt the ciphertext.

C. Avalanche Effect
The avalanche effect measures how strongly the ciphertext bits depend on both the plaintext and the main key bits. In other words, the avalanche test measures the change in the ciphertext when some bit in the plaintext or main key is changed [26]. First, we hold the main key at a constant value for all plaintexts, as shown in Figure 2. The plaintext "Multiprogramming" is encrypted using the selected main key and then re-encrypted with o substituted for g, v for r, A for a and M for m. For the plaintext "Computer Organization", we use N instead of n. For "5555333388887777 5555333388887777", we use 1 instead of the last 7. Finally, for "AA112233445566FE AA11223344556666", we use 5 instead of the last 6. Table 1 presents the avalanche test results for the four sample plaintexts. The results show that the proposed SDB algorithm exhibits a strong avalanche property.
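One common way to quantify such a test is the percentage of ciphertext bits that differ between the two encryptions. The sketch below assumes that both ciphertexts are hexadecimal strings of equal length; the function name `avalanche_percent` is hypothetical and the paper does not state which exact measure underlies Table 1.

```python
def avalanche_percent(cipher_a: str, cipher_b: str) -> float:
    """Percentage of differing bits between two equal-length hex ciphertexts."""
    if len(cipher_a) != len(cipher_b):
        raise ValueError("ciphertexts must have equal length")
    # XOR leaves a 1 in every bit position where the ciphertexts differ.
    diff = int(cipher_a, 16) ^ int(cipher_b, 16)
    total_bits = 4 * len(cipher_a)
    return 100.0 * bin(diff).count("1") / total_bits


print(avalanche_percent("f0", "ff"))  # 50.0
```

A value close to 50% for a single-character change in the plaintext is usually regarded as a strong avalanche property.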

IV. CONCLUSION
In this paper, a symmetric DNA binary cryptography algorithm to encrypt and decrypt plaintext information is introduced. A mathematical algorithm is introduced to generate a secret key from the DNA of multiple different living creatures. From the main key, 16 sub keys are generated. The efficiency and reliability of the proposed algorithm are examined in terms of encryption and decryption time, avalanche effect and the resistance of the secret key against attack. The experimental tests show that the proposed encryption algorithm performs very well in the avalanche test and resists brute-force attacks on the private keys.