# Superimposing encryption data.

Much has been written about the necessity of processing data in the encrypted form. However, no satisfactory method of processing encrypted data has been published to date. Ahitub et al. [2] have analyzed the possibilities of using some special algorithms to add encrypted data. Rivest et al. [10] have suggested the use of an algorithm based on homomorphic functions for processing encrypted data. The main limitation of this algorithm is that such functions can be broken by solving a set of linear equations, as noted by [2]. The public-key crytosystem described in [1] can be used to multiply encrypted data but cannot be used to add encrypted data and is therefore not appropriate for some practical applications such as bank transactions. Abadi, Feigenbaum and Kilian [1] presented some general theorems concerning the problem of computing with encrypted data and formulated a framework to prove precise statements about what an encrypted instance hides and reveals; they also described encryption schemes for some well-known functions.

In this article, we explore a way to superimpose data in the encrypted form. Superimposing two sets of numbers means that each set is multiplied by a constant and then both sets are added together to obtain a final sum. That is, if X and Y denote two sets of data, the superposition of the data is given by Z = rX + sY where r and s are constants. Simply adding two sets of data (i.e., r = s = 1) or just multiplying a set of data by a constant (i.e., r or s = 0) are special cases of superimposing data. On the other hand, superimposing data can be considered as a special kind of data processing.

It has been stated that encrypting data based upon time-reversal transformations is efficient [12]. The idea of the model is to employ a second order (in time) equation to encrypt confidential data; the plaintext can be recovered by reversing the iterating steps.

In the next section we review briefly the time-reversal transformation. This will be followed by an exploration of the possibilities of using this idea to superimpose encrypted data. We then present two practical examples to highlight the method, followed by an examination of the code-breaking and homomorphic functions discussed by [2]. We also detail how the model can resist the cryptanalytic attack by data subtraction; the significance of the arbitrary background is stressed.

Time Reversal Transformations

In one dimensional cases, a vector X can be transformed to a new vector X* by a function (key) f according to an equation in the form

[Mathematical Expression Omitted]

In equation (1), x[.sub.i] is the i-th element of the vector X and can take on k discrete values; t labels the t-th (t = 0, 1, 2, . . . ) copy of the sequence x[.sub.i], namely, x[.sub.i] (0), x[.sub.i] (1), . . . x[.sub.i] (t) ... The key f[{x[.sub.i]})] which determines the transformed data X* is a function of x[.sub.i] and its neighboring elements. The most important property of equation (1) is that it possesses time-reversal symmetry [12]. It can be iterated for an arbitrary number of time steps provided that two initial conditions are given:

x[.sub.i] (0) = b[.sub.i], x[.sub.1] (1) = p[.sub.i], (2)

where b[.sub.i] can be an arbitrary background (which of course can also be data to be processed) and p[.sub.i] are the data to be encrypted (i.e., plaintext). If there are n elements in X (e.g., a message with n characters), i ranges from 0 to n - 1.

Consider a very simple example. Suppose we allow each x[.sub.i] (t) to take on values 0, 1, 2 (i.e., k = 3) and want to encrypt the message "00012"; in this example, n = 5, and x[.sub.0] (1) = 0, x[.sub.1], (1) = 0, x[.sub.2] (1) = 0 , x[.sub.3] (1) = 1, x[.sub.4] (1) = 2. Suppose the elements of our background are all 0 (i.e., x[.sub.0] (0) = x[.sub.1] 0) =... = x[.sub.4] (0) = 0) and we have arbitrarily picked our function key f as

[Mathematical Expression Omitted]

As discussed in [12], cyclic boundary conditions should be employed. That is, the arithmetic on the index j of x[.sub.j] (t) is done mod n; if the value of j is smaller than 0 or larger than n - 1, we have to make the transformation j -> j mod n; in the above example, x[.sub.-1] (1) = x[.sub.4] (1), x[.sub.6] (1) = x[.sub.1] (1), etc. With this cyclic boundary condition, the sequence x[.sub.i] (2) can then be calculated by

[Mathematical Expression Omitted]

Thus

[Mathematical Expression Omitted]

and

[Mathematical Expression Omitted]

Similarly, x[.sub.2] (2) = 2, x[.sub.3] (2) = 2, and x[.sub.4](2) = 0. The next time sequence x[.sub.i] 3 is obtained by

[Mathematical Expression Omitted]

and so on.

Decryption is obtained by reversing the steps of iterating equation (1) as it can be expressed in the form

[Mathematical Expression Omitted]

If m specified elements are allowed to be utilized to construct the key f, the total possible number of choices of f is N = k[.sup.k][.sup.m]]. For example, if a plaintext consists of 20 characters, each of which is represented by one of the 256 different generalized ASCII code (i.e., k = 256), and 3 specified elements are allowed to be used to construct f, there exist 256[.sup.256][.sup.3] distinct functions. As another example, if a data block contains 64 bits and all of the bits are allowed to be utilized to construct the function, the number of choices of functions is 2[.sup.2][.sup.64] . These are truly astronomical numbers and no one can obtain the key by trial-and-error. The properties of the arbitrary background and arbitrary number of iterations enhance the security of the scheme.

Superimposing Data

In the following discussions, we always perform mod k operations, k being the number of states of each element in the plaintext or ciphertext. For convenience of writing, we omit mod k in the equations.

Let us consider two vectors X and Y which are encrypted according to the time-reversal-symmetry equations

[Mathematical Expression Omitted]

with initial conditions x[.sub.i] (0) = b[.sub.i], x[.sub.i] (1) - p[.sub.i],

(3a)

[Mathematical Expression Omitted]

with initial conditions y[.sub.i] (0) = [beta][.sub.i], y[.sub.i] (1) = [pi][.sub.i].

(3b)

Let Z = rX + sY where r and s are arbitrary integers. We have

[Mathematical Expression Omitted] (4a)

[Mathematical Expression Omitted] (4b)

If f is a linear function of its arguments, it is obvious that equation (4) can be expressed as

[Mathematical Expression Omitted]

[Mathematical Expression Omitted] (5)

so that z[.sub.i] satisfies the time-reversal transformations with given initial conditions. Equations (4) and (5) imply that we can always superimpose data in the encrypted form and obtain the resultant plaintext by reversing the iterating time-steps provided that the key is a linear function of its arguments. This is basically the principle of superposition which is often used in many fields of physical science (see Discussion section). For example, let us assume that

P = plaintext (confidential data)

T = transaction (to be added to P)

B[.sub.p] = background for P

B[.sub.T] = background for T C[.sub.1] = ciphertext obtained from P and B[.sub.p]

C[.sub.2] = ciphertext obtained from C[.sub.1] and P

T[.sub.1] = transcipher obtained from T and B[.sub.T]

T[.sub.2] = transcipher obtained from T[.sub.1] and T

Z[.sub.1] = C[.sub.1] + T[.sub.1]

Z[.sub.2] = C[.sub.2] + T[.sub.2]

The confidential data Z = P + T and the corresponding background B[.sub.z] can be obtained from Z[.sub.1] and Z[.sub.2] by reversing iterating equation (5). (Sometimes we can use the background B[.sub.z] = B[.sub.p] + B[.sub.T] as a critical check on the correctness of the calculations.) This concept is depicted clearly in Figure 1 [omitted].

Under the constraint that the key f has to be a linear function of its arguments, the number of choices (N) of f has been largely reduced. The total number of distinct keys is N = k[.sup.L+ 1] where L is the number of elements allowed to be used to construct f, and k is the number of states of each element. However, in many applications, N is still large enough to secure the operation. For example, if a plaintext consists of 20 characters, each of which is represented by one of the 256 generalized ASCII codes, the possible number of linear functions that can be constructed is 256[.sup.21] (=2[.sup.168]); this is still an astronomical number and the key cannot be broken by trial and error. The arbitrary backgrounds may be used to enhance the security of the system; in this example there exist 256[.sup.20] different backgrounds. In practice, the background may be just another plaintext. In case high security is required, one can generate the background by a random number generator; the penalty here is that it will take twice the amount of time to process a fixed amount of data when compared to the case using plaintexts as backgrounds. As another example, the number of linear functions that can be constructed from a block of 64 bits is 2[.sup.65] bits; the number is not as large as before but breaking the key now requires tremendous effort.

We will now present two practical examples to illustrate the applications of the model. Interested readers are encouraged to obtain the source code (written in C) and associated figures of these examples from the authors.

Foreign Exchange

The first example we consider is the conversion of U.S. dollars to japanese yen in units of 100 dollars, assuming that 100 dollars are worth 12,689 yen. The number of U.S. dollars to be converted is confidential, while the exchange rate is known to the public. Two dollar accounts are represented by texts, each of which consists of 14 characters, one being the background of the other. For convenience of discussion and data manipulation, we shift each of the ASCII codes by 48 (ASCII code for '0') so that the character '0' is represented by code 0, '1' by 1 and so on. The key function (arbitrarily picked) that has been used is

[Mathematical Expression Omitted] (6)

The data is iterated for 10 (an arbitrarily picked number) times.

The multiplication of the encrypted data by a constant is just like an ordinary number multiplication which consists of operations multiply, shift and add; consider a simple example of decimal multiplication, say, 34 x 12; we obtain the product in several steps:

1. multiply 34 by 1 to obtain 34;

2. shift-left the product 34 one digit to obtain the partial product 340;

3. multiply 34 by 2 to obtain 68;

4. add the partial products 340 and 68 to obtain the final product 408.

To multiply encrypted data, we do exactly the same thing except that we have to use rotate-left instead of shift-left because cyclic boundary conditions have been used (the shift-left operation would cause the loss of leftmost digits); in the example, the four leftmost digits of the plaintext and background are 0; when an operation "rotate-left" is performed, a '0' is then rotated into the rightmost position; this is equivalent to the shift-left" operation which introduces a '0' in the rightmost position in the above example of decimal multiplication. We should also note that the encryption operation and digit-rotation commute with each other. In other words, digit-rotation is compatible with the cyclic boundary condition. Therefore, we will obtain the same result whether we perform the encryption first--then rotate, or we rotate the digits--then encrypt. Thus, the multiplication of the encrypted data by 12,689 is performed in several steps:

1 .Multiply each digit of the encrypted data by 1 and rotate the product left by four digits.

2. Multiply each digit of the encrypted data by 2 and rotate the product left by three digits.

3. Multiply each digit of the encrypted data by 6 and rotate the product left by two digits.

4. Multiply each digit of the encrypted data by 8 and rotate the product left by one digit.

5. Multiply each digit of the encrypted data by 9.

6. Add the results obtained in steps 1-5; the sum is the encrypted product.

The encrypted product obtained in step 6 can be decrypted by reversing the process of iterating equation (5). Of course, the arithmetic is performed in modulo 256; the adjustment required in order to express the decrypted product in decimal form is straightforward.

Here, the sum of the digits of the multiplier 12,689 is 26 which, when multiplied by 9 (the largest decimal digit), will give a value smaller than 256 and thus will not cause overflow. In case the sum is larger than or equal to 29 (which is the smallest integer when multiplied by 9 to obtain a value larger than 256), one has to divide the multiplier into two or more groups of digits--each group will contribute to a sum less than 29; or one can choose a larger value for the number of possible states k. For example, if k is chosen to be 2[.sup.16] (a 16-bit wordsize), one can have a multiplier that has more than one thousand decimal digits without causing an overflow in the multiplication; such a wide range covers all practical problems.

In this example, we want to add the bonus earned by each of the employees of a company to his or her annual salary. A plaintext consists of two employees' salaries in dollars; the length of each text is 14 characters (including blanks and decimal digits); data formed from the salaries of two other employees serve the former's background. Therefore, the data of four employees are processed at the same time. Bonuses have to be added to salaries. The function key that we have used (arbitrarily picked) in encryption and decryption is

[Mathematical Expression Omitted] (7)

Again the data is iterated for 10 times and the ASCII codes are shifted by 48. The data are added together in encrypted form; the sum is decrypted by reversing iterating the encrypted values ten times using the same function shown in (7).

Code Breaking

In order for a model to be used safely to encrypt data, we have to study how or under what conditions it can be broken. Although homomorphic functions are susceptible to code breaking [2], we need them for superimposing encrypted data. As an example, let us consider the following simple case on how the key can be broken by solving a set of linear equations.

Consider the case in which the number of states is k = 4, and the number of elements utilized to construct the f is L = 2. The time-reversal equation becomes

[Mathematical Expression Omitted] (8)

Suppose we express f in the form

[Mathematical Expression Omitted] (9)

Since k = 4, possible values of a[.sub.0], [.sub.a]1, a[.sub.2] = 0, 1, 2, 3; there are k[.sup.L + 1] = 4[.sup.3] = 64 such functions. Consider some matrix elements of three successive time steps:
```x[.sub.i](t - 1) 0 0 0 0 0 0 0 0                (10a)
x[.sub.i](t)     0 1 1 2 0 2 1 3                (10b)
x[.sub.i](t + 1) 2 0 1 1 3 2 2 3                (10c)
```

The elements of x[.sub.i](t - 1) are purposely set to 0 for convenience of discussion. By picking up some elements in (10) and substituting them into equations (8) and (9), we obtain the following simultaneous equations. From x[.sub.1], x[.sub.2] and x[.sub.5] we have

2 = a[.sub.0] + a[.sub.2] mod 4. (11a)

0 = a[.sub.0] + a[.sub.1] + a[.sub.2] mod 4. (11b)

3 = a[.sub.0] + 2a[.sub.2] mod 4. (11c)

Solving the simultaneous equations (11), we obtain a[.sub.0] = 1, a[.sub.1]= 2 and a[.sub.2] = 1, and thus the key f is determined. From this example, one can see that code breaking is relatively easy to accomplish by solving a set of simultaneous linear equations of order L + 1 if x[.sub.i](t - 1) is known. Of course, in our model x[.sub.i](t - 1) is confidential; only two consecutive encrypted texts are known to the public. However, x[.sub.i](t - 1) can be our plaintext and may be known by other methods. For example, the first plaintext to be sent to a computer user may consist of "Please Login"; or sometimes the plaintext is just the input data of a bank depositor who is manipulating his or her bank account. Therefore, to avoid the guessing of x[.sub.i](t - 1) correctly, it is advisable to iterate equation (1) several times (note that the total number of iterations performed in the encryption is also confidential). If x[.sub.i](t - 1) is unknown, the above method is no longer applicable. To make the code breaking difficult, one can choose a reasonably large k and L.

Code Breaking by Data Subtraction

One common technique employed by cryptographers to break a cipher key is to subtract one ciphertext from another. We illustrate this technique with an example given by [6]. The Vernam cipher generates a ciphertext bit stream by:

c[.sub.i] = k[.sub.i] - mi mod 2

(12)

where M = m[.sub.1]m[.sub.2] ... denotes a plaintext bit stream; K = k[.sub.1]k[.sub.2]....a key bit stream and C = c[.sub.1]c[.sub.2] denotes the ciphertext bit stream. (Note that in mod 2 arithmetic, addition and subtraction are equivalent and are both equal to the binary exclusive OR operation.) Suppose another plaintext bit stream M' is enciphered in the same way:

c[.sub.i]' = k[.sub.i] - m[.sub.i]' mod 2

(13)

By subtracting (13) from (12), we obtain

c[.sub.i]" = c[.sub.i] - c[.sub.i]' = m[.sub.i]' - m[.sub.i] mod 2 (14)

The stream C" is therefore equivalent to a stream generated by the encipherment of message M with message (key) M' [6] and may be broken if the messages are partially known; the key could then be obtained by k[.sub.i] = c[.sub.i] + m[.sub.i] (mod 2).

Our model is safe under this attack for the following reasons. Such a technique when applied to the current model would correspond to subtracting (3b) from (3a):

[Mathematical Expression Omitted] (15)

First, the term (f[{x[.sub.i](t)}] - f[{y[.sub.i](t)}) in (15) does not vanish in general. Even if it vanishes under some very special cases, unlike the Vernam cipher, y[.sub.i](t - 1) and x[.sub.i](t - 1) are not plaintexts as long as several iterations have been performed. Therefore, to secure the code from being broken by data subtraction, it is again better to iterate equation (1) several times to obtain the ciphertexts.

From these discussions, we can see the important role played by the arbitrary iterations. The iterations insulate a ciphertext from the corresponding plaintext. This renders the standard code breaking by subtraction technique inapplicable here.

The Arbitrary Background

If one still worries that there may exist some currently unknown method which can break the key by knowing part of the plaintexts, one can further secure the system by making use of the arbitrary background. Each time a plaintext is enciphered, a random background is used. Now each ciphertext consists of some random elements and one cannot obtain the key of something which is random; on the other hand, one cannot decompose the random elements without knowing the key. Thus interception of the ciphertext may not reveal anything new about the plaintext message. We should note that unlike the one-time pad model [6], the receiver does not need to have any knowledge about the random backgrounds in order to decipher the ciphertexts. Thus, the random backgrounds can be generated by any convenient and secure method.

As mentioned by [11], no technique exists to prove an encryption scheme is secure; the only test available is to see whether anyone can think of a way to break it. At the moment, we have not found a simple way to break the model. The security of the model needs to be examined in further detail. Any interested reader is urged to find a way to break the model. If the method can withstand various attacks for a sufficient length of time, it may be used with a certain amount of confidence.

Discussions

From these two examples, one can see that superimposing encrypted data based upon time-reversal transformations can be very easily implemented in solving practical problems. Due to the abundant choice of functions, the method may have the potential of acquiring high security--after a certain number of iterations, information from one region diffuses into another region and gets mixed up there. This is similar to the case where none can tell the original distribution of the color from a colored mixture [12]. This property is also the main difference between the current model and many of the traditional encryption models; the current model mixes both time and spatial correlations of the information; this phenomenon may be regarded as diffusion [6]. It makes discerning meaningful patterns from ciphertexts a complicated and difficult process; the phenomenon also makes it difficult to obtain significant information of the plaintexts by analysing the symbol occurrence frequency of the ciphertexts.

The fact that two ciphertexts can be superimposed while each retains its original pattern is analogous to the superposition of waves. The music produced by a piano, when superimposed with the sound produced by a singer, forms a wave; our ears are capable of resolving the components. When several laser beams of different colors meet, a new color is formed at the intersecting region; yet when they pass through each other, the original color of each beam is preserved. The coincidence of the similarities between the superposition of the encrypted data and the waves is not surprising when one recognizes that a traveling wave W(x,t) is described by the following equation:

[Mathematical Expression Omitted] (16)

where c = velocity of light. Upon discretizing equation (16) with W -> x, x -> i, t -> t, one can immediately see that the resulting discretized equation is in exactly the same form as the encryption equation (1). (In this example, f[{x[.sub.i]}] = c[.sup.2](x[.sup.1 + 1] + x[.sub.i- 1]) + 2(1 - c[.sup.2])x[.sub.i].) It is well known that any linear combination of the solutions to (16) is also a solution to (16).

The work presented here does not contradict the general theorems about encrypted-data computation discussed by [1]. Abadi et al. showed that if a function g(x) is computable in the expected polynomial time, then g(x) is encryptable hiding x. That is, one can always find an encryption scheme for which one cannot infer anything about x when it is processed in encrypted form-provided the "dataprocessing" is computable in expected polynomial time. In our model, the g(x) involves only simple multiplication and addition (i.e., superimposing data) which is computable in expected polynomial time. Hence it is reasonable that a safe encryption scheme exists.

The scheme presented here may be particularly useful in banking security. Most bank transactions only involve numerical addition and multiplication. While a banking system wants to protect its security from individuals, an individual also wants to protect his or her own interest from the bank [5]. If a customer's account is processed in encrypted form, it is possible to hide certain desired information from the bank [1, 3, 4, 5].

In closing, we would like to mention that the model presented can be further analyzed with algebraic techniques. To perform theoretical analysis, it may be better to analyse the model in Galois Fields, say, GF(2[.sup.8]); some properties related to the model have been obtained by Martin et al. [8].

Conclusions

We have discussed the possibilities of applying time-reversal transformations to superimpose data in encrypted forms; it turns out that the method is simple and straightforward to implement. The abundance of choice of the keys, the arbitrary number of iterations and the property of arbitrary backgrounds may make the operations safe. Two practical examples have been presented to highlight the model. The presentation here may shed light on the study of processing encrypted data.

References

1. Abadi, M., Feigenbaum, J., and Kilian, J. On hiding information from an oracle. J. Comput. Syst. Sci. 39 (1989), 21-50.

2. Ahitub, N., Lapid, Y., and Neumann, S. Processing encrypted data. Commun. ACM 30, 9 (Sept. 1987), 777.

3. Bennett, C.H., Brassard, G., and Robert, J. Privacy amplification by public discussion. SIAM J. Comput. 17 (Apr. 1988), 210-229.

4. Brassard, G., Chaum, D., and Crepeau, C. Minimum disclosure proofs of knowledge. J. Comput. Syst. Sci. 37 (1988), 156-189.

5. Chaum, D. Security without identification: Transaction systems to make Big Brother obsolete. Commun. ACM 28 (Oct. 1985), 1030-1044.

7. Denning, D.E., and Denning, P.J. Data security. ACM Comput. Surv. 11, 3 (Sept. 1979), 227.

8. Martin, O., Odlyzko, A.M., and Wolfram, S. Algebraic properties of cellular automata. Commun. Math. Phys. 93 (1984), 219-258.

9. Popek, G.J., and Kline, C.S. Encryption and secure computer networks. ACM Comput. Surv. 11, 4 (Dec. 1979), 331.

10. Rivest, R.L., Adleman, L., and Dertouzos, M.L. On data banks and privacy homomorphisms. In Foundations of Secure Computations, R.A. DeMillo, D.P. Dobkin, A.K. Jones, and R.J. Lipton, Eds., Academic Press, N.Y., 1978, 169-179.

11. Rivest, R.L., Shamir, A., and Adleman, L. A method for obtaining digital signatures and public key cryptosystems. Commun. ACM 21 (Feb. 1978), 120-126.

12. Yu, K.W., and Yu, T.L. Data encryption based upon time reversal transformations. Comput. J. 32, 3 (june 1989), 241-245.

CR Categories and Subject Descriptors: E.3 [Data]: Data Encryption-data encryption standard (DES); public key cryptosystems

General Terms: Algorithms, Security

Additional Key Words and Phrases: Bank transactions, cryptography, data security, privacy, time-reversal transformations.

K.W. YU is a lecturer in the Department of Applied Physics at Hong Kong Polytechnic, Hung Hom, Kowloon, Hong Kong. His research concentrates on information science, computational physics and condensed matter physics. Author's Present Address: Department of Physics, The Chinese University of Hong Kong, Shatin, Hong Kong, China.

TONG LAI YU is an assistant professor of Computer Science at California State, San Bernardino and a technical consultant to Ah Shui Neurocomputing Inc. His current research interests include neural network theories and applications, parallel computing, pattern recognition, speech processing, data encoding, computer hardware education and Chinese language processing. Author's Present Address: Department of Computer Science, California State University, 5500 University Parkway, San Bernardino, CA 92407. ptongyu@calstate
COPYRIGHT 1991 Association for Computing Machinery, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.