Merkle–Hellman knapsack cryptosystem

The Merkle–Hellman knapsack cryptosystem was one of the earliest public key cryptosystems. It was published by Ralph Merkle and Martin Hellman in 1978. A polynomial time attack was published by Adi Shamir in 1984. As a result, the cryptosystem is now considered insecure.

History
The concept of public key cryptography was introduced by Whitfield Diffie and Martin Hellman in 1976. At that time they proposed the general concept of a "trap-door one-way function", a function whose inverse is computationally infeasible to calculate without some secret "trap-door information", but they had not yet found a practical example of such a function. Several specific public-key cryptosystems were then proposed by other researchers over the next few years, such as RSA in 1977 and Merkle–Hellman in 1978.

Description
Merkle–Hellman is a public key cryptosystem, meaning that two keys are used: a public key for encryption and a private key for decryption. It is based on the subset sum problem (a special case of the knapsack problem). The problem is as follows: given a set of integers $$A$$ and an integer $$c$$, find a subset of $$A$$ which sums to $$c$$. In general, this problem is known to be NP-complete. However, if $$A$$ is superincreasing, meaning that each element of the set is greater than the sum of all smaller elements of the set, the problem is "easy" and solvable in polynomial time with a simple greedy algorithm.
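The superincreasing property can be checked in a single pass. The following is a minimal sketch; the function name `is_superincreasing` is illustrative, and it assumes the sequence is given in increasing order and consists of positive integers.

```python
def is_superincreasing(seq):
    """Return True if every element exceeds the sum of all elements before it."""
    running_total = 0
    for value in seq:
        # Each element must be strictly greater than the sum of its predecessors.
        if value <= running_total:
            return False
        running_total += value
    return True


print(is_superincreasing([1, 2, 4, 8, 16]))  # True
print(is_superincreasing([1, 2, 3, 8, 16]))  # False, because 3 is not greater than 1 + 2
```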

In Merkle–Hellman, decrypting a message requires solving an apparently "hard" knapsack problem. The private key contains a superincreasing list of numbers $$W$$, and the public key contains a non-superincreasing list of numbers $$B$$, which is actually a "disguised" version of $$W$$. The private key also contains some "trapdoor" information that can be used to transform a hard knapsack problem using $$B$$ into an easy knapsack problem using $$W$$.

Unlike some other public key cryptosystems such as RSA, the two keys in Merkle–Hellman are not interchangeable; the private key cannot be used for encryption. Thus Merkle–Hellman is not directly usable for authentication by cryptographic signing, although Shamir published a variant that can be used for signing.

Key generation
1. Choose a block size $$n$$. Integers up to $$n$$ bits in length can be encrypted with this key.

2. Choose a random superincreasing sequence of $$n$$ positive integers
 * $$W = ( w_1, w_2, \dots, w_n )$$
 * The superincreasing requirement means that $$w_k > \sum_{i = 1}^{k-1} w_i$$, for $$1 < k \le n$$.

3. Choose a random integer $$q$$ such that
 * $$q > \sum_{i = 1}^n w_i$$

4. Choose a random integer $$r$$ such that $$\gcd(r,q) = 1$$ (that is, $$r$$ and $$q$$ are coprime).

5. Calculate the sequence
 * $$B = ( b_1, b_2, \dots, b_n )$$
 * where $$b_i = r w_i \bmod q$$.

The public key is $$B$$ and the private key is $$(W,q,r)$$.
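The key generation steps above might be sketched as follows. This is an illustration only: the function name `generate_keypair`, the way the random values are drawn, and the restriction of $$r$$ to the range $$2 \le r < q$$ are assumptions made for the sketch and are not specified by the description above.

```python
import math
import secrets


def generate_keypair(n):
    """Return (public_key, private_key) for n-bit messages; illustrative only."""
    if n < 2:
        raise ValueError("this sketch assumes a block size of at least 2 bits")

    # Step 2: build a superincreasing sequence W by always choosing the next
    # element strictly greater than the sum of the elements chosen so far.
    w = []
    total = 0
    for _ in range(n):
        next_value = total + 1 + secrets.randbelow(total + 2)
        w.append(next_value)
        total += next_value

    # Step 3: choose q greater than the sum of W.
    q = total + 1 + secrets.randbelow(total + 1)

    # Step 4: choose r coprime to q (assumed range: 2 <= r < q).
    while True:
        r = 2 + secrets.randbelow(q - 2)
        if math.gcd(r, q) == 1:
            break

    # Step 5: disguise W as B by modular multiplication.
    b = [(r * w_i) % q for w_i in w]
    return b, (w, q, r)


public_key, private_key = generate_keypair(8)
print(public_key)
print(private_key)
```

The output differs on every run because the values are drawn at random.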

Encryption
Let $$m$$ be an $$n$$-bit message consisting of bits $$m_1 m_2 \dots m_n$$, with $$m_1$$ the highest order bit. Select each $$b_i$$ for which $$m_i$$ is nonzero, and add them together. Equivalently, calculate
 * $$c = \sum_{i = 1}^n m_i b_i$$.

The ciphertext is $$c$$.
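As a sketch, encryption amounts to extracting the bits of $$m$$ from the highest-order bit down and summing the selected public key elements. The function name `encrypt` is illustrative; the key used in the usage line is the public key from the worked example in this article.

```python
def encrypt(message, public_key):
    """Encrypt an n-bit integer, where n is the length of the public key."""
    n = len(public_key)
    if not 0 <= message < 2 ** n:
        raise ValueError("message does not fit in n bits")
    # bits[0] corresponds to m_1, the highest-order bit of the message.
    bits = [(message >> (n - 1 - i)) & 1 for i in range(n)]
    return sum(bit * b_i for bit, b_i in zip(bits, public_key))


print(encrypt(97, [295, 592, 301, 14, 28, 353, 120, 236]))  # 1129
```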

Decryption
To decrypt a ciphertext $$c$$, we must find the subset of $$B$$ which sums to $$c$$. We do this by transforming the problem into one of finding a subset of $$W$$. That problem can be solved in polynomial time since $$W$$ is superincreasing.

1. Calculate the modular inverse of $$r$$ modulo $$q$$ using the extended Euclidean algorithm. The inverse exists because $$r$$ is coprime to $$q$$.
 * $$r' := r^{-1} \pmod q$$
 * The computation of $$r'$$ is independent of the message, and can be done just once when the private key is generated.

2. Calculate
 * $$c' := c r' \bmod q$$

3. Solve the subset sum problem for $$c'$$ using the superincreasing sequence $$W$$, by the simple greedy algorithm described below. Let $$X = (x_1, x_2, \dots, x_k)$$ be the resulting list of indexes of the elements of $$W$$ which sum to $$c'$$. (That is, $$c' = \sum_{i=1}^k w_{x_i}$$.)

4. Construct the message $$m$$ with a 1 in each $$x_i$$ bit position and a 0 in all other bit positions:
 * $$m = \sum_{i=1}^k 2^{n-x_i}$$
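The four decryption steps can be sketched as below. The names `extended_gcd` and `decrypt` are illustrative. The sketch scans $$W$$ once from the largest element to the smallest, which for a superincreasing sequence selects the same elements as the greedy algorithm, and it sets the bits of $$m$$ directly rather than building the index list $$X$$ first.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def decrypt(ciphertext, private_key):
    """Recover the n-bit message from a ciphertext using (W, q, r)."""
    w, q, r = private_key
    n = len(w)

    # Step 1: modular inverse of r modulo q.
    g, r_inv, _ = extended_gcd(r, q)
    if g != 1:
        raise ValueError("r and q are not coprime")
    r_inv %= q

    # Step 2: map the ciphertext to a subset sum over W.
    c_prime = (ciphertext * r_inv) % q

    # Steps 3 and 4: pick elements of W greedily, setting the matching bit.
    message = 0
    for index in range(n, 0, -1):  # 1-based index, largest element first
        w_i = w[index - 1]
        if w_i <= c_prime:
            c_prime -= w_i
            message += 2 ** (n - index)
    if c_prime != 0:
        raise ValueError("ciphertext is not valid for this key")
    return message


print(decrypt(1129, ([2, 7, 11, 21, 42, 89, 180, 354], 881, 588)))  # 97
```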

Solving the subset sum problem
This simple greedy algorithm finds the subset of a superincreasing sequence $$W$$ which sums to $$c'$$, in polynomial time:


1. Initialize $$X$$ to an empty list.

2. Find the largest element in $$W$$ which is less than or equal to $$c'$$, say $$w_j$$.

3. Subtract: $$c' := c' - w_j$$.

4. Append $$j$$ to the list $$X$$.

5. If $$c'$$ is greater than zero, return to step 2.
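A minimal sketch of this greedy procedure follows; the function name `solve_superincreasing_subset_sum` is illustrative. Instead of searching for the largest fitting element on every iteration, it makes one pass from the largest element down, which yields the same list for a superincreasing sequence. It also returns an empty list when the target is zero and raises an error when no subset matches, two cases the steps above leave implicit.

```python
def solve_superincreasing_subset_sum(w, target):
    """Return 1-based indexes of the elements of w that sum to target.

    Assumes w is superincreasing and listed in increasing order.
    """
    indexes = []
    remaining = target
    for j in range(len(w), 0, -1):
        if w[j - 1] <= remaining:
            remaining -= w[j - 1]
            indexes.append(j)
    if remaining != 0:
        raise ValueError("no subset of w sums to target")
    return indexes


print(solve_superincreasing_subset_sum([2, 7, 11, 21, 42, 89, 180, 354], 372))
# [8, 3, 2]
```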

Example
Key generation
Create a key to encrypt 8-bit numbers by creating a random superincreasing sequence of 8 values:
 * $$W = ( 2, 7, 11, 21, 42, 89, 180, 354 )$$

The sum of these is 706, so select a larger value for $$q$$:
 * $$q = 881$$.

Choose $$r$$ to be coprime to $$q$$:
 * $$r = 588$$.

Construct the public key $$B$$ by multiplying each element in $$W$$ by $$r$$ modulo $$q$$:


 * $$\begin{align}
&(2 \times 588) \bmod 881 = 295 \\
&(7 \times 588) \bmod 881 = 592 \\
&(11 \times 588) \bmod 881 = 301 \\
&(21 \times 588) \bmod 881 = 14 \\
&(42 \times 588) \bmod 881 = 28 \\
&(89 \times 588) \bmod 881 = 353 \\
&(180 \times 588) \bmod 881 = 120 \\
&(354 \times 588) \bmod 881 = 236
\end{align}$$

Hence $$B = ( 295, 592, 301, 14, 28, 353, 120, 236 )$$.
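The arithmetic can be reproduced in a few lines. This sketch uses the values of $$W$$, $$q$$ and $$r$$ from the example and also checks the two conditions on $$q$$ and $$r$$ required by key generation.

```python
import math

w = [2, 7, 11, 21, 42, 89, 180, 354]
q = 881
r = 588

assert q > sum(w)            # sum(w) is 706
assert math.gcd(r, q) == 1   # r and q are coprime

b = [(r * w_i) % q for w_i in w]
print(b)  # [295, 592, 301, 14, 28, 353, 120, 236]
```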

Encryption
Let the 8-bit message be $$m = 97 = 01100001_2$$. We multiply each bit by the corresponding number in $$B$$ and add the results:
 * $$0 \times 295 + 1 \times 592 + 1 \times 301 + 0 \times 14 + 0 \times 28 + 0 \times 353 + 0 \times 120 + 1 \times 236 = 1129$$

The ciphertext $$c$$ is 1129.
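The same sum, written as a short check that takes the bits of 97 in order from the highest-order bit:

```python
b = [295, 592, 301, 14, 28, 353, 120, 236]
m = 97

# format(m, "08b") gives the 8-bit binary string, highest-order bit first.
bits = [int(bit) for bit in format(m, "08b")]
c = sum(bit * b_i for bit, b_i in zip(bits, b))

print(bits)  # [0, 1, 1, 0, 0, 0, 0, 1]
print(c)     # 1129
```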

Decryption
To decrypt 1129, first use the extended Euclidean algorithm to find the modular inverse of $$r$$ mod $$q$$:
 * $$r' = r^{-1} \bmod q = 588^{-1} \bmod 881 = 442$$.

Compute $$c' = c r' \bmod q = 1129 \times 442 \bmod 881 = 372$$.

Use the greedy algorithm to decompose 372 into a sum of $$w_i$$ values:
 * $$\begin{align}
c' &= 372 \\
& w_8 = 354 \le 372 \\
c' &= 372-354 = 18 \\
& w_3 = 11 \le 18 \\
c' &= 18-11 = 7 \\
& w_2 = 7 \le 7 \\
c' &= 7-7 = 0
\end{align}$$

Thus $$372 = 354 + 11 + 7 = w_8 + w_3 + w_2$$, and the list of indexes is $$X = (8,3,2)$$. The message can now be computed as
 * $$m = \sum_{i=1}^3 2^{n-x_i} = 2^{8-8} + 2^{8-3} + 2^{8-2} = 1 + 32 + 64 = 97$$.
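The whole decryption example can be replayed as a short script. It is a sketch that uses Python's built-in `pow(r, -1, q)` for the modular inverse (available from Python 3.8) in place of a hand-written extended Euclidean algorithm.

```python
w = [2, 7, 11, 21, 42, 89, 180, 354]
q, r = 881, 588
c = 1129
n = len(w)

r_inv = pow(r, -1, q)        # modular inverse of r modulo q
c_prime = (c * r_inv) % q

# Greedy decomposition of c_prime over the superincreasing sequence w.
indexes = []
remaining = c_prime
for j in range(n, 0, -1):
    if w[j - 1] <= remaining:
        remaining -= w[j - 1]
        indexes.append(j)

m = sum(2 ** (n - x) for x in indexes)
print(r_inv, c_prime, indexes, m)  # 442 372 [8, 3, 2] 97
```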

Cryptanalysis
In 1984 Adi Shamir published an attack on the Merkle–Hellman cryptosystem which can decrypt encrypted messages in polynomial time without using the private key. The attack analyzes the public key $$B = (b_1, b_2, \dots, b_n)$$ and searches for a pair of numbers $$u$$ and $$m$$ (here $$m$$ denotes a modulus, not the message) such that $$(u b_i \bmod m)$$ is a superincreasing sequence. The $$(u,m)$$ pair found by the attack may not be equal to $$(r',q)$$ in the private key, but like that pair it can be used to transform a hard knapsack problem using $$B$$ into an easy problem using a superincreasing sequence. The attack operates solely on the public key; no access to encrypted messages is necessary.
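To make the idea of a usable $$(u,m)$$ pair concrete, the sketch below searches exhaustively for such pairs for the small public key from the example. This is not Shamir's algorithm: it is a brute-force search that is only feasible for toy parameters, and the function names and the search bound `max_modulus` are assumptions made for the illustration. It additionally requires the transformed sequence to sum to less than the modulus, so that the transformed ciphertext equals the subset sum without wrapping around.

```python
import math


def is_superincreasing(seq):
    """Return True if every element exceeds the sum of all elements before it."""
    total = 0
    for value in seq:
        if value <= total:
            return False
        total += value
    return True


def find_trapdoor_pairs(b, max_modulus):
    """Brute-force search for (u, modulus) pairs that undo the disguise of b."""
    pairs = []
    for modulus in range(max(b) + 1, max_modulus + 1):
        for u in range(1, modulus):
            if math.gcd(u, modulus) != 1:
                continue
            candidate = [(u * b_i) % modulus for b_i in b]
            if is_superincreasing(candidate) and sum(candidate) < modulus:
                pairs.append((u, modulus))
    return pairs


pairs = find_trapdoor_pairs([295, 592, 301, 14, 28, 353, 120, 236], 890)
print((442, 881) in pairs)  # True: the private pair (r', q) is among the results
print(len(pairs))
```

Any pair in the result can be used with the decryption steps in place of $$(r', q)$$, decoding against the transformed sequence instead of $$W$$.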

Shamir's attack on the Merkle–Hellman cryptosystem works in polynomial time even if the numbers in the public key are randomly shuffled. Such shuffling is usually not included in descriptions of the cryptosystem, but it can help against some more primitive attacks.