Ciphertext indistinguishability

Ciphertext indistinguishability is a property of many encryption schemes. Intuitively, if a cryptosystem possesses the property of indistinguishability, then an adversary will be unable to distinguish pairs of ciphertexts based on the message they encrypt. The property of indistinguishability under chosen plaintext attack is considered a basic requirement for most provably secure public key cryptosystems, though some schemes also provide indistinguishability under chosen ciphertext attack and adaptive chosen ciphertext attack. Indistinguishability under chosen plaintext attack is equivalent to the property of semantic security, and many cryptographic proofs use these definitions interchangeably.

A cryptosystem is considered secure in terms of indistinguishability if no adversary, given an encryption of a message randomly chosen from a two-element message space determined by the adversary, can identify the message choice with probability significantly better than that of random guessing ($1/2$). If any adversary can succeed in distinguishing the chosen ciphertext with a probability significantly greater than $1/2$, then this adversary is considered to have an "advantage" in distinguishing the ciphertext, and the scheme is not considered secure in terms of indistinguishability. This definition encompasses the notion that in a secure scheme, the adversary should learn no information from seeing a ciphertext. Therefore, the adversary should be able to do no better than if it guessed randomly.

Formal definitions
Security in terms of indistinguishability has many definitions, depending on assumptions made about the capabilities of the attacker. It is normally presented as a game, where the cryptosystem is considered secure if no adversary can win the game with significantly greater probability than an adversary who must guess randomly. The most common definitions used in cryptography are indistinguishability under chosen plaintext attack (abbreviated IND-CPA), indistinguishability under (non-adaptive) chosen ciphertext attack (IND-CCA1), and indistinguishability under adaptive chosen ciphertext attack (IND-CCA2). Security under either of the latter two definitions implies security under the preceding ones: a scheme which is IND-CCA1 secure is also IND-CPA secure, and a scheme which is IND-CCA2 secure is both IND-CCA1 and IND-CPA secure. Thus, IND-CCA2 is the strongest of the three definitions of security.

Indistinguishability under chosen-plaintext attack (IND-CPA)
For a probabilistic asymmetric key encryption algorithm, indistinguishability under chosen plaintext attack (IND-CPA) is defined by the following game between an adversary and a challenger. For schemes based on computational security, the adversary is modeled by a probabilistic polynomial time Turing machine, meaning that it must complete the game and output a guess within a polynomial number of time steps. In this definition E(PK, M) represents the encryption of a message M under the key PK:


 * 1) The challenger generates a key pair PK, SK based on some security parameter k (e.g., a key size in bits), and publishes PK to the adversary. The challenger retains SK.
 * 2) The adversary may perform a polynomially bounded number of encryptions or other operations.
 * 3) Eventually, the adversary submits two distinct chosen plaintexts $$\scriptstyle M_0, M_1$$ to the challenger.
 * 4) The challenger selects a bit b $$\scriptstyle \in$$ {0, 1} uniformly at random, and sends the challenge ciphertext C = E(PK, $$\scriptstyle M_b$$) back to the adversary.
 * 5) The adversary is free to perform any number of additional computations or encryptions.
 * 6) Finally, the adversary outputs a guess for the value of b.

A cryptosystem is indistinguishable under chosen plaintext attack if every probabilistic polynomial time adversary has only a negligible "advantage" over random guessing. An adversary is said to have a negligible "advantage" if it wins the above game with probability $$\scriptstyle \left(\frac{1}{2}\right) \,+\, \epsilon(k)$$, where $$\scriptstyle \epsilon(k)$$ is a negligible function in the security parameter k; that is, for every (nonzero) polynomial function $$\scriptstyle poly$$ there exists $$\scriptstyle k_0$$ such that $$\scriptstyle |\epsilon(k)| \;<\; \left|\frac{1}{poly(k)}\right|$$ for all $$\scriptstyle k \;>\; k_0$$.
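
As a worked illustration of the negligibility condition, consider two candidate functions. Both are chosen here purely as examples and are not part of the definition itself. The first is negligible because an exponential eventually outgrows any fixed power of k (and any polynomial is eventually bounded by such a power); the second is not, because a single polynomial already violates the bound.

$$\epsilon(k) = 2^{-k}: \quad \lim_{k \to \infty} \frac{k^c}{2^k} = 0 \text{ for every constant } c, \text{ so } 2^{-k} < \frac{1}{k^c} \text{ for all sufficiently large } k.$$

$$\epsilon(k) = \frac{1}{k^2}: \quad \text{with } poly(k) = k^3, \; \frac{1}{k^2} > \frac{1}{k^3} \text{ for all } k > 1, \text{ so no suitable } k_0 \text{ exists}.$$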

Although the adversary knows $$\scriptstyle M_0$$, $$\scriptstyle M_1$$ and PK, the probabilistic nature of E means that the encryption of $$\scriptstyle M_b$$ will be only one of many valid ciphertexts, and therefore encrypting $$\scriptstyle M_0$$, $$\scriptstyle M_1$$ and comparing the resulting ciphertexts with the challenge ciphertext does not afford any non-negligible advantage to the adversary.
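
The role of probabilistic encryption can be seen by running the game against a deterministic scheme. The following Python sketch is only an illustration under stated assumptions: it uses textbook RSA with toy parameters (n = 61 · 53, e = 17) as a stand-in for E, and the names `keygen`, `encrypt`, `ind_cpa_game`, `choose` and `guess` are hypothetical helpers introduced here. Because the same PK and message always give the same ciphertext, the adversary can re-encrypt $$\scriptstyle M_0$$ and compare.

```python
import secrets

def keygen():
    """Toy textbook-RSA key pair (hypothetical, insecure parameters)."""
    n = 61 * 53                 # modulus 3233
    return (n, 17), (n, 2753)   # PK = (n, e), SK = (n, d)

def encrypt(pk, m):
    """Deterministic textbook RSA: the same (PK, M) always gives the same C."""
    n, e = pk
    return pow(m, e, n)

def ind_cpa_game(choose, guess):
    """Run one round of the game; return True if the adversary wins."""
    pk, _sk = keygen()                  # step 1: challenger keeps SK
    m0, m1 = choose(pk)                 # steps 2-3: adversary picks M_0 != M_1
    assert m0 != m1
    b = secrets.randbelow(2)            # step 4: uniform challenge bit
    c = encrypt(pk, (m0, m1)[b])
    return guess(pk, m0, m1, c) == b    # steps 5-6: adversary outputs a guess

def choose(pk):
    return 42, 1337                     # any two distinct messages below n

def guess(pk, m0, m1, c):
    # Re-encrypt M_0 under the public key and compare with the challenge.
    return 0 if encrypt(pk, m0) == c else 1

trials = 200
wins = sum(ind_cpa_game(choose, guess) for _ in range(trials))
print(f"adversary won {wins} of {trials} rounds")
```

In this sketch the adversary wins every round, so its advantage is far from negligible. A probabilistic E defeats this particular comparison strategy, although randomization alone does not make a scheme secure.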

While the above definition is specific to an asymmetric key cryptosystem, it can be adapted to the symmetric case by replacing the public key encryption function with an encryption oracle, which retains the secret encryption key and encrypts arbitrary plaintexts at the adversary's request.

Symmetric IND-CPA Game, Formalized
The adversarial process of performing a chosen-plaintext attack is usually outlined in the form of a cryptographic game. For symmetric IND-CPA, the game described above is restated in terms of an encryption oracle. Let $$ \mathcal{K} $$ be a key generation function, $$ \mathcal{E} $$ be an encryption function, and $$ \mathcal{D} $$ be a decryption function. Let $$ \mathcal{S}\mathcal{E} = (\mathcal{K}, \mathcal{E}, \mathcal{D}) $$ be a symmetric encryption scheme. The game $$\mathrm{Guess}$$ is defined as follows.



As many times as it would like, the adversary selects two plaintext messages of its own choosing and provides them to the LR (left-or-right) oracle, which returns a ciphertext encrypting one of the two messages. The bit b, chosen uniformly at random at the beginning of the game, determines which message the LR oracle encrypts, and the adversary wins if it guesses b correctly. Its advantage is therefore defined as: $$\operatorname{Adv}_{\mathcal{SE}}^{\mathrm{ind\text{-}cpa}}(A) = 2 \cdot \Pr\left[ \mathrm{Guess}_{\mathcal{SE}}^A \Rightarrow \mathrm{true} \right] - 1$$
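
The game and the advantage formula can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The two toy schemes, which use HMAC-SHA256 as a pseudorandom pad, the helper names, and the requirement that both LR inputs have equal length are all assumptions made for this example. The adversary first asks for LR(x, x), then for LR(x, y), and checks whether the two ciphertexts match.

```python
import hashlib
import hmac
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def pad(key, nonce, length):
    # HMAC-SHA256 as a pseudorandom pad; messages are limited to 32 bytes here.
    return hmac.new(key, nonce, hashlib.sha256).digest()[:length]

def enc_deterministic(key, m):
    # Hypothetical toy scheme: the same pad is reused for every message.
    return xor(m, pad(key, b"", len(m)))

def enc_randomized(key, m):
    # Hypothetical toy scheme: a fresh random nonce selects a fresh pad.
    nonce = secrets.token_bytes(16)
    return nonce + xor(m, pad(key, nonce, len(m)))

def guess_game(encrypt, adversary):
    """One run of the Guess game; True means the adversary guessed b."""
    b = secrets.randbelow(2)            # chosen once, at the start of the game
    key = secrets.token_bytes(32)       # output of the key generation function

    def lr(m0, m1):
        assert len(m0) == len(m1)       # assumed: LR inputs have equal length
        return encrypt(key, (m0, m1)[b])

    return adversary(lr) == b

def adversary(lr):
    x, y = b"attack at dawn!!", b"retreat at dusk!"
    reference = lr(x, x)                # always an encryption of x
    challenge = lr(x, y)                # encrypts x if b = 0, y if b = 1
    return 0 if challenge == reference else 1

def advantage(encrypt, trials=2000):
    wins = sum(guess_game(encrypt, adversary) for _ in range(trials))
    return 2 * wins / trials - 1        # empirical estimate of 2*Pr[win] - 1

adv_det = advantage(enc_deterministic)
adv_rand = advantage(enc_randomized)
print("deterministic scheme:", adv_det)
print("randomized scheme:   ", adv_rand)
```

Against the deterministic scheme this adversary always wins, giving an estimated advantage of 1. Against the randomized scheme the same strategy wins only about half the time, so the estimate stays close to 0.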

Indistinguishability under chosen ciphertext attack/adaptive chosen ciphertext attack (IND-CCA1, IND-CCA2)
Indistinguishability under non-adaptive and adaptive chosen ciphertext attack (IND-CCA1, IND-CCA2) uses a definition similar to that of IND-CPA. However, in addition to the public key (or encryption oracle, in the symmetric case), the adversary is given access to a decryption oracle which decrypts arbitrary ciphertexts at the adversary's request, returning the plaintext. In the non-adaptive definition, the adversary is allowed to query this oracle only up until it receives the challenge ciphertext. In the adaptive definition, the adversary may continue to query the decryption oracle even after it has received a challenge ciphertext, with the caveat that it may not pass the challenge ciphertext for decryption (otherwise, the adversary could simply decrypt the challenge and the definition would be trivial).


 * 1) The challenger generates a key pair PK, SK based on some security parameter k (e.g., a key size in bits), and publishes PK to the adversary. The challenger retains SK.
 * 2) The adversary may make any number of calls to the encryption and decryption oracles on arbitrary inputs of its choice, or perform other operations.
 * 3) Eventually, the adversary submits two distinct chosen plaintexts $$\scriptstyle M_0,\, M_1$$ to the challenger.
 * 4) The challenger selects a bit b ∈ {0, 1} uniformly at random, and sends the "challenge" ciphertext C = E(PK, $$\scriptstyle M_b$$) back to the adversary.
 * 5) The adversary is free to perform any number of additional computations or encryptions.
 * 6) In the non-adaptive case (IND-CCA1), the adversary may not make further calls to the decryption oracle.
 * 7) In the adaptive case (IND-CCA2), the adversary may make further calls to the decryption oracle, but may not submit the challenge ciphertext C.
 * 8) Finally, the adversary outputs a guess for the value of b.

A scheme is IND-CCA1/IND-CCA2 secure if no adversary has a non-negligible advantage in winning the above game.
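
The adaptive game can be illustrated with a scheme that resists the simple comparison attack but is malleable. The Python sketch below is a simplified illustration under stated assumptions. It uses the symmetric setting and a hypothetical toy scheme that XORs the message with an HMAC-SHA256 pad. It omits the encryption oracle and pre-challenge decryption queries because this adversary does not need them. All function names are introduced here. The adversary flips one bit of the challenge ciphertext, which makes it a different ciphertext that the decryption oracle will accept.

```python
import hashlib
import hmac
import secrets

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key, m):
    # Hypothetical toy scheme: random nonce, then message XOR pseudorandom pad.
    nonce = secrets.token_bytes(16)
    stream = hmac.new(key, nonce, hashlib.sha256).digest()[:len(m)]
    return nonce + xor(m, stream)

def decrypt(key, c):
    nonce, body = c[:16], c[16:]
    stream = hmac.new(key, nonce, hashlib.sha256).digest()[:len(body)]
    return xor(body, stream)

def ind_cca2_game(choose, guess):
    """One run of the adaptive game; True means the adversary guessed b."""
    key = secrets.token_bytes(32)
    m0, m1 = choose()
    b = secrets.randbelow(2)
    challenge = encrypt(key, (m0, m1)[b])

    def dec_oracle(c):
        # The only restriction: the challenge itself may not be submitted.
        if c == challenge:
            raise ValueError("challenge ciphertext may not be submitted")
        return decrypt(key, c)

    return guess(challenge, dec_oracle) == b

M0, M1 = b"yes", b"no!"

def choose():
    return M0, M1

def guess(challenge, dec_oracle):
    # Flip the lowest bit of the last byte: a new, related ciphertext.
    tweaked = challenge[:-1] + bytes([challenge[-1] ^ 1])
    leaked = dec_oracle(tweaked)
    # Undo the same flip in the returned plaintext to recover M_b.
    recovered = leaked[:-1] + bytes([leaked[-1] ^ 1])
    return 0 if recovered == M0 else 1

trials = 200
wins = sum(ind_cca2_game(choose, guess) for _ in range(trials))
print(f"adversary won {wins} of {trials} rounds")
```

In this sketch the adversary wins every round, so the toy scheme is not IND-CCA2 secure. The attack needs a decryption query after the challenge has been received, so it is not available to an IND-CCA1 adversary.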

Indistinguishable from random noise
Sometimes we need encryption schemes in which the ciphertext string is indistinguishable from a random string by the adversary.

If an adversary is unable to tell if a message even exists, it gives the person who wrote the message plausible deniability.

Some people building encrypted communication links prefer to make the contents of each encrypted datagram indistinguishable from random data, in order to make traffic analysis more difficult.

Some people building systems to store encrypted data prefer to make the data indistinguishable from random data in order to make data hiding easier. For example, some kinds of disk encryption such as TrueCrypt attempt to hide data in the innocent random data left over from some kinds of data erasure. As another example, some kinds of steganography attempt to hide data by making it match the statistical characteristics of the innocent "random" image noise in digital photos.

To support such deniable encryption systems, a few cryptographic algorithms are specifically designed to make ciphertext messages indistinguishable from random bit strings.

Most applications do not require an encryption algorithm to produce encrypted messages that are indistinguishable from random bits. However, some authors consider such encryption algorithms to be conceptually simpler, easier to work with, and more versatile in practice. Most IND-CPA encryption algorithms apparently do, in fact, produce encrypted messages that are indistinguishable from random bits.

Equivalences and implications
Indistinguishability is an important property for maintaining the confidentiality of encrypted communications. However, the property of indistinguishability has in some cases been found to imply other, apparently unrelated security properties. Sometimes these implications go in both directions, making two definitions equivalent; for example, it is known that the property of indistinguishability under adaptive chosen ciphertext attack (IND-CCA2) is equivalent to the property of non-malleability under the same attack scenario (NM-CCA2). This equivalence is not immediately obvious, as non-malleability is a property dealing with message integrity, rather than confidentiality. In other cases, it has been demonstrated that indistinguishability can be combined with certain other definitions, in order to imply still other useful definitions, and vice versa. The following list summarizes a few known implications, though it is by no means complete.

The notation $$\scriptstyle A \;\Rightarrow\; B$$ means that property A implies property B. $$\scriptstyle A \;\Leftrightarrow\; B$$ means that properties A and B are equivalent. $$\scriptstyle A \;\not \Rightarrow\; B$$ means that property A does not necessarily imply property B.


 * IND-CPA $$\scriptstyle \Leftrightarrow$$ semantic security under CPA.
 * NM-CPA (non-malleability under chosen plaintext attack) $$\scriptstyle \Rightarrow$$ IND-CPA.
 * NM-CPA (non-malleability under chosen plaintext attack) $$\scriptstyle \not \Rightarrow$$ IND-CCA2.
 * NM-CCA2 (non-malleability under adaptive chosen ciphertext attack) $$\scriptstyle \Leftrightarrow$$ IND-CCA2.