Autokey cipher

An autokey cipher (also known as the autoclave cipher) is a cipher that incorporates the message (the plaintext) into the key. The key is generated from the message in some automated fashion, sometimes by selecting certain letters from the text or, more commonly, by adding a short primer key to the front of the message.

There are two forms of autokey cipher: key-autokey and text-autokey ciphers. A key-autokey cipher uses previous members of the keystream to determine the next element in the keystream. A text-autokey uses the previous message text to determine the next element in the keystream.

History
This cipher was invented in 1586 by Blaise de Vigenère with a reciprocal table of ten alphabets. Vigenère's version used an agreed-upon letter of the alphabet as a primer, making the key by writing down that letter and then the rest of the message.

More popular autokeys use a tabula recta, a square with 26 copies of the alphabet, the first line starting with 'A', the next line starting with 'B', and so on. Instead of a single letter, a short agreed-upon keyword is used as the primer, and the key is generated by writing down the primer and then the rest of the message, as in Vigenère's version. To encrypt the first letter of a plaintext, the row for the first letter of the message and the column for the first letter of the key are located. The letter at which the row and the column cross is the ciphertext letter. The remaining letters are encrypted the same way, one message letter and one key letter at a time.
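The table lookup can be sketched in a few lines of Python. This is only an illustration, assuming the 26-letter English alphabet; the names `tabula_recta` and `lookup` are invented for this example.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: 26-letter English alphabet


def tabula_recta():
    """Build the square: row i is the alphabet rotated left by i places."""
    return [ALPHABET[i:] + ALPHABET[:i] for i in range(len(ALPHABET))]


def lookup(table, message_letter, key_letter):
    """Return the letter where the message row and the key column cross."""
    row = ALPHABET.index(message_letter.upper())
    column = ALPHABET.index(key_letter.upper())
    return table[row][column]


table = tabula_recta()
print(table[0])  # the row starting with 'A'
print(table[1])  # the row starting with 'B'
print(lookup(table, "t", "U"))  # message letter t, key letter U
```

Because every row is a rotation of the alphabet, the same result can be obtained by adding the two letter positions modulo 26, which is how the cipher is usually programmed.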

Method
The autokey cipher, as used by members of the American Cryptogram Association, starts with a relatively short keyword, the primer, and appends the message to it. For example, if the keyword is QUEENLY and the message is attack at dawn, then the key would be QUEENLYATTACKATDAWN. Only the first 12 key letters, one for each plaintext letter, are actually used.

Plaintext:  attackatdawn
Key:        QUEENLYATTACKATDAWN
Ciphertext: QNXEPVYTWTWP

The ciphertext message would thus be "QNXEPVYTWTWP".
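The encryption step can be sketched in Python as follows. The sketch assumes the message contains only the letters A to Z plus spaces, and it replaces the tabula recta lookup with the equivalent addition modulo 26; the function name `autokey_encrypt` is made up for illustration.

```python
def autokey_encrypt(message, primer):
    """Encrypt with a text-autokey: the key is the primer followed by the message.

    Assumption: only the letters A-Z matter; anything else is dropped.
    """
    letters = [c for c in message.upper() if "A" <= c <= "Z"]
    key = list(primer.upper()) + letters  # longer than needed; zip() truncates it
    ciphertext = []
    for p, k in zip(letters, key):
        # Adding letter positions mod 26 matches the tabula recta lookup.
        ciphertext.append(chr((ord(p) - 65 + ord(k) - 65) % 26 + 65))
    return "".join(ciphertext)


print(autokey_encrypt("attack at dawn", "QUEENLY"))  # QNXEPVYTWTWP
```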

To decrypt the message, the recipient would start by writing down the agreed-upon keyword.

QNXEPVYTWTWP
QUEENLY

The first letter of the key, Q, would then be taken, and that row would be found in a tabula recta. That row would be searched for the first letter of the ciphertext, also Q in this case, and the letter at the top of that column, A, would be retrieved. That letter would then be added to the end of the key:

QNXEPVYTWTWP
QUEENLYA
a

Then, since the next letter in the key is U and the next letter in the ciphertext is N, the U row is looked across to find the N to retrieve T:

QNXEPVYTWTWP
QUEENLYAT
at

That continues until the entire key is reconstructed, when the primer can be removed from the start.
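The letter-by-letter procedure above could be written in Python roughly as follows. This is a sketch under the assumption that the ciphertext and primer contain only the letters A to Z and that the primer is not empty; `autokey_decrypt` is a hypothetical name.

```python
def autokey_decrypt(ciphertext, primer):
    """Decrypt a text-autokey message, rebuilding the key as letters are recovered.

    Assumption: uppercase A-Z input and a non-empty primer.
    """
    key = list(primer.upper())
    plaintext = []
    for i, c in enumerate(ciphertext.upper()):
        # Subtracting the key letter mod 26 reverses the tabula recta lookup.
        p = chr((ord(c) - ord(key[i])) % 26 + 65)
        plaintext.append(p)
        key.append(p)  # each recovered letter extends the key
    return "".join(plaintext).lower()


print(autokey_decrypt("QNXEPVYTWTWP", "QUEENLY"))  # attackatdawn
```

The key list always stays ahead of the ciphertext position, because it starts with the primer and grows by one letter for every letter decrypted.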

With Vigenère's autokey cipher, a single mistake in encryption renders the rest of the message unintelligible.
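The effect can be illustrated with a small Python experiment. It assumes the mistake is a single wrong ciphertext letter and uses a one-letter primer as in Vigenère's version; the helper names, the primer Q, and the position of the error are arbitrary choices for this sketch.

```python
def encrypt(message, primer):
    """Text-autokey encryption of an uppercase A-Z message."""
    key = primer + message
    return "".join(chr((ord(p) + ord(k) - 130) % 26 + 65)
                   for p, k in zip(message, key))


def decrypt(ciphertext, primer):
    """Text-autokey decryption; recovered letters are appended to the key."""
    key = list(primer)
    out = []
    for i, c in enumerate(ciphertext):
        p = chr((ord(c) - ord(key[i])) % 26 + 65)
        out.append(p)
        key.append(p)
    return "".join(out)


message = "ATTACKATDAWN"
ciphertext = encrypt(message, "Q")  # single-letter primer

# Simulate one mistake: shift the fourth ciphertext letter by one place.
wrong = chr((ord(ciphertext[3]) - 65 + 1) % 26 + 65)
corrupted = ciphertext[:3] + wrong + ciphertext[4:]

print(decrypt(ciphertext, "Q"))
print(decrypt(corrupted, "Q"))
```

In this setup every letter from the mistake onward decrypts incorrectly, because each wrong plaintext letter becomes the next key letter.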

Cryptanalysis
Autokey ciphers are somewhat more secure than polyalphabetic ciphers that use fixed keys since the key does not repeat within a single message. Therefore, methods like the Kasiski examination or index of coincidence analysis will not work on the ciphertext, unlike for similar ciphers that use a single repeated key.

A crucial weakness of the system, however, is that the plaintext is part of the key. That means that the key will likely contain common words at various points. The key can be attacked by using a dictionary of common words, bigrams, trigrams, etc., and by attempting to decrypt the message with each candidate word moved through the key until potentially readable text appears.

Consider an example message, meet at the fountain, encrypted with the primer keyword KILT. To start, the autokey would be constructed by placing the primer at the front of the message:

plaintext: meetatthefountain
primer:    KILT
autokey:   KILTMEETATTHEFOUN

The message is then encrypted by using the key and the substitution alphabets, here a tabula recta:

plaintext:  meetatthefountain
key:        KILTMEETATTHEFOUN
ciphertext: WMPMMXXAEYHBRYOCA

The attacker receives only the ciphertext and can attack the text by selecting a word that is likely to appear in the plaintext. In this example, the attacker selects the word THE as a potential part of the original message and then attempts to decode it by placing THE at every possible location in the key:

ciphertext: WMP MMX XAE YHB RYO CA
key:        THE THE THE THE THE ..
plaintext:  dfl tft eta fax yrk ..

ciphertext: W MPM MXX AEY HBR YOC A
key:        . THE THE THE THE THE .
plaintext:  . tii tqt hxu oun fhy .

ciphertext: WM PMM XXA EYH BRY OCA
key:        .. THE THE THE THE THE
plaintext:  .. wfi eqw lrd iku vvw
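Trying the guessed word at every position is easy to automate. The following Python sketch assumes uppercase A to Z input; the function name `drag_crib` is invented for this illustration.

```python
def drag_crib(ciphertext, crib):
    """Try the crib as the key at every offset and return the decrypted fragments.

    Assumption: ciphertext and crib are uppercase A-Z strings.
    """
    results = []
    for offset in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[offset:offset + len(crib)]
        fragment = "".join(chr((ord(c) - ord(k)) % 26 + 97)
                           for c, k in zip(window, crib))
        results.append((offset, fragment))
    return results


for offset, fragment in drag_crib("WMPMMXXAEYHBRYOCA", "THE"):
    print(offset, fragment)
```

Each offset corresponds to one three-letter group in the tables above; for example, offset 0 gives dfl and offset 10 gives oun.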

In each case, the resulting plaintext appears almost random because the key is not aligned for most of the ciphertext. However, examining the results can suggest locations where the key is properly aligned. In those cases, the resulting decrypted text is potentially part of a word. In this example, it is highly unlikely that dfl is the start of the original plaintext, and so it is equally unlikely that the first three letters of the key are THE. Examining the results, a number of fragments that are possibly words can be seen and others can be eliminated. Then, the plaintext fragments can be sorted in their order of likelihood:

unlikely ←------------------→ promising
eqw dfl tqt ... ... ... ... eta oun fax

A correct plaintext fragment is also going to appear in the key, shifted right by the length of the keyword. Similarly, the guessed key fragment also appears in the plaintext shifted left. Thus, by guessing keyword lengths (probably between 3 and 12), more plaintext and key can be revealed.

Trying that with oun, possibly after wasting some time with the others, results in the following:

shift by 4:
ciphertext: WMPMMXXAEYHBRYOCA
key:        ......ETA.THE.OUN
plaintext:  ......the.oun.ain

shift by 5:
ciphertext: WMPMMXXAEYHBRYOCA
key:        .....EQW..THE..OU
plaintext:  .....the..oun..og

shift by 6:
ciphertext: WMPMMXXAEYHBRYOCA
key:        ....TQT...THE...O
plaintext:  ....the...oun...m
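The three trials can be reproduced with a short Python sketch. It assumes that the guessed key fragment THE sits at offset 10 (counting from 0), where it decrypts to oun, that the trial shift is at least as long as the fragment, and that all input is uppercase A to Z. The names `sub` and `try_shift` are hypothetical.

```python
def sub(a, b):
    """Return the uppercase letter for (a - b) mod 26; both inputs are uppercase."""
    return chr((ord(a) - ord(b)) % 26 + 65)


def try_shift(ciphertext, crib, position, shift):
    """Place the crib in the key at `position` and extend it once in each direction.

    Assumption: shift >= len(crib), so extended letters never overwrite the crib.
    """
    n = len(ciphertext)
    key = ["."] * n
    plain = ["."] * n

    def place_key(i, key_letter):
        if 0 <= i < n:
            key[i] = key_letter
            plain[i] = sub(ciphertext[i], key_letter).lower()

    def place_plain(i, plain_letter):
        if 0 <= i < n:
            plain[i] = plain_letter
            key[i] = sub(ciphertext[i], plain_letter.upper())

    for offset, key_letter in enumerate(crib):
        place_key(position + offset, key_letter)
    for i in range(position, position + len(crib)):
        place_key(i + shift, plain[i].upper())   # plaintext reappears in the key
        place_plain(i - shift, key[i].lower())   # key letter was earlier plaintext
    return "".join(key), "".join(plain)


for shift in (4, 5, 6):
    key, plain = try_shift("WMPMMXXAEYHBRYOCA", "THE", 10, shift)
    print(shift, key, plain)
```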

A shift of 4 can be seen to look good (both of the others have unlikely Qs), and so the revealed ETA can be shifted back by 4 into the plaintext:

ciphertext: WMPMMXXAEYHBRYOCA
key:        ..LTM.ETA.THE.OUN
plaintext:  ..eta.the.oun.ain

A lot can be worked with now. The keyword is probably 4 characters long, and some of the message is visible:

m.eta.the.oun.ain

Because each plaintext letter reappears in the key 4 positions later, every guess at a gap gets immediate feedback: a correct guess produces more plausible plaintext further along, and an incorrect one produces gibberish. The gaps can quickly be filled in:

meetatthefountain

The ease of cryptanalysis is caused by the feedback from the relationship between plaintext and key. A three-character guess reveals six more characters (three on each side), which then reveal further characters, creating a cascade effect. That allows incorrect guesses to be ruled out quickly.
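The cascade can be made concrete with a Python sketch that keeps applying the plaintext-key relationship until nothing new is learned. It assumes the keyword length (4) and the position of the guessed key fragment THE (offset 10, counting from 0) are already known, and that input is uppercase A to Z; the function name `cascade` is invented for this example.

```python
def cascade(ciphertext, crib, position, shift):
    """Propagate a guessed key fragment through the message until a fixed point.

    Assumptions: `shift` is the keyword length and the crib is correctly placed.
    """
    n = len(ciphertext)
    key = [None] * n
    plain = [None] * n
    for offset, letter in enumerate(crib):
        key[position + offset] = letter

    changed = True
    while changed:
        changed = False
        for i in range(n):
            # A known key letter decrypts the ciphertext letter at the same position.
            if key[i] and not plain[i]:
                plain[i] = chr((ord(ciphertext[i]) - ord(key[i])) % 26 + 65)
                changed = True
            # A known plaintext letter reveals the key letter at the same position.
            if plain[i] and not key[i]:
                key[i] = chr((ord(ciphertext[i]) - ord(plain[i])) % 26 + 65)
                changed = True
            # Plaintext letter i is reused as key letter i + shift.
            if plain[i] and i + shift < n and not key[i + shift]:
                key[i + shift] = plain[i]
                changed = True
            # Key letter i (past the primer) is plaintext letter i - shift.
            if key[i] and i - shift >= 0 and not plain[i - shift]:
                plain[i - shift] = key[i]
                changed = True

    key_text = "".join(k or "." for k in key)
    plain_text = "".join(p.lower() if p else "." for p in plain)
    return key_text, plain_text


print(cascade("WMPMMXXAEYHBRYOCA", "THE", 10, 4))
```

Starting from three guessed letters, the loop recovers 13 of the 17 plaintext letters without any further guessing, leaving only the gaps that belong to the one unknown primer letter.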