Cryptanalysis

Reconstruction of the appearance of a cyclometer, a device used to break the encryption of the Enigma machine, based on sketches in Marian Rejewski's memoirs.

Cryptanalysis (from the Greek kryptós, "hidden", and analýein, "to analyze") refers to the process of analyzing information systems in order to understand hidden aspects of the systems.[1] Cryptanalysis is used to breach cryptographic security systems and gain access to the contents of encrypted messages, even if the cryptographic key is unknown.

In addition to mathematical analysis of cryptographic algorithms, cryptanalysis includes the study of side-channel attacks that do not target weaknesses in the cryptographic algorithms themselves, but instead exploit weaknesses in their implementation.

Although the goal has remained the same, the methods and techniques of cryptanalysis have changed drastically through the history of cryptography, adapting to increasing cryptographic complexity. They range from the pen-and-paper methods of the past, through machines like the British Bombes and Colossus computers at Bletchley Park in World War II, to the mathematically advanced computerized schemes of the present. Methods for breaking modern cryptosystems often involve solving carefully constructed problems in pure mathematics, the best-known being integer factorization.

Overview

In encryption, confidential information (called the "plaintext") is sent securely to a recipient: the sender first converts it into an unreadable form ("ciphertext") using an encryption algorithm. The ciphertext is sent through an insecure channel to the recipient, who decrypts it by applying the inverse decryption algorithm, recovering the plaintext. To decrypt the ciphertext, the recipient requires secret knowledge from the sender, usually a string of letters, numbers, or bits, called a cryptographic key. The idea is that even if an unauthorized person gains access to the ciphertext during transmission, they cannot convert it back to plaintext without the secret key.

Encryption has been used throughout history to send important military, diplomatic and commercial messages, and today is very widely used in computer networking to protect email and internet communication.

The goal of cryptanalysis is for a third party, a cryptanalyst, to gain as much information as possible about the original message (the "plaintext") by attempting to "break" the encryption: reading the ciphertext and, ideally, learning the secret key so that future messages can also be decrypted and read.[2] A mathematical technique for doing this is called a cryptographic attack. Cryptographic attacks can be characterized in a number of ways:

Amount of information available to the attacker

Cryptanalytical attacks can be classified by the type of information the attacker has available. As a basic starting point, it is normally assumed for the purposes of analysis that the general algorithm is known; this is Shannon's maxim, "the enemy knows the system",[3] which is in turn equivalent to Kerckhoffs's principle.[4] This is a reasonable assumption in practice: throughout history, there are countless examples of secret algorithms falling into wider knowledge, variously through espionage, betrayal and reverse engineering. (On occasion, ciphers have also been broken through pure deduction, for example the German Lorenz cipher, the Japanese Purple code, and a variety of classical schemes.)[5] The main classes of attack are:

  • Ciphertext-only: the cryptanalyst has access only to a collection of ciphertexts or codetexts.
  • Known-plaintext: the attacker has a set of ciphertexts to which they know the corresponding plaintext.
  • Chosen-plaintext (chosen-ciphertext): the attacker can obtain the ciphertexts (plaintexts) corresponding to an arbitrary set of plaintexts (ciphertexts) of their own choosing.
  • Adaptive chosen-plaintext: like a chosen-plaintext attack, except the attacker can choose subsequent plaintexts based on information learned from previous encryptions, similarly to the adaptive chosen-ciphertext attack.
  • Related-key attack: like a chosen-plaintext attack, except the attacker can obtain ciphertexts encrypted under two different keys. The keys are unknown, but the relationship between them is known; for example, two keys that differ in one bit.

Computational resources required

Attacks can also be characterised by the resources they require. Those resources include:[6]

  • Time – the number of computation steps (e.g., test encryptions) which must be performed.
  • Memory – the amount of storage required to perform the attack.
  • Data – the quantity and type of plaintexts and ciphertexts required for a particular approach.

It is sometimes difficult to predict these quantities precisely, especially when the attack is not practical to actually implement for testing. But academic cryptanalysts tend to provide at least the estimated order of magnitude of their attacks' difficulty, saying, for example, "SHA-1 collisions now 2^52."[7]

Bruce Schneier notes that even computationally impractical attacks can be considered breaks: "Breaking a cipher simply means finding a weakness in the cipher that can be exploited with a complexity less than brute force. Never mind that brute-force might require 2^128 encryptions; an attack requiring 2^110 encryptions would be considered a break...simply put, a break can just be a certificational weakness: evidence that the cipher does not perform as advertised."[8]
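To put the figures in Schneier's example in perspective, the following sketch compares the two work factors. The helper name years_to_run and the rate of 10^12 trial encryptions per second are assumptions made purely for illustration; they do not come from the source.

```python
# Compare attack work factors expressed as powers of two.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_run(log2_operations, ops_per_second=10**12):
    """Years needed for 2**log2_operations steps at an assumed (hypothetical) rate."""
    return 2**log2_operations / ops_per_second / SECONDS_PER_YEAR

brute_force_bits = 128   # exhaustive search over a 128-bit key space
attack_bits = 110        # the shortcut attack in Schneier's example

# How many times cheaper the attack is than brute force.
speedup = 2 ** (brute_force_bits - attack_bits)
print(f"speed-up over brute force: 2^{brute_force_bits - attack_bits} = {speedup}")
print(f"attack still needs about {years_to_run(attack_bits):.1e} years")
```

Under these assumptions the attack is 2^18 times cheaper than exhaustive search yet remains far out of reach in practice, which is the sense in which a break can be purely "certificational".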

Partial breaks

The results of cryptanalysis can also vary in usefulness. Cryptographer Lars Knudsen (1998) classified various types of attack on block ciphers according to the amount and quality of secret information that was discovered:

  • Total break – the attacker deduces the secret key.
  • Global deduction – the attacker discovers a functionally equivalent algorithm for encryption and decryption, but without learning the key.
  • Instance (local) deduction – the attacker discovers additional plaintexts (or ciphertexts) not previously known.
  • Information deduction – the attacker gains some Shannon information about plaintexts (or ciphertexts) not previously known.
  • Distinguishing algorithm – the attacker can distinguish the cipher from a random permutation.

Academic attacks are often against weakened versions of a cryptosystem, such as a block cipher or hash function with some rounds removed. Many, but not all, attacks become exponentially more difficult to execute as rounds are added to a cryptosystem,[9] so it's possible for the full cryptosystem to be strong even though reduced-round variants are weak. Nonetheless, partial breaks that come close to breaking the original cryptosystem may mean that a full break will follow; the successful attacks on DES, MD5, and SHA-1 were all preceded by attacks on weakened versions.

In academic cryptography, a weakness or a break in a scheme is usually defined quite conservatively: it might require impractical amounts of time, memory, or known plaintexts. It also might require the attacker be able to do things many real-world attackers can't: for example, the attacker may need to choose particular plaintexts to be encrypted or even to ask for plaintexts to be encrypted using several keys related to the secret key. Furthermore, it might only reveal a small amount of information, enough to prove the cryptosystem imperfect but too little to be useful to real-world attackers. Finally, an attack might only apply to a weakened version of cryptographic tools, like a reduced-round block cipher, as a step towards breaking the full system.[8]

History

Cryptanalysis has coevolved with cryptography, and the contest can be traced through the history of cryptography: new ciphers are designed to replace old broken designs, and new cryptanalytic techniques are invented to crack the improved schemes. In practice, they are viewed as two sides of the same coin: secure cryptography requires design against possible cryptanalysis.[citation needed]

Classical ciphers

First page of Al-Kindi's 9th century Manuscript on Deciphering Cryptographic Messages

Although the actual word "cryptanalysis" is relatively recent (it was coined by William Friedman in 1920), methods for breaking codes and ciphers are much older. David Kahn notes in The Codebreakers that Arab scholars were the first people to systematically document cryptanalytic methods.[10]

The first known recorded explanation of cryptanalysis was given by Al-Kindi (c. 801–873, also known as "Alkindus" in Europe), a 9th-century Arab polymath,[11][12] in Risalah fi Istikhraj al-Mu'amma (A Manuscript on Deciphering Cryptographic Messages). This treatise contains the first description of the method of frequency analysis.[13] Al-Kindi is thus regarded as the first codebreaker in history.[14] His breakthrough work was influenced by Al-Khalil (717–786), who wrote the Book of Cryptographic Messages, which contains the first use of permutations and combinations to list all possible Arabic words with and without vowels.[15]

Frequency analysis is the basic tool for breaking most classical ciphers. In natural languages, certain letters of the alphabet appear more often than others; in English, "E" is likely to be the most common letter in any sample of plaintext. Similarly, the digraph "TH" is the most likely pair of letters in English, and so on. Frequency analysis relies on a cipher failing to hide these statistics. For example, in a simple substitution cipher (where each letter is simply replaced with another), the most frequent letter in the ciphertext would be a likely candidate for "E". Frequency analysis of such a cipher is therefore relatively easy, provided that the ciphertext is long enough to give a reasonably representative count of the letters of the alphabet that it contains.[16]
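The idea can be sketched in a few lines of Python. This is a minimal illustration, not a general solver: it assumes a Caesar cipher (a restricted kind of simple substitution in which every letter is shifted by the same amount) and assumes that the most frequent ciphertext letter stands for "E". The function names and the sample message are invented for the example.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def caesar_shift(text, shift):
    """Shift each letter A-Z by `shift` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def guess_shift(ciphertext):
    """Guess the shift by assuming the most frequent ciphertext letter is 'E'."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    most_common_letter, _ = counts.most_common(1)[0]
    return (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26

# Hypothetical sample message in which "E" dominates, as in typical English.
plaintext = "MEET ME NEAR THE GREEN TREE WHERE THE RIVER BENDS"
ciphertext = caesar_shift(plaintext, 3)

shift = guess_shift(ciphertext)
recovered = caesar_shift(ciphertext, -shift)
print(shift, recovered)
```

On very short or unusual texts the most frequent letter may not be "E", in which case the guess fails; this is why the length of the ciphertext matters.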

Al-Kindi's invention of the frequency analysis technique for breaking monoalphabetic substitution ciphers[17][18] was the most significant cryptanalytic advance until World War II. Al-Kindi's Risalah fi Istikhraj al-Mu'amma described the first cryptanalytic techniques, including some for polyalphabetic ciphers, cipher classification, Arabic phonetics and syntax, and, most importantly, gave the first descriptions of frequency analysis.[19] He also covered methods of encipherment, cryptanalysis of certain encipherments, and statistical analysis of letters and letter combinations in Arabic.[20][13] An important contribution of Ibn Adlan (1187–1268) was on sample size for use of frequency analysis.[15]

In Europe, Italian scholar Giambattista della Porta (1535–1615) was the author of a seminal work on cryptanalysis, De Furtivis Literarum Notis.[21]

Successful cryptanalysis has undoubtedly influenced history; the ability to read the presumed-secret thoughts and plans of others can be a decisive advantage. For example, in England in 1587, Mary, Queen of Scots was tried and executed for treason as a result of her involvement in three plots to assassinate Elizabeth I of England. The plans came to light after her coded correspondence with fellow conspirators was deciphered by Thomas Phelippes.

In Europe during the 15th and 16th centuries, the idea of a polyalphabetic substitution cipher was developed, among others by the French diplomat Blaise de Vigenère (1523–96).[22] For some three centuries, the Vigenère cipher, which uses a repeating key to select different encryption alphabets in rotation, was considered to be completely secure (le chiffre indéchiffrable—"the indecipherable cipher"). Nevertheless, Charles Babbage (1791–1871) and later, independently, Friedrich Kasiski (1805–81) succeeded in breaking this cipher.[23] During World War I, inventors in several countries developed rotor cipher machines such as Arthur Scherbius' Enigma, in an attempt to minimise the repetition that had been exploited to break the Vigenère system.[24]
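The repetition that a repeating key introduces can be made concrete with a small sketch. The code below encrypts a message with a Vigenère cipher and then measures the distances between repeated three-letter groups in the ciphertext, the idea behind what is usually called Kasiski examination. The function names, the key "ABCD" and the sample message are illustrative choices, not taken from the source.

```python
from functools import reduce
from math import gcd

A = ord("A")

def vigenere_encrypt(plaintext, key):
    """Encrypt uppercase letters A-Z (no spaces) with a repeating key."""
    return "".join(
        chr((ord(p) - A + ord(key[i % len(key)]) - A) % 26 + A)
        for i, p in enumerate(plaintext)
    )

def repeat_distances(ciphertext, length=3):
    """Distances between successive occurrences of each repeated n-gram."""
    last_seen = {}
    distances = []
    for i in range(len(ciphertext) - length + 1):
        gram = ciphertext[i:i + length]
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i
    return distances

ciphertext = vigenere_encrypt("CRYPTOISSHORTFORCRYPTOGRAPHY", "ABCD")
distances = repeat_distances(ciphertext)

# The key length is expected to divide the common divisor of the distances.
key_length_multiple = reduce(gcd, distances)
print(ciphertext, distances, key_length_multiple)
```

Here the repeated word "CRYPTO" falls on the same key letters both times, so its ciphertext repeats at a distance that is a multiple of the key length. Once a candidate key length is known, each key position can be attacked separately with ordinary frequency analysis.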

Ciphers from World War I and World War II

The decrypted Zimmermann Telegram.

In World War I, the breaking of the Zimmermann Telegram was instrumental in bringing the United States into the war. In World War II, the Allies benefitted enormously from their joint success in the cryptanalysis of German ciphers, including the Enigma machine and the Lorenz cipher, and of Japanese ciphers, particularly 'Purple' and JN-25. 'Ultra' intelligence has been credited with everything from shortening the European war by up to two years to determining the eventual result. The war in the Pacific was similarly helped by 'Magic' intelligence.[25]

Cryptanalysis of enemy messages played a significant part in the Allied victory in World War II. F. W. Winterbotham quoted the western Supreme Allied Commander, Dwight D. Eisenhower, at the war's end as describing Ultra intelligence as having been "decisive" to Allied victory.[26] Sir Harry Hinsley, official historian of British Intelligence in World War II, made a similar assessment about Ultra, saying that it shortened the war "by not less than two years and probably by four years"; moreover, he said that in the absence of Ultra, it is uncertain how the war would have ended.[27]

In practice, frequency analysis relies as much on linguistic knowledge as it does on statistics, but as ciphers became more complex, mathematics became more important in cryptanalysis. This change was particularly evident before and during World War II, when efforts to crack Axis ciphers required new levels of mathematical sophistication. Moreover, automation was first applied to cryptanalysis in that era, with the Polish Bomba device, the British Bombe, the use of punched card equipment, and the Colossus computers, the first electronic digital computers to be controlled by a program.[28][29]

Indicator

With reciprocal machine ciphers such as the Lorenz cipher and the Enigma machine used by Nazi Germany during World War II, each message had its own key. Usually, the transmitting operator informed the receiving operator of this message key by transmitting some plaintext and/or ciphertext before the enciphered message. This is termed the indicator, as it indicates to the receiving operator how to set his machine to decipher the message.[30]

Poorly designed and implemented indicator systems allowed first Polish cryptographers[31] and then the British cryptographers at Bletchley Park[32] to break the Enigma cipher system. Similar poor indicator systems allowed the British to identify depths that led to the diagnosis of the Lorenz SZ40/42 cipher system, and the comprehensive breaking of its messages without the cryptanalysts seeing the cipher machine.[33]

Depth

Sending two or more messages with the same key is an insecure process. To a cryptanalyst the messages are then said to be "in depth."[34][35] This may be detected when messages carry the same indicator, by which the sending operator informs the receiving operator about the key generator's initial settings for the message.[36]

Generally, the cryptanalyst may benefit from lining up identical enciphering operations among a set of messages. For example, the Vernam cipher enciphers by bit-for-bit combining plaintext with a long key using the "exclusive or" operator, which is also known as "modulo-2 addition" (symbolized by ⊕ ):

Plaintext ⊕ Key = Ciphertext

Deciphering combines the same key bits with the ciphertext to reconstruct the plaintext:

Ciphertext ⊕ Key = Plaintext

(In modulo-2 arithmetic, addition is the same as subtraction.) When two such ciphertexts are aligned in depth, combining them eliminates the common key, leaving just a combination of the two plaintexts:

Ciphertext1 ⊕ Ciphertext2 = Plaintext1 ⊕ Plaintext2

The individual plaintexts can then be worked out linguistically by trying probable words (or phrases), also known as "cribs," at various locations; a correct guess, when combined with the merged plaintext stream, produces intelligible text from the other plaintext component:

(Plaintext1 ⊕ Plaintext2) ⊕ Plaintext1 = Plaintext2

The recovered fragment of the second plaintext can often be extended in one or both directions, and the extra characters can be combined with the merged plaintext stream to extend the first plaintext. Working back and forth between the two plaintexts, using the intelligibility criterion to check guesses, the analyst may recover much or all of the original plaintexts. (With only two plaintexts in depth, the analyst may not know which one corresponds to which ciphertext, but in practice this is not a large problem.) When a recovered plaintext is then combined with its ciphertext, the key is revealed:

Plaintext1 ⊕ Ciphertext1 = Key

Knowledge of a key then allows the analyst to read other messages encrypted with the same key, and knowledge of a set of related keys may allow cryptanalysts to diagnose the system used for constructing them.[33]
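The relations above can be checked with a short sketch. It is a toy demonstration only: the key bytes, the two messages and the crib are arbitrary values chosen for the example, and the helper xor_bytes is a hypothetical name.

```python
def xor_bytes(a, b):
    """Bitwise XOR of two byte strings, truncated to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

# Arbitrary 10-byte example key, reused for two messages (the operator's mistake).
key = bytes.fromhex("8f3a5cd2e4197b60a1c8")
plaintext1 = b"ATTACKDAWN"
plaintext2 = b"RETREATNOW"

ciphertext1 = xor_bytes(plaintext1, key)
ciphertext2 = xor_bytes(plaintext2, key)

# Combining the two ciphertexts cancels the common key.
merged = xor_bytes(ciphertext1, ciphertext2)
assert merged == xor_bytes(plaintext1, plaintext2)

# A correct crib for the start of message 1 exposes the start of message 2.
crib = b"ATTACK"
fragment = xor_bytes(merged, crib)   # result is as long as the crib
print(fragment)

# Once a whole plaintext is known, combining it with its ciphertext gives the key.
recovered_key = xor_bytes(plaintext1, ciphertext1)
print(recovered_key == key)
```

In a real attack the analyst does not know where a crib belongs, so it is tried at every offset and kept only where the resulting fragment reads as plausible text.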

Development of modern cryptography

Governments have long recognized the potential benefits of cryptanalysis for intelligence, both military and diplomatic, and established dedicated organizations devoted to breaking the codes and ciphers of other nations, for example, GCHQ and the NSA, organizations which are still very active today.

The Bombe replicated the action of several Enigma machines wired together. Each of the rapidly rotating drums, pictured above in a Bletchley Park museum mockup, simulated the action of an Enigma rotor.

Even though computation was used to great effect in the cryptanalysis of the Lorenz cipher and other systems during World War II, it also made possible new methods of cryptography orders of magnitude more complex than ever before. Taken as a whole, modern cryptography has become much more impervious to cryptanalysis than the pen-and-paper systems of the past, and now seems to have the upper hand against pure cryptanalysis.[citation needed] The historian David Kahn notes:[37]

Many are the cryptosystems offered by the hundreds of commercial vendors today that cannot be broken by any known methods of cryptanalysis. Indeed, in such systems even a chosen plaintext attack, in which a selected plaintext is matched against its ciphertext, cannot yield the key that unlock[s] other messages. In a sense, then, cryptanalysis is dead. But that is not the end of the story. Cryptanalysis may be dead, but there is – to mix my metaphors – more than one way to skin a cat.

Kahn goes on to mention increased opportunities for interception, bugging, side channel attacks, and quantum computers as replacements for the traditional means of cryptanalysis. In 2010, former NSA technical director Brian Snow said that both academic and government cryptographers are "moving very slowly forward in a mature field."[38]

However, any postmortems for cryptanalysis may be premature. While the effectiveness of cryptanalytic methods employed by intelligence agencies remains unknown, many serious attacks against both academic and practical cryptographic primitives have been published in the modern era of computer cryptography.[citation needed]

Thus, while the best modern ciphers may be far more resistant to cryptanalysis than the Enigma, cryptanalysis and the broader field of information security remain quite active.[39]

Symmetric ciphers

Asymmetric ciphers

Asymmetric cryptography (or public-key cryptography) is cryptography that relies on using two (mathematically related) keys; one private, and one public. Such ciphers invariably rely on "hard" mathematical problems as the basis of their security, so an obvious point of attack is to develop methods for solving the problem. The security of two-key cryptography depends on mathematical questions in a way that single-key cryptography generally does not, and conversely links cryptanalysis to wider mathematical research in a new way.[citation needed]

Asymmetric schemes are designed around the (conjectured) difficulty of solving various mathematical problems. If an improved algorithm can be found to solve the problem, then the system is weakened. For example, the security of the Diffie–Hellman key exchange scheme depends on the difficulty of calculating the discrete logarithm. In 1983, Don Coppersmith found a faster way to find discrete logarithms (in certain groups), thereby requiring cryptographers to use larger groups (or different types of groups). RSA's security depends (in part) upon the difficulty of integer factorization; a breakthrough in factoring would impact the security of RSA.[40]
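The dependence of Diffie–Hellman on the discrete logarithm can be illustrated with a toy example. The sketch below uses the generic baby-step giant-step method, not Coppersmith's algorithm, and deliberately tiny parameters; the prime 2039, the base 7 and the secret exponents are arbitrary values chosen for the example, and pow with a negative exponent assumes Python 3.8 or later.

```python
from math import isqrt

def discrete_log(g, h, p):
    """Baby-step giant-step: return x with pow(g, x, p) == h, or None."""
    m = isqrt(p - 1) + 1
    baby = {pow(g, j, p): j for j in range(m)}   # table of g^j for j < m
    factor = pow(g, -m, p)                       # g^(-m) mod p
    gamma = h
    for i in range(m):
        if gamma in baby:                        # h * g^(-i*m) == g^j
            return i * m + baby[gamma]
        gamma = gamma * factor % p
    return None

p, g = 2039, 7                       # toy parameters, far too small for real use
alice_secret, bob_secret = 1234, 567
alice_public = pow(g, alice_secret, p)
bob_public = pow(g, bob_secret, p)

# An eavesdropper who sees only the public values solves for an exponent.
x = discrete_log(g, alice_public, p)
eavesdropper_key = pow(bob_public, x, p)
shared_key = pow(bob_public, alice_secret, p)
print(eavesdropper_key == shared_key)
```

The method needs roughly the square root of the group size in time and memory, so it is hopeless against groups of realistic size; faster algorithms for particular groups are what force parameters to grow.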

In 1980, one could factor a difficult 50-digit number at an expense of 10^12 elementary computer operations. By 1984 the state of the art in factoring algorithms had advanced to a point where a 75-digit number could be factored in 10^12 operations. Advances in computing technology also meant that the operations could be performed much faster. Moore's law predicts that computer speeds will continue to increase. Factoring techniques may continue to improve as well, but will most likely depend on mathematical insight and creativity, neither of which has ever been successfully predictable. 150-digit numbers of the kind once used in RSA have been factored. The effort was greater than above, but was not unreasonable on fast modern computers. By the start of the 21st century, 150-digit numbers were no longer considered a large enough key size for RSA. Numbers with several hundred digits were still considered too hard to factor in 2005, though methods will probably continue to improve over time, requiring key size to keep pace or other methods such as elliptic curve cryptography to be used.[citation needed]
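Why factoring matters for RSA can be shown with a toy key. In this sketch the modulus is small enough for plain trial division, which stands in for the far more sophisticated algorithms used on real key sizes; the primes 1009 and 1013, the exponent 17 and the message value are arbitrary illustrative choices, and pow with exponent -1 assumes Python 3.8 or later.

```python
def trial_division(n):
    """Return the smallest prime factor of an odd composite n (toy sizes only)."""
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f
        f += 2
    return n

# Key generation by the legitimate owner (toy-sized primes).
p, q = 1009, 1013
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent

message = 424242                     # any value below n
ciphertext = pow(message, e, n)

# The attacker knows only (n, e) and the ciphertext.
p_found = trial_division(n)
q_found = n // p_found
d_found = pow(e, -1, (p_found - 1) * (q_found - 1))
recovered = pow(ciphertext, d_found, n)
print(p_found, q_found, recovered)
```

Once the factors are known, the private exponent follows immediately, which is why the size of numbers that can be factored sets a lower bound on safe RSA key sizes.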

Another distinguishing feature of asymmetric schemes is that, unlike attacks on symmetric cryptosystems, any cryptanalysis has the opportunity to make use of knowledge gained from the public key.[41]

Attacking cryptographic hash systems

Side-channel attacks

Quantum computing applications for cryptanalysis

Quantum computers, which are still in the early phases of research, have potential use in cryptanalysis. For example, Shor's algorithm could factor large numbers in polynomial time, in effect breaking some commonly used forms of public-key encryption.[42]

By using Grover's algorithm on a quantum computer, brute-force key search can be made quadratically faster. However, this could be countered by doubling the key length.[43]
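As a worked illustration of the quadratic speed-up, consider a key space of N = 2^n keys, for which Grover's algorithm needs on the order of the square root of N evaluations. This sketch ignores constant factors and the cost of each quantum operation, and the key sizes are chosen only as examples:

```latex
\[
\sqrt{2^{n}} = 2^{n/2}, \qquad
\sqrt{2^{128}} = 2^{64}, \qquad
\sqrt{2^{256}} = 2^{128}.
\]
```

Under these simplifications, a 256-bit key would face roughly the same search effort from an attacker using Grover's algorithm as a 128-bit key does from a classical brute-force search.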

See also

  • Economics of security
  • Global surveillance – Mass surveillance across national borders
  • Information assurance – Multi-disciplinary methods for decision support systems security, a term for information security often used in government
  • Information security – Protecting information by mitigating risk, the overarching goal of most cryptography
  • National Cipher Challenge
  • Security engineering – Process of incorporating security controls into an information system, the design of applications and protocols
  • Security vulnerability – Exploitable weakness in a computer system; vulnerabilities can include cryptographic or other flaws
  • Topics in cryptography – Overview of and topical guide to cryptography
  • Zendian Problem – An exercise in communication intelligence

Historic cryptanalysts

References

Citations

  1. ^ "Cryptanalysis/Signals Analysis". Nsa.gov. 2009-01-15. Retrieved 2013-04-15.
  2. ^ Dooley, John F. (2018). History of Cryptography and Cryptanalysis: Codes, Ciphers, and Their Algorithms. History of Computing. Cham: Springer International Publishing. doi:10.1007/978-3-319-90443-6. ISBN 978-3-319-90442-9. S2CID 18050046.
  3. ^ Shannon, Claude (4 October 1949). "Communication Theory of Secrecy Systems". Bell System Technical Journal. 28 (4): 662. doi:10.1002/j.1538-7305.1949.tb00928.x. Retrieved 20 June 2014.
  4. ^ Kahn, David (1996), The Codebreakers: the story of secret writing (second ed.), Scribners, p. 235
  5. ^ Schmeh, Klaus (2003). Cryptography and public key infrastructure on the Internet. John Wiley & Sons. p. 45. ISBN 978-0-470-84745-9.
  6. ^ Hellman, M. (July 1980). "A cryptanalytic time-memory trade-off" (PDF). IEEE Transactions on Information Theory. 26 (4): 401–406. doi:10.1109/tit.1980.1056220. ISSN 0018-9448. S2CID 552536. Archived (PDF) from the original on 2022-10-10.
  7. ^ McDonald, Cameron; Hawkes, Philip; Pieprzyk, Josef, SHA-1 collisions now 2^52 (PDF), retrieved 4 April 2012
  8. ^ a b Schneier 2000
  9. ^ For an example of an attack that cannot be prevented by additional rounds, see slide attack.
  10. ^ Kahn, David (1996). The Codebreakers: The Comprehensive History of Secret Communication from Ancient Times to the Internet. Simon and Schuster. ISBN 9781439103555.
  11. ^ Al-Jubouri, I. M. N. (February 22, 2004). History of Islamic Philosophy: With View of Greek Philosophy and Early History of Islam. Authors On Line Ltd. ISBN 9780755210114 – via Google Books.
  12. ^ Leaman, Oliver (July 16, 2015). The Biographical Encyclopedia of Islamic Philosophy. Bloomsbury Publishing. ISBN 9781472569455 – via Google Books.
  13. ^ a b Ibrahim A. Al-Kadi (April 1992), "The origins of cryptology: The Arab contributions", Cryptologia 16 (2): 97–126
  14. ^ Sahinaslan, Ender; Sahinaslan, Onder (2 April 2019). "Cryptographic methods and development stages used throughout history". AIP Conference Proceedings. 2086 (1): 030033. Bibcode:2019AIPC.2086c0033S. doi:10.1063/1.5095118. ISSN 0094-243X. Al-Kindi is considered the first code breaker
  15. ^ a b Broemeling, Lyle D. (1 November 2011). "An Account of Early Statistical Inference in Arab Cryptology". The American Statistician. 65 (4): 255–257. doi:10.1198/tas.2011.10191. S2CID 123537702.
  16. ^ Singh 1999, p. 17
  17. ^ Leaman, Oliver (16 July 2015). The Biographical Encyclopedia of Islamic Philosophy. Bloomsbury Publishing. ISBN 9781472569455. Retrieved 19 March 2018 – via Google Books.
  18. ^ Al-Jubouri, I. M. N. (19 March 2018). History of Islamic Philosophy: With View of Greek Philosophy and Early History of Islam. Authors On Line Ltd. ISBN 9780755210114. Retrieved 19 March 2018 – via Google Books.
  19. ^ Simon Singh, The Code Book, pp. 14–20
  20. ^ "Al-Kindi, Cryptgraphy, Codebreaking and Ciphers". Retrieved 12 January 2007.
  21. ^ "Crypto History". Archived from the original on August 28, 2008.
  22. ^ Singh 1999, pp. 45–51
  23. ^ Singh 1999, pp. 63–78
  24. ^ Singh 1999, p. 116
  25. ^ Smith 2000, p. 4
  26. ^ Winterbotham 2000, p. 229.
  27. ^ Hinsley 1993.
  28. ^ Copeland 2006, p. 1
  29. ^ Singh 1999, p. 244
  30. ^ Churchhouse 2002, pp. 33, 34
  31. ^ Budiansky 2000, pp. 97–99
  32. ^ Calvocoressi 2001, p. 66
  33. ^ a b Tutte 1998
  34. ^ Churchhouse 2002, p. 34
  35. ^ The Bletchley Park 1944 Cryptographic Dictionary defined a depth as
    1. A series of code messages reciphered with the same, or the same part of a, reciphering key especially when written under one another so that all the groups (usually one in each message) that are reciphered with the same group of the subtractor lie under each other and form a 'column'.
    (b) two or more messages in a transposition cipher that are of the same length and have been enciphered on the same key;
    (c) two or more messages in a machine or similar cipher that have been enciphered on the same machine-setting or on the same key.
    2. be in depth : (of messages). Stand to each other in any of the relationships described above.
    The Bletchley Park 1944 Cryptographic Dictionary formatted by Tony Sale (c) 2001 (PDF), p. 27
  36. ^ Churchhouse 2002, pp. 33, 86
  37. ^ David Kahn Remarks on the 50th Anniversary of the National Security Agency, November 1, 2002.
  38. ^ Tim Greene, Network World, Former NSA tech chief: I don't trust the cloud Archived 2010-03-08 at the Wayback Machine. Retrieved March 14, 2010.
  39. ^ "An Overview of Cryptography". www.garykessler.net. Retrieved 2019-06-03.
  40. ^ https://pages.cs.wisc.edu/~cs812-1/coppersmith.pdf
  41. ^ Stallings, William (2010). Cryptography and Network Security: Principles and Practice. Prentice Hall. ISBN 978-0136097044.
  42. ^ "Shor's Algorithm – Breaking RSA Encryption". AMS Grad Blog. 2014-04-30. Retrieved 2017-01-17.
  43. ^ Daniel J. Bernstein (2010-03-03). "Grover vs. McEliece" (PDF). Archived (PDF) from the original on 2022-10-10.

Sources

Further reading

External links