Talk:Massey-Omura cryptosystem

New stuff
I have some small points about whether S=1 is a good key, which are near the end, but first let me try the Gordian Knot approach to the entire project.

Right now, someone reading this article would still assume that Massey and Omura invented the 3-pass approach. I don't think that is correct; I believe Shamir had this idea first. The protocol is described in Alan Konheim's book Cryptography: A Primer which was published in 1981, before the first filing of the Massey-Omura patent disclosure in Switzerland(?) in Feb 1982. The Konheim book is cited in the Massey-Omura disclosure. Therefore Shamir should be given credit for 2 separate ideas, the 3-pass protocol, and the embodiment using powers modulo a large prime.

I propose a complete reorganization of this subject. The main article would be entitled Three-Pass Protocol, or possibly Three-Pass Protocol (Cryptography). This main article would first discuss the 3-pass concept in an abstract way, without specifying any particular encryption or decryption functions.

After introducing the main idea, there would be two sections describing first the Shamir algorithm using powers of integers modulo a prime, and then the Massey-Omura protocol using powers in a Galois field. When newer methods become known, additional sections can describe them.
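To make the abstract idea concrete, here is a minimal sketch in Python of the Shamir embodiment (powers modulo a prime); the prime, exponents, and names are illustrative choices of mine, not taken from either patent:

```python
# Sketch of the three-pass protocol, Shamir embodiment: powers modulo a
# prime p. Each party's encrypt exponent must be invertible modulo p - 1
# so the matching decrypt exponent undoes it. Toy parameters, not secure advice.
from math import gcd

p = 2**127 - 1                      # a Mersenne prime

def keypair(e, p):
    assert gcd(e, p - 1) == 1       # e must be invertible mod p - 1
    return e, pow(e, -1, p - 1)     # (encrypt, decrypt) exponent pair

M = 123456789                       # the "message", an integer in [2, p - 2]
eA, dA = keypair(65537, p)          # sender's private pair
eB, dB = keypair(2**61 - 1, p)      # receiver's private pair

y1 = pow(M, eA, p)                  # pass 1: sender -> receiver
y2 = pow(y1, eB, p)                 # pass 2: receiver -> sender
y3 = pow(y2, dA, p)                 # pass 3: sender removes own layer
assert pow(y3, dB, p) == M          # receiver removes own layer, recovers M
```

The Massey-Omura embodiment is structurally identical, with exponentiation carried out in GF(2^m) instead of modulo a prime.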

The next section could contain anything you want to say about the relative speed and security of the two methods. This can be expanded later if more methods appear.

The last section could talk about the lack of authentication, which is common to all 3-pass methods.

The existing Massey-Omura article would be deleted, and replaced by a link redirecting the reader to the new, more complete article. Much of the text can be transferred intact. Likewise there would be an article called Shamir Three-Pass Protocol which would redirect the reader to the omnibus article. Both methods should get equal treatment.

As far as their relative merit, I think we can at least say that shifting is always faster than multiplying for large numbers of equal size. You obviously have up-to-date information on how big the primes must be, and how big the GF must be, to meet current standards of security. So either you can write those sections yourself, or you can supply me with accurate info and I can write it.

I think we can stick with the abstraction that "a message" is raised to a power, and skip the nitty details of dividing messages into blocks, or compressing messages for speed. Contestcen (talk) 05:56, 5 December 2007 (UTC)


 * Describing Shamir's three-pass protocol and the Massey-Omura protocol in the same article is ok. The protocol is already described in a generic way that fits both variants. The title Three-Pass Protocol will lead to confusion because the name three-pass protocol is used in the literature for many more schemes than just those two protocols. E.g., three-pass protocol is often used to denote identification or mutual authentication protocols that happen to require 3 rounds. Btw, Galois field is equivalent to finite field, thus both Shamir and Massey-Omura are using Galois fields. Shamir just uses prime-order fields and Massey-Omura the binary fields. As you suggest, there is no need to describe any kind of message padding. After all, that's what the no original research rule WP:OR suggests. The idea behind wikipedia is to reflect known facts and not to fill in for non-existing standards and such. 85.2.80.168 (talk) 18:08, 5 December 2007 (UTC)

(Minor point) Padding and blocking are related, but they are not the same thing. If you are using, say, powers modulo a 127-bit prime, then you are limited to transmitting units of 126 bits (or 126 + a fraction if you want to get fancy) at a time. Dividing the message into 126-bit units is blocking. What you add in the last unit (or distribute throughout the message in more elaborate schemes) is padding. If you transmit a message-length indicator, then you can have blocking without padding.
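The blocking idea can be sketched in a few lines of Python; the helper names and the length-indicator convention below are hypothetical illustrations, and a real implementation would pick its unit size to match the field:

```python
# Toy sketch of "blocking" as described above: split a message into 126-bit
# integer units for use modulo a 127-bit prime, and use a message-length
# indicator instead of padding. Hypothetical helpers, not from any standard.
def to_blocks(data: bytes, bits: int = 126):
    n = int.from_bytes(data, "big")
    count = (max(n.bit_length(), 1) + bits - 1) // bits   # ceil division
    return [(n >> (bits * i)) & ((1 << bits) - 1) for i in reversed(range(count))]

def from_blocks(blocks, length: int, bits: int = 126):
    n = 0
    for b in blocks:
        n = (n << bits) | b
    return n.to_bytes(length, "big")    # the length indicator restores exact size

msg = b"CHERRY PIE RECIPE FOLLOWS"
blocks = to_blocks(msg)
assert all(b < (1 << 126) for b in blocks)      # each unit fits below the prime
assert from_blocks(blocks, len(msg)) == msg     # blocking without padding
```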

I believe these terms are long-established. I would certainly not discuss any specific methods in this article. Padding with nulls has been in use in cryptography for hundreds of years, and would not represent any sort of original research. On the other hand, padding to avoid having message blocks which are 0 or 1 might introduce something new.

(Major point) I don't have a better name. It would be too restrictive to call it the Shamir Three-Pass Protocol, even if Shamir invented it, because the abstract protocol also encompasses the Massey-Omura algorithm. The name Three-Pass Message Protocol is more descriptive, but I have never seen that name used anywhere. I don't think Wiki users would be likely to search on that specific phrase. The Wiki search engine is finicky about getting the exact wording and spelling. (That's also a reason not to include the name Shamir in the title. It's a better strategy to have a separate article entitled Shamir Three-Pass Protocol which redirects the reader to this article.)

There is no Wiki article currently titled Three-Pass Protocol, so my feeling is that we should grab this title, and anyone who comes along later and wants another Wiki article for some other 3-pass algorithm would have to qualify their title, say Three-Pass Protocol (authentication). It would be appropriate to say within each section that the original algorithm is called the Shamir Three-Pass Protocol, and the newer one is called the Massey-Omura Cryptosystem, assuming those are the most common names.

BTW, can you supply a date when the Shamir protocol was first invented? We can use the date of the Swiss patent filing for the Massey-Omura system, but presumably the Shamir system predates the Konheim book. Contestcen (talk) 20:30, 5 December 2007 (UTC)


 * Most modern cryptosystems use a hybrid approach, where first a key is exchanged, and then this key is used to encrypt messages of any size with a block or stream cipher. A hybrid scheme has the advantage that only the key is sent three times and that the message is only sent once. BTW, Massey and Omura propose such a mode in their patent. Hence, I'd be surprised if anyone would actually use 'blocking' in the way you discuss it. Also I fear that you underestimate the difficulties that arise in selecting an appropriate padding. There are a number of research papers discussing the subject at length, but without seeing an authentication first, it is hard to tell what kind of padding would be appropriate. Using a null-padding would not only be original research, it would most likely be insecure too. A hundred years ago active attacks, such as the man-in-the-middle attack, were not considered a serious threat.
 * As explained above 'Three-Pass protocol' would be inappropriate. E.g. just a search through the 'Handbook of applied cryptography' gives more than 20 references to the term 'Three-Pass protocol', which have nothing to do with Shamir's protocol or the Massey Omura cryptosystem. Since wikipedia is an encyclopedia a term should be used in the same way as it is used in the literature and not as you propose with a different meaning. Shamir's protocol is frequently called Shamir's no-key protocol. Unfortunately, I don't have any good reference for defining the term no-key protocol. Another potential solution might be to have a short article for Shamir's protocol with a note that more details are covered in the Massey Omura article. 85.2.16.74 (talk) 15:58, 8 December 2007 (UTC)

Sometimes you just have to take the best of the available options, even if none of them is ideal. So far Three-pass protocol is the best name I have seen. I understand there are several different algorithms that are called three-pass protocols. Luckily Wikipedia already has a mechanism for disambiguation, so if anyone in the future wants another article called Three-pass protocol that mechanism will kick in automatically.

The need for hybrid systems is largely due to the fact that raising a number to a large power is so slow. Faster 3-pass algorithms might not require this.

The idea that using nulls for padding would be original research is ludicrous. The first transcontinental telegraph was completed in 1861. The Confederacy used the telegraph for military messages during the Civil War, and the convention was to transmit encrypted messages in blocks of 5 characters. Nulls were used to make the message length a multiple of 5. Union cryptographers were able to exploit these nulls to solve messages when the Confederate signalmen made the error of putting all of the nulls at the end. Contestcen (talk) 01:58, 10 December 2007 (UTC)


 * The question was whether proposing a padding scheme for the Massey-Omura cryptosystem would be original research, not whether null padding is known. So your argument is not relevant here. Other reasons for using hybrid schemes are bandwidth (as discussed above) or the possibility of doing multicasting. Anyway, wikipedia isn't the right place to speculate about future systems. I've searched through several books, none of which contained a definition of Three-pass protocol the way you want to define it. Do you have any sources? 85.2.91.160 (talk) 18:24, 10 December 2007 (UTC)

Actually, I said blocking and you were the one who said padding. I was trying to explain that I really meant blocking and not padding, which I understand to be a different concept. I never suggested saying anything at all about padding in the Wiki article, much less proposing any sort of new padding scheme. So you seem to be arguing about something I never said.

But let's get back to the main problem, namely what title to use for the article which will discuss the three-pass message protocol in an abstract way. I still feel that Three-Pass Protocol is the best title. I think that is the phrase that Konheim used in his 1981 book. The term is not a hapax legomenon; it has been in use ever since. Unless you can propose a better title, that is what I will use.

BTW, I tried a Google search on the phrase Three-Pass Protocol and the first item that showed up was http://www.quadibloc.com/crypto/pk0504.htm which seems to use the phrase in the same way I understand it. However, the author (John J. G. Savard) attributes Shamir's prime-modulus method to Massey and Omura. It was precisely to clear up such misattribution that I originally revised the Massey-Omura Wiki article. Contestcen (talk) 07:00, 12 December 2007 (UTC)


 * First, I apologize for turning your blocking into padding. Obviously we have different ideas on how to use the protocol in a real application.
 * Your reference does not support the definition that you gave on the page Three-pass protocol. It calls the scheme the Shamir three-pass protocol. It does not define 'three-pass protocol' as restricted to protocols for exchanging messages without a key exchange. But let's move the discussion to the three-pass protocol page that you already started. 85.2.50.189 (talk) 08:59, 13 December 2007 (UTC)

Old stuff
You asked me to supply references. I inserted the appropriate references into the Wiki links, but apparently you erased my last version of the speed claim without looking at them. Please let me know in this space if you see this note. Then we can begin a dialogue to resolve our differences. Just erasing my contributions does not help. Contestcen (talk) 05:52, 30 November 2007 (UTC)
 * Nope, you did not provide us with a verifiable source that HP did produce a chip implementing the Massey-Omura cryptosystem. Neither the handbook of cryptography nor the wikipedia page of HP that you linked to mentions such a chip. Nor can I find any evidence myself. There are too many open questions. If they produced such a chip, did they indeed produce a chip implementing the cryptosystem or did they produce a more general purpose chip for arithmetic in GF(2^m)? Did they produce sufficiently many chips to make this notable? Wikipedia is not the place to post rumors. You have to back up your claims with some evidence. 85.2.20.162 11:16, 1 December 2007 (UTC)

The two Wiki references that I supplied in the most-recently deleted paragraph were to back up my claims concerning the speed of the two algorithms. They had nothing to do with the chip. For multiplying numbers of the size we are considering here, presumably from about 127 bits to 521 bits (2^521-1 is the next Mersenne prime after 2^127-1), Toom-3 is not practical, and Karatsuba gives only a moderate improvement.
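For readers unfamiliar with the trick being referenced: Karatsuba replaces one full-size multiply by three half-size multiplies. A toy recursive version in Python (the cutoff is an arbitrary illustrative choice) looks like:

```python
# Toy recursive Karatsuba multiplication: xy is computed from three
# half-size products a, b, c instead of four. The gains over schoolbook
# multiplication are modest at a few hundred bits, as noted above.
def karatsuba(x, y, cutoff=64):
    if x.bit_length() <= cutoff or y.bit_length() <= cutoff:
        return x * y                          # base case: direct multiply
    m = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> m, x & ((1 << m) - 1)       # split x into high/low halves
    yh, yl = y >> m, y & ((1 << m) - 1)
    a = karatsuba(xh, yh, cutoff)             # high * high
    b = karatsuba(xl, yl, cutoff)             # low * low
    c = karatsuba(xh + xl, yh + yl, cutoff)   # (sum) * (sum)
    # c - a - b equals the two cross products xh*yl + xl*yh
    return (a << (2 * m)) + ((c - a - b) << m) + b

for x, y in [(2**521 - 1, 2**127 - 1), (12345678901234567890, 98765432109876543210)]:
    assert karatsuba(x, y) == x * y
```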

For the exponentiation, you can never get as much as a 2-fold improvement by using any of the chained-addition techniques. In practice, for a randomly chosen m-bit exponent, the number of squarings is m-1, and the number of additional multiplications tends to be between m/(log2 m) and m/(2 log2 m) assuming a window size of log2 m, and not counting the multiplications used in the pre-computation. So the total number of multiplications will be circa (1+1/(log2 m))m instead of 2m. Some implementers, especially in hardware, may choose to forego this improvement in favor of simplicity, in which case the number of multiplications will, indeed, be 2m. In fact, the first time I implemented the Shamir protocol I did exactly that.

The optimal window size is something I have on my to-do list to investigate, but I lean towards limiting the window size to a small fixed maximum such as 7 bits or perhaps 10 bits, since this means I can use a fixed-size table.

I did some experimenting with window size over this weekend, and I found that I could get no more than a 10% speedup. The speedup barely changed as I varied the window size from 2 bits up to 7 bits. Making the window size vary depending on the size of the exponent did no better than fixing the window size at 3 bits.
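A rough sketch of the fixed-window (2^k-ary) method under discussion, with an explicit multiplication counter; the window size and the counting conventions (squarings counted, precomputation not, matching the discussion above) are illustrative:

```python
# Fixed-window modular exponentiation with a multiplication counter.
# Precomputation multiplies are not counted, matching the convention above.
def window_pow(base, exp, mod, k=4):
    table = [1]                               # precompute base^0 .. base^(2^k - 1)
    for _ in range((1 << k) - 1):
        table.append(table[-1] * base % mod)
    result, muls = 1, 0
    nwin = (exp.bit_length() + k - 1) // k    # number of k-bit windows
    for i in reversed(range(nwin)):
        for _ in range(k):                    # k squarings per window
            result = result * result % mod
            muls += 1
        digit = (exp >> (i * k)) & ((1 << k) - 1)
        if digit:                             # at most one multiply per window
            result = result * table[digit] % mod
            muls += 1
    return result, muls

p = 2**127 - 1
m = (p - 2).bit_length()                      # a 127-bit exponent
r, muls = window_pow(3, p - 2, p, k=4)
assert r == pow(3, p - 2, p)
assert muls < 2 * m                           # well below the naive 2m bound
```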

In any case I was careful to word my speed claim with the phrase up to 2m multiplications, which assured that my statement would be true.

As far as the chip is concerned, I read in an authoritative source a one-sentence statement that Hewlett-Packard had produced a hardware chip for a 126-bit version of the Massey-Omura algorithm. This has always stuck in my mind because I could never understand why they used 126 bits instead of 127. I don't think the chip is just a "rumor" but I understand that it should not go into the Wiki article until a source is found. (If you have wide contacts in the crypto community, you might try asking around.) Contestcen 19:40, 1 December 2007 (UTC)


 * First, we should consider Massey-Omura over fields such as GF(2^1279) or GF(2^2203), since discrete logarithms in GF(2^607) have been computed. Computing discrete logs in GF(p) with p prime seems significantly harder. A fair comparison should take these differences into account and should of course use optimized implementations for both types of fields (otherwise it wouldn't be a fair comparison). There are enough papers discussing addition-chain based exponentiation. So citing a paper for this should not be a problem. A 10% speedup seems a little low to me. What I can't find is a paper with a comparison between DL-based schemes in GF(2^m) and GF(p). There just isn't much interest in schemes based on the difficulty of computing DLs in GF(2^m). The fast attacks seem to be one of the reasons for avoiding them.
 * A chip for Massey-Omura over GF(2^126) would have been a toy. Since 2^126-1 has many prime factors and the largest one is only about 36 bits long, DLs can be computed fast using Pohlig-Hellman. 85.2.1.38 11:09, 4 December 2007 (UTC)
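The claim about 2^126-1 can be checked directly: 2^126-1 = (2^63-1)(2^63+1), the two halves are coprime, and their published prime factorizations (which I am taking as given here) are short lists of small primes:

```python
# Checking that 2^126 - 1 factors into small primes, so Pohlig-Hellman
# reduces a discrete log in GF(2^126)* to subgroups of at most ~36 bits.
# The factor lists are the published factorizations of 2^63 -/+ 1.
from math import log2

f63m = [7, 7, 73, 127, 337, 92737, 649657]       # prime factors of 2^63 - 1
f63p = [3, 3, 3, 19, 43, 5419, 77158673929]      # prime factors of 2^63 + 1

prod = 1
for q in f63m + f63p:
    prod *= q
assert prod == 2**126 - 1                        # the lists are complete

largest = max(f63m + f63p)
assert 36 <= log2(largest) < 37                  # largest prime factor ~36 bits
```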

Range 2 to 2^m-2
The reason that 1 should not be used as an encrypting exponent is that this would result in either the sender or the receiver transmitting M^1=M, that is, the message in clear. I realize that the patent states the allowable range as 1 to 2^m-2. This was an oversight on the part of Massey and Omura when they wrote the patent disclosure, but since we are now aware of the problem we should not perpetuate the error. The only purpose for repeating the error would be to point out that Massey and Omura made a mistake, and I feel that would be petty. Contestcen 16:31, 30 November 2007 (UTC)
 * Wow, now you are accusing Massey and Omura of making mistakes too, to support your claim. They are well aware that M can be mapped to M. In fact, if you read the patent you can find the following explanation contained in the detailed description:
 * "There is a third and still more compelling reason for preferring a choice of m such that p = 2^m - 1 is a prime. When p = 2^m - 1 is a prime, then every element B of GF(2^m), excepting 0 and 1, is a primitive element. This means that B^i takes on the value of each of these 2^m - 2 elements for i = 1, 2, ..., 2^m - 2. Thus, when and only when p = 2^m - 1 is a prime, will Y_1 = M^(E_J) take on different values for all 2^m - 2 possible values of E_J no matter what choice is made of the message M (which, it will be recalled, can be any element of GF(2^m) except 0 or 1). Similarly, when and only when p = 2^m - 1 is a prime, will Y_2 = Y_1^(E_K) take on different values for all 2^m - 2 possible values of E_K, regardless of the value of Y_1, and will Y_3 = Y_2^(D_J) take on different values for all 2^m - 2 possible values of D_J regardless of the value of Y_2. This maximization of the number of potential values of Y_1, Y_2 and Y_3 maximizes the difficulty to the cryptanalyst who attempts to determine the private message M from knowledge only of the public messages Y_1, Y_2 and Y_3."
 * Hence they are aware that (if 2^m-1 is prime) any message M can be encrypted as any element in GF(2^m) \ {0,1}, which of course includes M itself. Or, putting it differently, M^S does not give any information about M. Furthermore, an attacker can check any guess, e.g. for the key S, by raising M^R to the power S and comparing the result with M^(RS). Hence S=1 isn't any worse than any other value for a key. Also observe that the probability that an attacker can find the chosen keys is vanishingly small. Maybe one source of your confusion is that you didn't realize that the keys can be chosen differently for every message sent. Choosing per-message keys is what Massey and Omura propose in the patent. Here is a quote:
 * "As soon as User J has a private message M to send, User J calls upon his special random number generator to produce the pair of integers EJ and DJ as indicated by block 5 in FIG. 1. He then uses the first of these two integers to perform the exponentiation indicated by block 1 in FIG. 1 in order to calculate the element ..."
 * There are some variants later, but per-message keys is the main way to perform the protocol. The wikipedia article is quite ambiguous right now. But I'll change that. 85.2.20.162 12:43, 1 December 2007 (UTC)

I think I understand your point now. For simplicity, let's assume that the key is 128 bits and each message block is 16 8-bit characters aligned on byte boundaries. Now, suppose I see a 16-character message block CHERRY_PIE_RECIP. Your question is, how can I know that is cleartext transmitted with the key S=1? How do I know the real message block is not, for example, PLUTONIUM_ENRICH transmitted with the key S=35439287568408916578?

There are two reasons:

(1) Shannon estimated that English text has entropy of about 3.2 bits per character, so the number of 16-character message blocks which have the characteristics of English would be about 2^(16·3.2), or roughly 2^51. So the probability of 16 encrypted 8-bit characters being plausible English is roughly 2^51/2^128, or about 2^-77, and the chance that the block forms a meaningful message fragment appropriate to both sender and receiver is far smaller. So if I see English text, it's overwhelmingly probable that it is a message block sent in clear.
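The arithmetic in (1) can be verified mechanically (the 3.2 bits/character figure is taken from the paragraph above, not re-derived):

```python
# Verifying the entropy arithmetic: ~3.2 bits/char for English implies
# roughly 2^51 plausible 16-character blocks out of 2^128 total.
from math import log2

english_blocks = 2 ** (16 * 3.2)       # plausible 16-char English blocks
all_blocks = 2 ** 128                  # all 16-byte blocks
assert round(log2(english_blocks)) == 51
assert round(log2(english_blocks / all_blocks)) == -77
```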

(2) Even more importantly, if the sender chooses the value S=1, then the second encrypted message M^(SR) will be identical to the third message M^R. This would be an obvious tip-off to the eavesdropper that the first message has been sent in clear. Even if the message itself is totally random, such as an encryption key for some other system, the eavesdropper would realize that the first message must be M^1 = M. So I repeat, S=1 and R=1 are not suitable encryption keys.
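The tip-off in (2) is easy to demonstrate with toy parameters; the message block and the receiver's key R below are arbitrary choices of mine:

```python
# Demonstrating the S=1 tip-off: with S=1 the second and third passes
# coincide, and the first pass is the message in clear.
p = 2**127 - 1
M = int.from_bytes(b"CHERRY_PIE_RECIP", "big")   # 16-char block; top bit 0, so M < p
S, R = 1, 65537                                  # S=1 is the bad key; R is coprime to p-1
m1 = pow(M, S, p)                                # pass 1: M^S
m2 = pow(m1, R, p)                               # pass 2: M^(SR)
m3 = pow(m2, pow(S, -1, p - 1), p)               # pass 3: M^R
assert m1 == M                                   # the message went out in clear
assert m2 == m3                                  # eavesdropper sees passes 2 and 3 match
```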

In practice it's probably smart to avoid any key less than, say, 10^16 to prevent a brute-force search for short keys. You should probably avoid keys from 2^m-10^16 to 2^m-2 for the same reason, but I think this goes beyond the scope of this article.

But as far as catching people in mistakes, I once caught Erdős Pál (Paul Erdős) in a whopper. Even the brightest of the bright can make mistakes sometimes. Massey and Omura are not immune. Contestcen 19:40, 1 December 2007 (UTC)


 * To (1): Assume 2^m-1 is prime. Then if GF(2^m) \ {0,1} contains (as you assume) 2^51 elements corresponding to English messages, the probability that M^R is an English message is 2^51 times larger than the probability that it is M itself. The reason for this is as follows: if R is chosen uniformly at random in the range [1..2^m-2] and M is not equal to 0 or 1, then M^R takes on every value in GF(2^m) except 0 or 1 with equal probability.
 * To (2): Given M^R and M^(RS) it is always possible to try to guess the key S and check if the guess was correct. Removing 10^16 keys is not helpful. Firstly, an attacker would then simply do a key search over another still-valid range of keys. Secondly, various methods for computing discrete logs are faster than brute force. E.g. even the good old baby-step giant-step algorithm can check N^2 keys in time O(N). It is not possible to choose keys such that an attacker cannot get lucky. But we can choose the key range large enough that the probability of a lucky attacker can be ignored. 85.2.1.38 11:09, 4 December 2007 (UTC)
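For reference, a bare-bones baby-step giant-step sketch; the group, generator, and search bound are illustrative:

```python
# Baby-step giant-step: ~N stored baby steps plus ~N giant steps suffice
# to search N^2 candidate exponents, as invoked in the reply above.
from math import isqrt

def bsgs(g, h, p, bound):
    """Return some x < bound with g^x = h (mod p), or None."""
    n = isqrt(bound) + 1
    baby, val = {}, 1
    for j in range(n):                 # baby steps: record g^j for j < n
        baby.setdefault(val, j)
        val = val * g % p
    step = pow(g, -n, p)               # giant step: multiply by g^(-n)
    gamma = h
    for i in range(n):
        if gamma in baby:
            return i * n + baby[gamma]
        gamma = gamma * step % p
    return None

p = 2**127 - 1
secret = 123_456_789
x = bsgs(3, pow(3, secret, p), p, 10**9)
assert x is not None and pow(3, x, p) == pow(3, secret, p)
```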

(1) That assumes that all keys are equally likely. But if keys are drawn at random from 1 to 2^127-2 then that 1 in 2^77 chance of seeing English just will never occur. The only way you will see normal English is when S=1 was chosen deliberately, or by restricting the key to a vastly smaller range, such as 2^31-1, as you might get from a multiplicative congruential PRNG. I have been doing cryptography for a long time, and have seen thousands of encrypted messages. Never once have I seen a strong substitution cipher produce an English word longer than 5 letters in ciphertext.

(2) The choice S=1 is qualitatively different from any other choice. When S=1 you have M^R = M^(SR). There is no brute-force trial, no baby-step giant-step, no index calculus, no number field sieve needed to find S. It is right out there to see, before you ever start setting up any of those methods. If you are doing any kind of systematic monitoring of some enemy's communications, then you are going to be looking for bits of plaintext, because these give you clues about subject matter, about where message sequence numbers or key indicators are located, and about which encryption scheme might be employed. Such plaintext snippets can occur even in heavily encrypted traffic. "Repeating last block" or "Please send again" or "Message 1067 retracted, replace by 1104" or "Unit S3 offline, send no messages until notified." Sometimes you just need to step outside the encrypted traffic when something unusual happens and you need to address the human operator instead of the computer. If the opponent is watching for these, then he will also notice message blocks sent in clear. Contestcen (talk) 05:56, 5 December 2007 (UTC)


 * That's not a useful argument. If your pseudorandom number generator only generates 2^31 values then you have far bigger problems than seeing an unencrypted message every billionth time. Avoiding S=1 in that case doesn't help you at all. A little program could check all of those 2^31 keys in just a few milliseconds. 85.2.80.168 (talk) 17:52, 5 December 2007 (UTC)

You are missing the point. The sender does not make public the fact that he is omitting the first 10^16 keys, or that he is using a trivial 31-bit PRNG. He could be choosing keys by throwing dice, or by noticing the license plates of passing cars, but the opponent would not know this. You are getting too involved in the details of the examples, and missing the point that is being illustrated. Contestcen (talk) 20:30, 5 December 2007 (UTC)


 * Please behave. After all, it was you who started the argument by claiming that Massey and Omura erred in the patent by defining the allowable range for keys as 1 to 2^m-2. If you could finally admit that they didn't make a mistake we could close this discussion. 85.2.91.160 (talk) 18:35, 10 December 2007 (UTC)

Speed comparison
I've removed the following text, which probably needs some explaining.
 * This gives the Massey-Omura method a speed advantage over the earlier Shamir Three-Pass Protocol, which uses powers modulo a large prime, thus requiring up to 2m multiplications and 2m modulus divisions for each encryption and decryption step when the prime has m bits.

Which of the protocols is faster to implement depends on several factors. First, it depends on whether the implementation is in hardware or software. Binary fields are often preferred over GF(p) for hardware implementations, because the operations are simple. On the other hand, software implementations can take advantage of integer multiplication, hence GF(p) is frequently faster to implement than binary fields. Next, the claim that exponentiation in GF(p) requires up to 2m multiplications is wrong. E.g. window-based exponentiation can reduce that number significantly. GF(2^m) of course allows similar (and possibly even better) improvements. Then it is unclear if discrete logs in fields of similar size are equally difficult. For example [] concludes that larger sizes for GF(2^m) may be necessary. Hence, if we want to make a claim that Massey-Omura is generally faster than the Shamir three-pass protocol, we should at least back it up with some reference, so that it is at least possible to check the assumptions that lead to such a claim. 85.1.106.251 (talk) 10:02, 29 November 2007 (UTC)

Redirect
In my opinion it would be better if the section on Massey-Omura in the article Three-pass protocol contained a link to the main article (this one), because this article contains much more information and more details on the protocol. Momet (talk) 16:01, 17 December 2007 (UTC)
 * I agree that the Massey-Omura article should be kept. A number of nice properties got lost in the transition from the specific Massey-Omura protocol to a more abstract form of the protocol, such as simplicity and accuracy. 85.2.48.141 (talk) 08:36, 18 December 2007 (UTC)

The need for a combined page did not become apparent until I began writing the Wiki article for the Shamir algorithm. So much of the material needed to be duplicated. I consulted with the editor, and he agreed to combine the pages.

Without the combined article it was too difficult to explain that there were 3 separate inventions, namely the three-pass protocol itself, and then the two exponentiation algorithms that fit within the protocol. (In patent terms, these are called "embodiments.") I felt it was important to show that Shamir really made two distinct contributions, and that Massey and Omura were building on the foundation laid by Shamir.

Nonetheless, if there is any specific information from the old page that you feel should be on the new page, please let me know, and I will try to add it.

What you might not know is that there are new 3-pass algorithms, including matrix algorithms and quantum algorithms, which follow the three-pass protocol. Combining the articles sets up a place where they can all be treated, and compared if appropriate. Contestcen (talk) 19:38, 20 December 2007 (UTC)