Secure multi-party computation

Secure multi-party computation (also known as secure computation, multi-party computation (MPC) or privacy-preserving computation) is a subfield of cryptography with the goal of creating methods for parties to jointly compute a function over their inputs while keeping those inputs private. Unlike traditional cryptographic tasks, where cryptography assures security and integrity of communication or storage and the adversary is outside the system of participants (an eavesdropper on the sender and receiver), the cryptography in this model protects participants' privacy from each other.

The foundation for secure multi-party computation started in the late 1970s with the work on mental poker, cryptographic work that simulates game playing/computational tasks over distances without requiring a trusted third party. Traditionally, cryptography was about concealing content, while this new type of computation and protocol is about concealing partial information about data while computing with the data from many sources, and correctly producing outputs. By the late 1980s, Michael Ben-Or, Shafi Goldwasser and Avi Wigderson, and independently David Chaum, Claude Crépeau, and Ivan Damgård, had published papers showing "how to securely compute any function in the secure channels setting".

History
Special purpose protocols for specific tasks started in the late 1970s. Later, secure computation was formally introduced as secure two-party computation (2PC) in 1982 (for the so-called Millionaires' Problem, a specific problem which is a Boolean predicate), and in generality (for any feasible computation) in 1986 by Andrew Yao. The area is also referred to as Secure Function Evaluation (SFE). The two party case was followed by a generalization to the multi-party by Oded Goldreich, Silvio Micali, and Avi Wigderson. The computation is based on secret sharing of all the inputs and zero-knowledge proofs for a potentially malicious case, where the majority of honest players in the malicious adversary case assure that bad behavior is detected and the computation continues with the dishonest person eliminated or his input revealed. This work suggested the very basic general scheme to be followed by essentially all future multi-party protocols for secure computing. This work introduced an approach, known as GMW paradigm, for compiling a multi-party computation protocol which is secure against semi-honest adversaries to a protocol that is secure against malicious adversaries. This work was followed by the first robust secure protocol which tolerates faulty behavior graciously without revealing anyone's output via a work which invented for this purpose the often used `share of shares idea' and a protocol that allows one of the parties to hide its input unconditionally. The GMW paradigm was considered to be inefficient for years because of huge overheads that it brings to the base protocol. However, it is shown that it is possible to achieve efficient protocols, and it makes this line of research even more interesting from a practical perspective. The above results are in a model where the adversary is limited to polynomial time computations, and it observes all communications, and therefore the model is called the `computational model'. Further, the protocol of oblivious transfer was shown to be complete for these tasks. The above results established that it is possible under the above variations to achieve secure computation when the majority of users are honest.

The next question to solve was the case of secure communication channels where the point-to-point communication is not available to the adversary; in this case it was shown that solutions can be achieved with up to 1/3 of the parties being misbehaving and malicious, and the solutions apply no cryptographic tools (since secure communication is available). Adding a broadcast channel allows the system to tolerate up to 1/2 misbehaving minority, whereas connectivity constraints on the communication graph were investigated in the book Perfectly Secure Message Transmission.

Over the years, the notion of general purpose multi-party protocols became a fertile area to investigate basic and general protocol issues properties on, such as universal composability or mobile adversary as in proactive secret sharing.

Since the late 2000s, and certainly since 2010 and on, the domain of general purpose protocols has moved to deal with efficiency improvements of the protocols with practical applications in mind. Increasingly efficient protocols for MPC have been proposed, and MPC can be now considered as a practical solution to various real-life problems (especially ones that only require linear sharing of the secrets and mainly local operations on the shares with not much interactions among the parties), such as distributed voting, private bidding and auctions, sharing of signature or decryption functions and private information retrieval. The first large-scale and practical application of multi-party computation was the execution of an electronic double auction in the Danish Sugar Beet Auction, which took place in January 2008. Obviously, both theoretical notions and investigations, and applied constructions are needed (e.g., conditions for moving MPC into part of day by day business was advocated and presented in ).

In 2020, a number of companies working with secure-multiparty computation founded the MPC alliance with the goal of "accelerate awareness, acceptance, and adoption of MPC technology."

Definition and overview
In an MPC, a given number of participants, p1, p2, ..., pN, each have private data, respectively d1, d2, ..., dN. Participants want to compute the value of a public function on that private data: F(d1, d2, ..., dN) while keeping their own inputs secret.

For example, suppose we have three parties Alice, Bob and Charlie, with respective inputs x, y and z denoting their salaries. They want to find out the highest of the three salaries, without revealing to each other how much each of them makes. Mathematically, this translates to them computing:

If there were some trusted outside party (say, they had a mutual friend Tony who they knew could keep a secret), they could each tell their salary to Tony, he could compute the maximum, and tell that number to all of them. The goal of MPC is to design a protocol, where, by exchanging messages only with each other, Alice, Bob, and Charlie can still learn $F(x, y, z) = max(x, y, z)$ without revealing who makes what and without having to rely on Tony. They should learn no more by engaging in their protocol than they would learn by interacting with an incorruptible, perfectly trustworthy Tony.

In particular, all that the parties can learn is what they can learn from the output and their own input. So in the above example, if the output is $z$, then Charlie learns that his $z$ is the maximum value, whereas Alice and Bob learn (if $x$, $y$ and $z$ are distinct), that their input is not equal to the maximum, and that the maximum held is equal to $z$. The basic scenario can be easily generalised to where the parties have several inputs and outputs, and the function outputs different values to different parties.

Informally speaking, the most basic properties that a multi-party computation protocol aims to ensure are:
 * Input privacy: No information about the private data held by the parties can be inferred from the messages sent during the execution of the protocol. The only information that can be inferred about the private data is whatever could be inferred from seeing the output of the function alone.
 * Correctness: Any proper subset of adversarial colluding parties willing to share information or deviate from the instructions during the protocol execution should not be able to force honest parties to output an incorrect result. This correctness goal comes in two flavours: either the honest parties are guaranteed to compute the correct output (a "robust" protocol), or they abort if they find an error (an MPC protocol "with abort").

There are a wide range of practical applications, varying from simple tasks such as coin tossing to more complex ones like electronic auctions (e.g. compute the market clearing price), electronic voting, or privacy-preserving data mining. A classical example is the Millionaires' Problem: two millionaires want to know who is richer, in such a way that neither of them learns the net worth of the other. A solution to this situation is essentially to securely evaluate the comparison function.

Security definitions
A multi-party computation protocol must be secure to be effective. In modern cryptography, the security of a protocol is related to a security proof. The security proof is a mathematical proof where the security of a protocol is reduced to that of the security of its underlying primitives. Nevertheless, it is not always possible to formalize the cryptographic protocol security verification based on the party knowledge and the protocol correctness. For MPC protocols, the environment in which the protocol operates is associated with the Real World/Ideal World Paradigm. The parties can't be said to learn nothing, since they need to learn the output of the operation, and the output depends on the inputs. In addition, the output correctness is not guaranteed, since the correctness of the output depends on the parties’ inputs, and the inputs have to be assumed to be correct.

The Real World/Ideal World Paradigm states two worlds: (i) In the ideal-world model, there exists an incorruptible trusted party to whom each protocol participant sends its input. This trusted party computes the function on its own and sends back the appropriate output to each party. (ii) In contrast, in the real-world model, there is no trusted party and all the parties can do is to exchange messages with each other. A protocol is said to be secure if one can learn no more about each party's private inputs in the real world than one could learn in the ideal world. In the ideal world, no messages are exchanged between parties, so real-world exchanged messages cannot reveal any secret information.

The Real World/Ideal World Paradigm provides a simple abstraction of the complexities of MPC to allow the construction of an application under the pretense that the MPC protocol at its core is actually an ideal execution. If the application is secure in the ideal case, then it is also secure when a real protocol is run instead.

The security requirements on an MPC protocol are stringent. Nonetheless, in 1987 it was demonstrated that any function can be securely computed, with security for malicious adversaries and the other initial works mentioned before. Despite these publications, MPC was not designed to be efficient enough to be used in practice at that time. Unconditionally or information-theoretically secure MPC is closely related and builds on to the problem of secret sharing, and more specifically verifiable secret sharing (VSS), which many secure MPC protocols use against active adversaries.

Unlike traditional cryptographic applications, such as encryption or signature, one must assume that the adversary in an MPC protocol is one of the players engaged in the system (or controlling internal parties). That corrupted party or parties may collude in order to breach the security of the protocol. Let $$n$$ be the number of parties in the protocol and $$t$$ the number of parties who can be adversarial. The protocols and solutions for the case of $$t < n/2$$ (i.e., when an honest majority is assumed) are different from those where no such assumption is made. This latter case includes the important case of two-party computation where one of the participants may be corrupted, and the general case where an unlimited number of participants are corrupted and collude to attack the honest participants.

Adversaries faced by the different protocols can be categorized according to how willing they are to deviate from the protocol. There are essentially two types of adversaries, each giving rise to different forms of security (and each fits into different real world scenario):
 * Semi-Honest (Passive) Security: In this case, it is assumed that corrupted parties merely cooperate to gather information out of the protocol, but do not deviate from the protocol specification. This is a naive adversary model, yielding weak security in real situations. However, protocols achieving this level of security prevent inadvertent leakage of information between (otherwise collaborating) parties, and are thus useful if this is the only concern. In addition, protocols in the semi-honest model are quite efficient, and are often an important first step for achieving higher levels of security.
 * Malicious (Active) Security: In this case, the adversary may arbitrarily deviate from the protocol execution in its attempt to cheat. Protocols that achieve security in this model provide a very high security guarantee. In the case of majority of misbehaving parties: The only thing that an adversary can do in the case of dishonest majority is to cause the honest parties to "abort" having detected cheating. If the honest parties do obtain output, then they are guaranteed that it is correct. Their privacy is always preserved.

Security against active adversaries typically leads to a reduction in efficiency. Covert security is an alternative that aims to allow greater efficiency in exchange for weakening the security definition; it is applicable to situations where active adversaries are willing to cheat but only if they are not caught. For example, their reputation could be damaged, preventing future collaboration with other honest parties. Thus, protocols that are covertly secure provide mechanisms to ensure that, if some of the parties do not follow the instructions, then it will be noticed with high probability, say 75% or 90%. In a way, covert adversaries are active ones forced to act passively due to external non-cryptographic (e.g. business) concerns. This mechanism sets a bridge between both models in the hope of finding protocols which are efficient and secure enough in practice.

Like many cryptographic protocols, the security of an MPC protocol can rely on different assumptions:
 * It can be computational (i.e. based on some mathematical problem, like factoring) or unconditional, namely relying on physical unavailability of messages on channels (usually with some probability of error which can be made arbitrarily small).
 * The model might assume that participants use a synchronized network, where a message sent at a "tick" always arrives at the next "tick", or that a secure and reliable broadcast channel exists, or that a secure communication channel exists between every pair of participants where an adversary cannot read, modify or generate messages in the channel, etc.

The set of honest parties that can execute a computational task is related to the concept of access structure. Adversary structures can be static, where the adversary chooses its victims before the start of the multi-party computation, or dynamic, where it chooses its victims during the course of execution of the multi-party computation making the defense harder. An adversary structure can be defined as a threshold structure or as a more complex structure. In a threshold structure the adversary can corrupt or read the memory of a number of participants up to some threshold. Meanwhile, in a complex structure it can affect certain predefined subsets of participants, modeling different possible collusions.

Protocols
There are major differences between the protocols proposed for two party computation (2PC) and multi-party computation (MPC). Also, often for special purpose protocols of importance a specialized protocol that deviates from the generic ones has to be designed (voting, auctions, payments, etc.)

Two-party computation
The two party setting is particularly interesting, not only from an applications perspective but also because special techniques can be applied in the two party setting which do not apply in the multi-party case. Indeed, secure multi-party computation (in fact the restricted case of secure function evaluation, where only a single function is evaluated) was first presented in the two-party setting. The original work is often cited as being from one of the two papers of Yao; although the papers do not actually contain what is now known as Yao's garbled circuit protocol.

Yao's basic protocol is secure against semi-honest adversaries and is extremely efficient in terms of number of rounds, which is constant, and independent of the target function being evaluated. The function is viewed as a Boolean circuit, with inputs in binary of fixed length. A Boolean circuit is a collection of gates connected with three different types of wires: circuit-input wires, circuit-output wires and intermediate wires. Each gate receives two input wires and it has a single output wire which might be fan-out (i.e. be passed to multiple gates at the next level). Plain evaluation of the circuit is done by evaluating each gate in turn; assuming the gates have been topologically ordered. The gate is represented as a truth table such that for each possible pair of bits (those coming from the input wires' gate) the table assigns a unique output bit; which is the value of the output wire of the gate. The results of the evaluation are the bits obtained in the circuit-output wires.

Yao explained how to garble a circuit (hide its structure) so that two parties, sender and receiver, can learn the output of the circuit and nothing else. At a high level, the sender prepares the garbled circuit and sends it to the receiver, who obliviously evaluates the circuit, learning the encodings corresponding to both his and the sender's output. He then just sends back the sender's encodings, allowing the sender to compute his part of the output. The sender sends the mapping from the receivers output encodings to bits to the receiver, allowing the receiver to obtain their output.

In more detail, the garbled circuit is computed as follows. The main ingredient is a double-keyed symmetric encryption scheme. Given a gate of the circuit, each possible value of its input wires (either 0 or 1) is encoded with a random number (label). The values resulting from the evaluation of the gate at each of the four possible pair of input bits are also replaced with random labels. The garbled truth table of the gate consists of encryptions of each output label using its inputs labels as keys. The position of these four encryptions in the truth table is randomized so no information on the gate is leaked.

To correctly evaluate each garbled gate the encryption scheme has the following two properties. Firstly, the ranges of the encryption function under any two distinct keys are disjoint (with overwhelming probability). The second property says that it can be checked efficiently whether a given ciphertext has been encrypted under a given key. With these two properties the receiver, after obtaining the labels for all circuit-input wires, can evaluate each gate by first finding out which of the four ciphertexts has been encrypted with his label keys, and then decrypting to obtain the label of the output wire. This is done obliviously as all the receiver learns during the evaluation are encodings of the bits.

The sender's (i.e. circuit creators) input bits can be just sent as encodings to the evaluator; whereas the receiver's (i.e. circuit evaluators) encodings corresponding to his input bits are obtained via a 1-out-of-2 oblivious transfer (OT) protocol. A 1-out-of-2 OT protocol enables the sender possessing of two values C1 and C2 to send the one requested by the receiver (b a value in {1,2}) in such a way that the sender does not know what value has been transferred, and the receiver only learns the queried value.

If one is considering malicious adversaries, further mechanisms to ensure correct behavior of both parties need to be provided. By construction it is easy to show security for the sender if the OT protocol is already secure against malicious adversary, as all the receiver can do is to evaluate a garbled circuit that would fail to reach the circuit-output wires if he deviated from the instructions. The situation is very different on the sender's side. For example, he may send an incorrect garbled circuit that computes a function revealing the receiver's input. This would mean that privacy no longer holds, but since the circuit is garbled the receiver would not be able to detect this. However, it is possible to efficiently apply Zero-Knowledge proofs to make this protocol secure against malicious adversaries with a small overhead comparing to the semi-honest protocol.

Multi-party protocols
Most MPC protocols, as opposed to 2PC protocols and especially under the unconditional setting of private channels, make use of secret sharing. In the secret sharing based methods, the parties do not play special roles (as in Yao, of creator and evaluator). Instead, the data associated with each wire is shared amongst the parties, and a protocol is then used to evaluate each gate. The function is now defined as a "circuit" over a finite field, as opposed to the binary circuits used for Yao. Such a circuit is called an arithmetic circuit in the literature, and it consists of addition and multiplication "gates" where the values operated on are defined over a finite field.

Secret sharing allows one to distribute a secret among a number of parties by distributing shares to each party. Two types of secret sharing schemes are commonly used; Shamir secret sharing and additive secret sharing. In both cases the shares are random elements of a finite field that add up to the secret in the field; intuitively, security is achieved because any non-qualifying set of shares looks randomly distributed.

Secret sharing schemes can tolerate an adversary controlling up to t parties out of n total parties, where t varies based on the scheme, the adversary can be passive or active, and different assumptions are made on the power of the adversary. The Shamir secret sharing scheme is secure against a passive adversary when $$t < \frac{n}{2}$$ and an active adversary when $$t < \frac{n}{3}$$ while achieving information-theoretic security, meaning that even if the adversary has unbounded computational power, they cannot learn any information about the secret underlying a share. The BGW protocol, which defines how to compute addition and multiplication on secret shares, is often used to compute functions with Shamir secret shares. Additive secret sharing schemes can tolerate the adversary controlling all but one party, that is $$t < n$$, while maintaining security against a passive and active adversary with unbounded computational power. Some protocols require a setup phase, which may only be secure against a computationally bounded adversary.

A number of systems have implemented various forms of MPC with secret sharing schemes. The most popular is SPDZ, which implements MPC with additive secret shares and is secure against active adversaries.

Other protocols
In 2014 a "model of fairness in secure computation in which an adversarial party that aborts on receiving output is forced to pay a mutually predefined monetary penalty" has been described for the Bitcoin network or for fair lottery, and has been successfully implemented in Ethereum.

Practical MPC systems
Many advances have been made on 2PC and MPC systems in recent years.

Yao-based protocols
One of the main issues when working with Yao-based protocols is that the function to be securely evaluated (which could be an arbitrary program) must be represented as a circuit, usually consisting of XOR and AND gates. Since most real-world programs contain loops and complex data structures, this is a highly non-trivial task. The Fairplay system was the first tool designed to tackle this problem. Fairplay comprises two main components. The first of these is a compiler enabling users to write programs in a simple high-level language, and output these programs in a Boolean circuit representation. The second component can then garble the circuit and execute a protocol to securely evaluate the garbled circuit. As well as two-party computation based on Yao's protocol, Fairplay can also carry out multi-party protocols. This is done using the BMR protocol, which extends Yao's passively secure protocol to the active case.

In the years following the introduction of Fairplay, many improvements to Yao's basic protocol have been created, in the form of both efficiency improvements and techniques for active security. These include techniques such as the free XOR method, which allows for much simpler evaluation of XOR gates, and garbled row reduction, reducing the size of garbled tables with two inputs by 25%.

The approach that so far seems to be the most fruitful in obtaining active security comes from a combination of the garbling technique and the "cut-and-choose" paradigm. This combination seems to render more efficient constructions. To avoid the aforementioned problems with respect to dishonest behaviour, many garblings of the same circuit are sent from the constructor to the evaluator. Then around half of them (depending on the specific protocol) are opened to check consistency, and if so a vast majority of the unopened ones are correct with high probability. The output is the majority vote of all the evaluations. Here the majority output is needed. If there is disagreement on the outputs the receiver knows the sender is cheating, but he cannot complain as otherwise this would leak information on his input.

This approach for active security was initiated by Lindell and Pinkas. This technique was implemented by Pinkas et al. in 2009, This provided the first actively secure two-party evaluation of the Advanced Encryption Standard (AES) circuit, regarded as a highly complex (consisting of around 30,000 AND and XOR gates), non-trivial function (also with some potential applications), taking around 20 minutes to compute and requiring 160 circuits to obtain a $$2^{-40}$$ cheating probability.

As many circuits are evaluated, the parties (including the receiver) need to commit to their inputs to ensure that in all the iterations the same values are used. The experiments of Pinkas et al. reported show that the bottleneck of the protocol lies in the consistency checks. They had to send over the net about 6,553,600 commitments to various values to evaluate the AES circuit. In recent results the efficiency of actively secure Yao-based implementations was improved even further, requiring only 40 circuits, and a much smaller number of commitments, to obtain $$2^{-40}$$ cheating probability. The improvements come from new methodologies for performing cut-and-choose on the transmitted circuits.

More recently, there has been a focus on highly parallel implementations based on garbled circuits, designed to be run on CPUs with many cores. Kreuter, et al. describe an implementation running on 512 cores of a powerful cluster computer. Using these resources they could evaluate the 4095-bit edit distance function, whose circuit comprises almost 6 billion gates. To accomplish this they developed a custom, better optimized circuit compiler than Fairplay and several new optimizations such as pipelining, whereby transmission of the garbled circuit across the network begins while the rest of the circuit is still being generated. The time to compute AES was reduced to 1.4 seconds per block in the active case, using a 512-node cluster machine, and 115 seconds using one node. Shelat and Shen improve this, using commodity hardware, to 0.52 seconds per block. The same paper reports on a throughput of 21 blocks per second, but with a latency of 48 seconds per block.

Meanwhile, another group of researchers has investigated using consumer-grade GPUs to achieve similar levels of parallelism. They utilize oblivious transfer extensions and some other novel techniques to design their GPU-specific protocol. This approach seems to achieve comparable efficiency to the cluster computing implementation, using a similar number of cores. However, the authors only report on an implementation of the AES circuit, which has around 50,000 gates. On the other hand, the hardware required here is far more accessible, as similar devices may already be found in many people's desktop computers or games consoles. The authors obtain a timing of 2.7 seconds per AES block on a standard desktop, with a standard GPU. If they allow security to decrease to something akin to covert security, they obtain a run time of 0.30 seconds per AES block. In the passive security case there are reports of processing of circuits with 250 million gates, and at a rate of 75 million gates per second.

Implementations of secure multi-party computation data analyses
One of the primary applications of secure multi-party computation is allowing analysis of data that is held by multiple parties, or blind analysis of data by third parties without allowing the data custodian to understand the kind of data analysis being performed.