Signaling game

In game theory, a signaling game is a simple type of dynamic Bayesian game.

The essence of a signalling game is that one player takes an action, the signal, to convey information to another player, where sending the signal is more costly if they are conveying false information. A manufacturer, for example, might provide a warranty for its product in order to signal to consumers that its product is unlikely to break down. The classic example is of a worker who acquires a college degree not because it increases their skill, but because it conveys their ability to employers.

A simple signalling game would have two players, the sender and the receiver. The sender has one of two types that might be called "desirable" and "undesirable" with different payoff functions, where the receiver knows the probability of each type but not which one this particular sender has. The receiver has just one possible type. The sender moves first, choosing an action called the "signal" or "message" (though the term "message" is more often used in non-signalling "cheap talk" games where sending messages is costless). The receiver moves second, after observing the signal.

The two players receive payoffs dependent on the sender's type, the message chosen by the sender and the action chosen by the receiver.

The tension in the game is that the sender wants to persuade the receiver that they have the desirable type, and they will try to choose a signal to do that. Whether this succeeds depends on whether the undesirable type would send the same signal, and how the receiver interprets the signal.

Perfect Bayesian equilibrium
The equilibrium concept that is relevant for signaling games is the perfect Bayesian equilibrium, a refinement of Bayesian Nash equilibrium.

Nature chooses the sender to have type $$t$$ with probability $$p$$. The sender then chooses the probability with which to take signalling action $$m$$, which can be written as $$Prob(m|t)$$ for each possible $$t$$. The receiver observes the signal $$m$$ but not $$t$$, and chooses the probability with which to take response action $$a$$, which can be written as $$Prob(a|m)$$ for each possible $$m$$. The sender's payoff is $$u(a, m, t)$$ and the receiver's is $$v(a, t)$$.

A perfect Bayesian equilibrium is a combination of beliefs and strategies for each player. Both players believe that the other will follow the strategies specified in the equilibrium, as in simple Nash equilibrium, unless they observe something that has probability zero in the equilibrium. The receiver's beliefs also include a probability distribution $$b(t|m)$$ representing the probability put on the sender having type $$t$$ if the receiver observes signal $$m$$. The receiver's strategy is a choice of $$Prob(a|m)$$. The sender's strategy is a choice of $$Prob(m|t)$$. These beliefs and strategies must satisfy certain conditions:


 * Sequential rationality: each strategy should maximize a player's expected utility, given their beliefs.
 * Consistency: each belief should be updated according to the equilibrium strategies, the observed actions, and Bayes' rule on every path reached in equilibrium with positive probability. On paths of zero probability, known as "off-equilibrium paths", the beliefs must be specified but can be arbitrary.
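The consistency condition can be illustrated with a minimal Python sketch of Bayes' rule. The function name `posterior`, the dictionary layout, and the two example strategies are assumptions made for illustration and do not come from any particular library or paper.

```python
def posterior(prior, strategy, m):
    """Return the belief b(t|m) for each type t, or None off the equilibrium path.

    prior:    dict mapping type t -> prior probability of t
    strategy: dict mapping type t -> dict mapping signal -> Prob(signal | t)
    m:        the observed signal
    """
    # Joint probability that the sender has type t and sends signal m.
    joint = {t: prior[t] * strategy[t].get(m, 0.0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        # Signal m has probability zero: Bayes' rule does not pin down the
        # belief, so the equilibrium must specify it separately.
        return None
    return {t: joint[t] / total for t in prior}


# Hypothetical two-type example with equal priors.
prior = {"desirable": 0.5, "undesirable": 0.5}
pooling = {"desirable": {"m1": 1.0}, "undesirable": {"m1": 1.0}}
separating = {"desirable": {"m1": 1.0}, "undesirable": {"m2": 1.0}}
```

When both types send the same signal the returned belief equals the prior, and when the types send different signals the belief puts probability 1 on a single type. A `None` result marks an off-equilibrium signal, for which the belief has to be chosen by hand.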

The perfect Bayesian equilibria that may arise can be divided into three categories: pooling, separating, and semi-separating equilibria. A given game may have one equilibrium or several. If there are more types of senders than there are messages, the equilibrium can never be a separating equilibrium (but may be semi-separating). There are also hybrid equilibria, in which the sender randomizes between pooling and separating.
 * In a pooling equilibrium, senders of different types all choose the same signal. This means that the signal does not give any information to the receiver, so the receiver's beliefs are not updated after seeing the signal.
 * In a separating equilibrium, senders of different types always choose different signals. This means that the signal always reveals the sender's type, so the receiver's beliefs become deterministic after seeing the signal.
 * In a semi-separating equilibrium (also called partial-pooling), some types of senders choose the same message and other types choose different messages.
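For pure strategies, the three categories can be told apart by counting distinct signals. The following Python sketch assumes the sender's strategy is given as a mapping from type to signal; the function name `classify_sender_strategy` and the example type labels are hypothetical.

```python
def classify_sender_strategy(signal_by_type):
    """Classify a pure sender strategy given as a dict type -> signal."""
    n_types = len(signal_by_type)
    n_signals = len(set(signal_by_type.values()))
    if n_signals == 1:
        # Every type sends the same signal, so the signal is uninformative.
        return "pooling"
    if n_signals == n_types:
        # Every type sends its own signal, so the signal reveals the type.
        return "separating"
    # Some types share a signal while others are distinguished.
    return "semi-separating"
```

With three types and only two available signals, the count of distinct signals can never reach three, which is the pigeonhole reason a separating equilibrium is impossible when types outnumber messages.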

Reputation game
In this game, the sender and the receiver are firms. The sender is an incumbent firm and the receiver is an entrant firm. The payoffs are given by the table at the right. It is assumed that:
 * The sender can be one of two types: sane or crazy. A sane sender can send one of two messages: prey and accommodate. A crazy sender can only prey.
 * The receiver can do one of two actions: stay or exit.
 * M1>D1>P1, i.e., a sane sender prefers to be a monopoly (M1), but if it is not a monopoly, it prefers accommodating (D1) to preying (P1). The value of X1 is irrelevant since a crazy firm has only one possible action.
 * D2>0>P2, i.e., the receiver prefers staying in a market with a sane competitor (D2) to exiting the market (0), but prefers exiting to staying in a market with a crazy competitor (P2).
 * A priori, the sender is sane with probability p and crazy with probability 1-p.

We now look for perfect Bayesian equilibria. It is convenient to differentiate between separating equilibria and pooling equilibria.
 * A separating equilibrium, in our case, is one in which the sane sender always accommodates. This separates it from a crazy sender. In the second period, the receiver has complete information: their beliefs are "If accommodate then the sender is sane, otherwise the sender is crazy". Their best response is: "If accommodate then stay, if prey then exit". The sender's total payoff over the two periods when they accommodate is D1+D1, but if they deviate to prey their payoff changes to P1+M1; therefore, a necessary condition for a separating equilibrium is D1+D1≥P1+M1 (i.e., the cost of preying outweighs the gain from being a monopoly). It is possible to show that this condition is also sufficient.
 * A pooling equilibrium is one in which the sane sender always preys. In the second period, the receiver has no new information. If the sender preys, then the receiver's beliefs must equal the a priori beliefs: the sender is sane with probability p and crazy with probability 1-p. Therefore, the receiver's expected payoff from staying is p D2 + (1-p) P2, and the receiver stays if and only if this expression is positive. The sender can gain from preying only if the receiver exits. Therefore, a necessary condition for a pooling equilibrium is p D2 + (1-p) P2 ≤ 0 (intuitively, the receiver is careful and will not stay in the market if there is a risk that the sender is crazy; the sender knows this, and thus hides their true identity by always preying like a crazy sender). But this condition is not sufficient: if the receiver also exits after accommodate, then it is better for the sender to accommodate, since it is cheaper than preying. So it is necessary that the receiver stays after accommodate, and it is necessary that D1+D1<P1+M1 (i.e., the gain from being a monopoly outweighs the cost of preying). Finally, we must make sure that staying after accommodate is a best response for the receiver. For this, the receiver's beliefs must be specified after accommodate. This path has probability 0, so Bayes' rule does not apply, and we are free to choose the receiver's beliefs as, e.g., "If accommodate then the sender is sane".

Summary:
 * If preying is costly for a sane sender (D1+D1≥P1+M1), they will accommodate and there will be a unique separating PBE: the receiver will stay after accommodate and exit after prey.
 * If preying is not too costly for a sane sender (D1+D1<P1+M1), and it is harmful for the receiver (p D2 + (1-p) P2 ≤ 0), the sender will prey and there will be a unique pooling PBE: again the receiver will stay after accommodate and exit after prey. Here, the sender is willing to lose some value by preying in the first period in order to build a reputation as a predatory firm and convince the receiver to exit.
 * If preying is neither too costly for the sender nor harmful for the receiver, there will not be a PBE in pure strategies. There will be a unique PBE in mixed strategies, in which both the sender and the receiver randomize between their two actions.
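The three cases in the summary can be sketched as a small Python function. The function name `reputation_regime` and the numeric payoffs in the usage note are hypothetical; only the inequalities are taken from the text above.

```python
def reputation_regime(M1, D1, P1, D2, P2, p):
    """Return which kind of PBE the summary predicts for the given payoffs.

    M1, D1, P1: sane sender's payoffs as monopoly, when accommodating, when preying.
    D2, P2:     receiver's payoffs from staying against a sane / crazy sender.
    p:          prior probability that the sender is sane.
    """
    # Check the assumptions stated for the game before classifying.
    if not (M1 > D1 > P1):
        raise ValueError("expected M1 > D1 > P1")
    if not (D2 > 0 > P2):
        raise ValueError("expected D2 > 0 > P2")
    if not (0 < p < 1):
        raise ValueError("expected 0 < p < 1")

    if D1 + D1 >= P1 + M1:
        # Preying is too costly: the sane sender accommodates.
        return "separating"
    if p * D2 + (1 - p) * P2 <= 0:
        # Preying pays off and staying is unattractive for the receiver.
        return "pooling"
    # No pure-strategy PBE: both players randomize.
    return "mixed"
```

For instance, with the illustrative values M1=10, D1=5, P1=2, D2=3, P2=-4, the sketch reports a pooling equilibrium at p=0.5 and a mixed equilibrium at p=0.9, because a higher probability of a sane sender makes staying worthwhile for the receiver.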

Education game
Michael Spence's 1973 paper on education as a signal of ability is the start of the economic analysis of signalling. In this game, the senders are workers and the receivers are employers. The example below has two types of workers and a continuous signal level.

The players are a worker and two firms. The worker chooses an education level $$s,$$ the signal, after which the firms simultaneously offer him wages $$w_1$$ and $$w_2$$ and he accepts one or the other. The worker's type, known only to himself, is either high ability with $$a=10$$ or low ability with $$a = 0,$$ each type having probability 1/2. The high-ability worker's payoff is $$U_H= w - s$$ and the low-ability worker's is $$U_{L}= w - 2s.$$ A firm that hires the worker at wage $$w$$ has payoff $$a-w$$ and the other firm has payoff 0.

In this game, competition between the firms drives the wage to the worker's expected ability, so if no signal is possible, the result would be $$w_1=w_2 = .5(10) + .5 (0) =5.$$ This will also be the wage in a pooling equilibrium, one where both types of worker choose the same signal, so the firms are left using their prior belief of .5 for the probability that he has high ability. In a separating equilibrium, the wage will be 0 for the signal level the low type chooses and 10 for the high type's signal. There are many equilibria, both pooling and separating, depending on expectations.

In a separating equilibrium, the low type chooses $$s=0.$$ The wages will be $$w(s=0)=0$$ and $$w(s=s^*) =10$$ for some critical level $$s^*$$ that signals high ability. For the low type to choose $$s = 0$$ requires that $$U_L (s = 0) \geq U_L(s=s^*),$$ so $$0 \geq 10-2s^*$$ and we can conclude that $$s^* \geq 5.$$ For the high type to choose $$s = s^*$$ requires that $$U_H (s = s^*) \geq U_H(s=0),$$ so $$10-s^* \geq 0$$ and we can conclude that $$s^* \leq 10.$$ Thus, any value of $$s^*$$ between 5 and 10 can support an equilibrium. Perfect Bayesian equilibrium requires an out-of-equilibrium belief to be specified too, for all the other possible levels of $$s$$ besides 0 and $$s^*,$$ levels which are "impossible" in equilibrium since neither type plays them. These beliefs must be such that neither type would want to deviate from his equilibrium strategy 0 or $$s^*$$ to a different $$s.$$ A convenient belief is that $$Prob(a = High) =0$$ if $$s \neq s^*;$$ another, more realistic, belief that would support an equilibrium is $$Prob(a = High) = 0$$ if $$s < s^*$$ and $$Prob(a = High) = 1$$ if $$s \geq s^*$$. There is a continuum of equilibria, one for each possible level of $$s^*.$$ One equilibrium, for example, is
 * $$s|Low = 0, s|High= 7, w|(s=7) = 10, w|(s \neq 7)  = 0, Prob(a=High|s=7) = 1, Prob(a=High|s \neq 7) =0. $$
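The two incentive constraints of the separating equilibrium can be checked numerically. The Python sketch below is illustrative: the function name and parameter names are hypothetical, and the defaults restate the numbers used in this example (abilities 10 and 0, education costs $$s$$ and $$2s$$). It assumes the out-of-equilibrium belief $$Prob(a = High) = 0$$ for $$s \neq s^*$$, so the only deviation worth considering for either type is the other type's equilibrium signal.

```python
def supports_separating(s_star, high_wage=10.0, low_wage=0.0,
                        cost_high=1.0, cost_low=2.0):
    """Check whether the critical education level s_star supports separation."""
    # Low type: staying at s = 0 must be at least as good as mimicking s_star.
    low_type_stays = low_wage - cost_low * 0 >= high_wage - cost_low * s_star
    # High type: acquiring s_star must be at least as good as dropping to s = 0.
    high_type_signals = high_wage - cost_high * s_star >= low_wage - cost_high * 0
    return low_type_stays and high_type_signals
```

Calling `supports_separating(7)` returns `True`, matching the example equilibrium above, while values below 5 fail the low type's constraint and values above 10 fail the high type's constraint.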

In a pooling equilibrium, both types choose the same $$s.$$ One pooling equilibrium is for both types to choose $$s=0,$$ no education, with the out-of-equilibrium belief $$Prob(a=High|s>0) = .5.$$ In that case, the wage will be the expected ability of 5, and neither type of worker will deviate to a higher education level, because the firms would not treat the extra education as informative about the worker's type and the wage would stay at 5.

The most surprising result is that there are also pooling equilibria with $$s = s'>0.$$ Suppose we specify the out-of-equilibrium belief to be $$Prob(a=High|s< s') = 0.$$ Then the wage will be 5 for a worker with $$s= s',$$ but 0 for a worker with education $$s = 0.$$ The low type compares the payoffs $$U_L(s=s') = 5 - 2s'$$ to $$U_L(s=0) =0,$$ and if $$s'\leq 2.5,$$ he is willing to follow his equilibrium strategy of $$s=s'.$$ The high type will choose $$s=s'$$ a fortiori. Thus, there is another continuum of equilibria, with values of $$s'$$ in [0, 2.5].
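The pooling condition can be sketched in the same way. The Python function below is a hypothetical illustration: it assumes the belief $$Prob(a=High|s< s') = 0,$$ so that deviating downward earns a wage of 0 and the best such deviation is $$s=0,$$ and it further assumes that education above $$s'$$ earns no more than the pooled wage of 5, which the text leaves unspecified.

```python
def supports_pooling(s_prime, pooled_wage=5.0, cost_high=1.0, cost_low=2.0):
    """Check whether both types are willing to pool at education level s_prime."""
    if s_prime < 0:
        return False
    # Payoff from deviating to s = 0, where the wage is 0 and education is free.
    deviation_payoff = 0.0
    low_type_pools = pooled_wage - cost_low * s_prime >= deviation_payoff
    high_type_pools = pooled_wage - cost_high * s_prime >= deviation_payoff
    return low_type_pools and high_type_pools
```

The low type's constraint is the binding one, since his education cost is higher, so the function returns `True` exactly for levels in [0, 2.5] under these assumptions.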

In the signalling model of education, expectations are crucial. If, as in the separating equilibrium, employers expect that high-ability people will acquire a certain level of education and low-ability ones will not, we get the main insight: that if people cannot communicate their ability directly, they will acquire education even if it does not increase productivity, just to demonstrate ability. Or, in the pooling equilibrium with $$s=0,$$ if employers do not think education signals anything, we can get the outcome that nobody becomes educated. Or, in the pooling equilibrium with $$s>0,$$ everyone acquires education that is completely useless, not even showing who has high ability, out of fear that if they deviate and do not acquire education, employers will think they have low ability.

Beer-Quiche game
The Beer-Quiche game of Cho and Kreps draws on the stereotype of quiche eaters being less masculine. In this game, an individual B is considering whether to duel with another individual A. B knows that A is either a wimp or surly, but not which. B would prefer a duel if A is a wimp but not if A is surly. Player A, regardless of type, wants to avoid a duel. Before making the decision, B has the opportunity to see whether A chooses to have beer or quiche for breakfast. Both players know that wimps prefer quiche while surlies prefer beer. The point of the game is to analyze the choice of breakfast by each kind of A. This has become a standard example of a signaling game.

Applications of signaling games
Signaling games describe situations where one player has information the other player does not have. These situations of asymmetric information are very common in economics and behavioral biology.

Philosophy
The first signaling game was the Lewis signaling game, which appeared in David K. Lewis' Ph.D. dissertation (and later book) Convention. Replying to W.V.O. Quine, Lewis attempts to develop a theory of convention and meaning using signaling games. In his most extreme comments, he suggests that understanding the equilibrium properties of the appropriate signaling game captures all there is to know about meaning:


 * I have now described the character of a case of signaling without mentioning the meaning of the signals: that two lanterns meant that the redcoats were coming by sea, or whatever. But nothing important seems to have been left unsaid, so what has been said must somehow imply that the signals have their meanings.

The use of signaling games has continued in the philosophical literature. Others have used evolutionary models of signaling games to describe the emergence of language. Work on the emergence of language in simple signaling games includes models by Huttegger, Grim et al., Skyrms, and Zollman. Harms and Huttegger have attempted to extend the study to include the distinction between normative and descriptive language.

Economics
The first application of signaling games to economic problems was Michael Spence's Education game. A second application was the Reputation game.

Biology
Valuable advances have been made by applying signaling games to a number of biological questions. The most notable is Alan Grafen's (1990) handicap model of mate-attraction displays. The antlers of stags, the elaborate plumage of peacocks and birds-of-paradise, and the song of the nightingale are all such signals. Grafen's analysis of biological signaling is formally similar to the classic monograph on economic market signaling by Michael Spence. More recently, a series of papers by Getty shows that Grafen's analysis, like that of Spence, is based on the critical simplifying assumption that signalers trade off costs for benefits in an additive fashion, the way humans invest money to increase income in the same currency. This assumption that costs and benefits trade off in an additive fashion might be valid for some biological signaling systems, but is not valid for multiplicative tradeoffs, such as the survival cost – reproduction benefit tradeoff that is assumed to mediate the evolution of sexually selected signals.

Charles Godfray (1991) modeled the begging behavior of nestling birds as a signaling game. The nestlings' begging not only informs the parents that the nestling is hungry, but also attracts predators to the nest. The parents and nestlings are in conflict. The nestlings benefit if the parents work harder to feed them than the level of investment that ultimately benefits the parents. The parents are trading off investment in the current nestlings against investment in future offspring.

Pursuit-deterrent signals have been modeled as signaling games. Thomson's gazelles are known sometimes to perform a 'stot', a jump into the air of several feet with the white tail showing, when they detect a predator. Alcock and others have suggested that this action is a signal of the gazelle's speed to the predator. This action successfully distinguishes types because it would be impossible or too costly for a sick creature to perform, and hence the predator is deterred from chasing a stotting gazelle because it is obviously very agile and would prove hard to catch.

The concept of information asymmetry in molecular biology has long been apparent. Although molecules are not rational agents, simulations have shown that through replication, selection, and genetic drift, molecules can behave according to signaling game dynamics. Such models have been proposed to explain, for example, the emergence of the genetic code from an RNA and amino acid world.

Costly versus cost-free signaling
One of the major uses of signaling games both in economics and biology has been to determine under what conditions honest signaling can be an equilibrium of the game. That is, under what conditions can we expect rational people or animals subject to natural selection to reveal information about their types?

If both parties have coinciding interests, that is, they both prefer the same outcomes in all situations, then honesty is an equilibrium. (Although in most of these cases non-communicative equilibria exist as well.) However, if the parties' interests do not perfectly overlap, then the maintenance of informative signaling systems raises an important problem.

Consider a circumstance described by John Maynard Smith regarding transfer between related individuals. Suppose a signaler can be either starving or just hungry, and they can signal that fact to another individual who has food. Suppose that they would like more food regardless of their state, but that the individual with food only wants to give them the food if they are starving. While both players have identical interests when the signaler is starving, they have opposing interests when the signaler is only hungry. When they are only hungry, they have an incentive to lie about their need in order to obtain the food. And if the signaler regularly lies, then the receiver should ignore the signal and do whatever they think is best.

Determining how signaling is stable in these situations has concerned both economists and biologists, and both have independently suggested that signal cost might play a role. If sending one signal is costly, it might only be worth the cost for the starving person to signal. The analysis of when costs are necessary to sustain honesty has been a significant area of research in both these fields.
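The role of signal cost can be illustrated with a deliberately simplified Python sketch. It is not the model used in the literature: the function name, the three parameters, and the assumption that the receiver hands over the food exactly when the signal is sent are all hypothetical choices made for illustration.

```python
def honest_signaling_is_stable(benefit_starving, benefit_hungry, signal_cost):
    """Check whether only the starving signaler finds the signal worth sending.

    benefit_starving: value of the food to a starving signaler (assumed number)
    benefit_hungry:   value of the food to a merely hungry signaler (assumed number)
    signal_cost:      cost of sending the signal, identical in both states
    """
    # The starving signaler must gain at least as much as the signal costs.
    starving_signals = benefit_starving - signal_cost >= 0
    # The hungry signaler must not gain from lying about their need.
    hungry_stays_silent = benefit_hungry - signal_cost <= 0
    return starving_signals and hungry_stays_silent
```

Under these assumptions a free signal (`signal_cost=0`) cannot be honest whenever the hungry signaler values the food, whereas a cost between the two benefits makes the signal worth sending only when starving.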