Gilbert–Shannon–Reeds model

In the mathematics of shuffling playing cards, the Gilbert–Shannon–Reeds model is a probability distribution on riffle shuffle permutations. It forms the basis for a recommendation that a deck of cards should be riffled seven times in order to thoroughly randomize it. It is named after the work of Edgar Gilbert, Claude Shannon, and J. Reeds, reported in a 1955 technical report by Gilbert and in a 1981 unpublished manuscript of Reeds.

The model
A riffle shuffle permutation of a sequence of elements is obtained by partitioning the elements into two contiguous subsequences, and then arbitrarily interleaving the two subsequences. For instance, this describes many common ways of shuffling a deck of playing cards, by cutting the deck into two piles of cards that are then riffled together. The Gilbert–Shannon–Reeds model assigns a probability to each of these permutations. In this way, it describes the probability of obtaining each permutation, when a shuffle is performed at random. The model may be defined in several equivalent ways, describing alternative ways of performing this random shuffle:


 * Most similarly to the way humans shuffle cards, the Gilbert–Shannon–Reeds model describes the probabilities obtained from a certain mathematical model of randomly cutting and then riffling a deck of cards. First, the deck is cut into two packets. If there are a total of $$n$$ cards, then the probability of selecting $$k$$ cards in the first packet and $$n-k$$ in the second packet is defined as $$\tbinom{n}{k}/2^n$$. Then, one card at a time is repeatedly moved from the bottom of one of the packets to the top of the shuffled deck, such that if $$x$$ cards remain in the first packet and $$y$$ cards remain in the second packet, then the probability of choosing a card from the first packet is $$x/(x+y)$$ and the probability of choosing a card from the second packet is $$y/(x+y)$$.
 * A second, alternative description can be based on a property of the model, that it generates a permutation of the initial deck in which each card is equally likely to have come from the first or the second packet. To generate a random permutation according to this model, begin by flipping a fair coin $$n$$ times, to determine for each position of the shuffled deck whether it comes from the first packet or the second packet. Then split the deck into two packets whose sizes are the number of tails and the number of heads flipped, and use the same coin flip sequence to determine from which packet to pull each card of the shuffled deck.
 * A third alternative description is more abstract, but lends itself better to mathematical analysis. Generate a set of $$n$$ values from the uniform continuous distribution on the unit interval, and place them in sorted order. Then the doubling map $$x\mapsto 2x\pmod{1}$$ from the theory of dynamical systems maps this system of points to a permutation of the points in which the permuted ordering obeys the Gilbert–Shannon–Reeds model, and the positions of the new points are again uniformly random.
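
The first description can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `gsr_riffle`, the `rng` parameter, and the convention that index 0 of the list is the top of the deck are all assumptions made for the example.

```python
import random

def gsr_riffle(deck, rng=random):
    """Return one riffle of `deck` (index 0 = top card; a convention assumed here)."""
    n = len(deck)
    # Cut: the size k of the first packet is Binomial(n, 1/2),
    # i.e. P(k) = C(n, k) / 2^n, sampled by counting n fair coin flips.
    k = sum(rng.random() < 0.5 for _ in range(n))
    first, second = list(deck[:k]), list(deck[k:])

    # Riffle: repeatedly move the bottom card of one packet onto the top
    # of the shuffled deck, picking the first packet with probability x/(x+y).
    shuffled_bottom_up = []
    while first or second:
        x, y = len(first), len(second)
        if rng.random() < x / (x + y):
            shuffled_bottom_up.append(first.pop())
        else:
            shuffled_bottom_up.append(second.pop())

    # Cards were collected from the bottom upward; flip to top-first order.
    shuffled_bottom_up.reverse()
    return shuffled_bottom_up

if __name__ == "__main__":
    print(gsr_riffle(list(range(1, 11)), random.Random(2024)))
```

Because each packet keeps its internal order, the output is always an interleaving of the top `k` cards with the remaining `n - k` cards.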

Among all of the possible riffle shuffle permutations of a card deck, the Gilbert–Shannon–Reeds model gives almost all riffles equal probability, $$1/2^n$$, of occurring. However, there is one exception, the identity permutation, which has a greater probability $$(n+1)/2^n$$ of occurring.
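
These probabilities can be checked for small decks by exhaustively enumerating the coin-flip description: each of the $$2^n$$ flip sequences is equally likely, and the $$n+1$$ sequences consisting of a run of first-packet flips followed by a run of second-packet flips all yield the identity. The sketch below is illustrative only; the names `riffle_from_flips` and `riffle_distribution`, and the encoding of "first packet" as `0` and "second packet" as `1`, are assumptions of the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def riffle_from_flips(deck, flips):
    """Build the riffle determined by `flips` (0 = first packet, 1 = second)."""
    k = flips.count(0)                      # size of the first packet
    first, second = iter(deck[:k]), iter(deck[k:])
    return tuple(next(first) if f == 0 else next(second) for f in flips)

def riffle_distribution(n):
    """Exact probability of every riffle of n cards, from all 2^n flip sequences."""
    deck = tuple(range(n))
    counts = Counter(
        riffle_from_flips(deck, flips) for flips in product((0, 1), repeat=n)
    )
    return {perm: Fraction(c, 2 ** n) for perm, c in counts.items()}

if __name__ == "__main__":
    n = 4
    dist = riffle_distribution(n)
    identity = tuple(range(n))
    print("P(identity) =", dist[identity])
    print("other probabilities:", {p for perm, p in dist.items() if perm != identity})
```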

Inverse
The inverse permutation of a random riffle may be generated directly. To do so, start with a deck of $$n$$ cards and then repeatedly deal off the bottom card of the deck onto one of two piles, choosing randomly with equal probability which of the two piles to deal each card onto. Then, when all cards have been dealt, stack the two piles back together.
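
A minimal Python sketch of this dealing procedure follows. The function name `inverse_gsr_riffle`, the `rng` parameter, the top-first list convention, and the choice to stack the first pile on top of the second are assumptions made for illustration.

```python
import random

def inverse_gsr_riffle(deck, rng=random):
    """Return an inverse riffle of `deck` (index 0 = top card; assumed convention)."""
    piles = ([], [])
    # Deal from the bottom of the deck; each card goes onto a pile chosen
    # by a fair coin. Each list records its pile from bottom to top.
    for card in reversed(deck):
        piles[rng.randrange(2)].append(card)
    # Convert each pile to top-first order, then put pile 0 on top of pile 1.
    return piles[0][::-1] + piles[1][::-1]

if __name__ == "__main__":
    print(inverse_gsr_riffle(list(range(1, 11)), random.Random(7)))
```

Within each pile the cards keep their original relative order, so the result read from top to bottom consists of at most two increasing runs when the input deck is sorted.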

The effect of repeated riffles
Bayer & Diaconis (1992) analyzed mathematically the total variation distance between two probability distributions on permutations: the uniform distribution in which all permutations are equally likely, and the distribution generated by repeated applications of the Gilbert–Shannon–Reeds model. The total variation distance measures how similar or dissimilar two probability distributions are; it is zero only when the two distributions are identical, and attains a maximum value of one for probability distributions that never generate the same values as each other. Bayer and Diaconis reported that, for decks of $$n$$ cards shuffled $$\tfrac{3}{2}\log_2 n+\theta$$ times, where $$\theta$$ is an arbitrary constant, the total variation distance is close to one when $$\theta$$ is significantly less than zero, and close to zero when $$\theta$$ is significantly greater than zero, independently of $$n$$. In particular their calculations showed that for $$n=52$$, five riffles produce a distribution whose total variation distance from uniform is still close to one, while seven riffles give total variation distance 0.334. This result was widely reported as implying that card decks should be riffled seven times in order to thoroughly randomize them.
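
The figures for a 52-card deck can be reproduced with exact arithmetic. The sketch below relies on a closed form from Bayer and Diaconis's paper that is not restated in this article and is imported here as an assumption: after $$m$$ riffles, a permutation with $$r$$ rising sequences has probability $$\tbinom{2^m+n-r}{n}/2^{mn}$$, and the number of permutations with $$r$$ rising sequences is an Eulerian number. The function names `eulerian_row` and `tv_distance` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def eulerian_row(n):
    """Return [A(n,0), ..., A(n,n-1)], where A(n,k) counts the permutations
    of n items with k descents (equivalently, k+1 rising sequences)."""
    row = [1]  # n = 1
    for size in range(2, n + 1):
        new = [0] * size
        for k in range(size):
            # Recurrence: A(s,k) = (k+1) A(s-1,k) + (s-k) A(s-1,k-1)
            left = row[k] * (k + 1) if k < size - 1 else 0
            right = row[k - 1] * (size - k) if k >= 1 else 0
            new[k] = left + right
        row = new
    return row

def tv_distance(n, m):
    """Exact total variation distance from uniform after m riffles of n cards,
    assuming the closed form quoted in the lead-in."""
    a = 2 ** m
    uniform = Fraction(1, factorial(n))
    total = Fraction(0)
    for r, count in enumerate(eulerian_row(n), start=1):
        p = Fraction(comb(a + n - r, n), a ** n)  # probability of one such permutation
        total += count * abs(p - uniform)
    return total / 2

if __name__ == "__main__":
    for m in range(1, 11):
        print(m, f"{float(tv_distance(52, m)):.3f}")
```

For $$m=1$$ the closed form reduces to the single-riffle probabilities of the model: $$(n+1)/2^n$$ for the identity ($$r=1$$) and $$1/2^n$$ for permutations with $$r=2$$.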

Similar analyses have been performed using the Kullback–Leibler divergence, a distance between two probability distributions defined in terms of entropy; the divergence of a distribution from uniform can be interpreted as the number of bits of information that can still be recovered about the initial state of the card deck. The results are qualitatively different: rather than having a sharp threshold between random and non-random at $$\tfrac{3}{2}\log_2 n$$ shuffles, as occurs for total variation distance, the divergence decays more gradually, decreasing linearly as the number of shuffles ranges from zero to $$\log_2 n$$ (at which point the number of remaining bits of information is linear, smaller by a logarithmic factor than its initial value) and then decreasing exponentially until, after $$\tfrac{3}{2}\log_2 n$$ shuffles, only a constant number of bits of information remain.