Combinatorial number system

In mathematics, and in particular in combinatorics, the combinatorial number system of degree k (for some positive integer k), also referred to as combinadics, or the Macaulay representation of an integer, is a correspondence between natural numbers (taken to include 0) N and k-combinations. The combinations are represented as strictly decreasing sequences ck &gt; ... &gt; c2 &gt; c1 ≥ 0 where each ci corresponds to the index of a chosen element in a given k-combination. Distinct numbers correspond to distinct k-combinations, and produce them in lexicographic order. The numbers less than $$\tbinom nk$$ correspond to all k-combinations of {0, 1, ..., n − 1}. The correspondence does not depend on the size n of the set that the k-combinations are taken from, so it can be interpreted as a map from N to the k-combinations taken from N; in this view the correspondence is a bijection.

The number N corresponding to (ck, ..., c2, c1) is given by


 * $$N=\binom{c_k}k+\cdots+\binom{c_2}2+\binom{c_1}1$$.

The fact that a unique sequence corresponds to any non-negative number N was first observed by D. H. Lehmer. Indeed, a greedy algorithm finds the k-combination corresponding to N: take ck maximal with $$\tbinom{c_k}k\leq N$$, then take ck−1 maximal with $$\tbinom{c_{k-1}}{k-1}\leq N - \tbinom{c_k}k$$, and so forth. Finding the number N, using the formula above, from the k-combination (ck, ..., c2, c1) is also known as "ranking", and the opposite operation (given by the greedy algorithm) as "unranking"; the operations are known by these names in most computer algebra systems, and in computational mathematics.

The originally used term "combinatorial representation of integers" was shortened to "combinatorial number system" by Knuth, who also gives a much older reference; the term "combinadic" is introduced by James McCaffrey (without reference to previous terminology or work).

Unlike the factorial number system, the combinatorial number system of degree k is not a mixed radix system: the part $$\tbinom{c_i}i$$ of the number N represented by a "digit" ci is not obtained from it by simply multiplying by a place value.

The main application of the combinatorial number system is that it allows rapid computation of the k-combination that is at a given position in the lexicographic ordering, without having to explicitly list the k-combinations preceding it; this allows for instance random generation of k-combinations of a given set. Enumeration of k-combinations has many applications, among which are software testing, sampling, quality control, and the analysis of lottery games.

Ordering combinations
A k-combination of a set S is a subset of S with k (distinct) elements. The main purpose of the combinatorial number system is to provide a representation, each by a single number, of all $$\tbinom nk$$ possible k-combinations of a set S of n elements. Choosing, for any n, {0, 1, ..., n − 1} as such a set, it can be arranged that the representation of a given k-combination C is independent of the value of n (although n must of course be sufficiently large); in other words considering C as a subset of a larger set by increasing n will not change the number that represents C. Thus for the combinatorial number system one just considers C as a k-combination of the set N of all natural numbers, without explicitly mentioning n.

In order to ensure that the numbers representing the k-combinations of {0, 1, ..., n − 1} are less than those representing k-combinations not contained in {0, 1, ..., n − 1}, the k-combinations must be ordered in such a way that their largest elements are compared first. The most natural ordering that has this property is lexicographic ordering of the decreasing sequence of their elements. So comparing the 5-combinations C = {0,3,4,6,9} and C′ = {0,1,3,7,9}, one has that C comes before C′, since they have the same largest part 9, but the next largest part 6 of C is less than the next largest part 7 of C′; the sequences compared lexicographically are (9,6,4,3,0) and (9,7,3,1,0).

Another way to describe this ordering is view combinations as describing the k raised bits in the binary representation of a number, so that C = {c1, ..., ck} describes the number
 * $$2^{c_1}+2^{c_2}+\cdots+2^{c_k}$$

(this associates distinct numbers to all finite sets of natural numbers); then comparison of k-combinations can be done by comparing the associated binary numbers. In the example C and C′ correspond to numbers 10010110012 = 60110 and 10100010112 = 65110, which again shows that C comes before C′. This number is not however the one one wants to represent the k-combination with, since many binary numbers have a number of raised bits different from k; one wants to find the relative position of C in the ordered list of (only) k-combinations.

Place of a combination in the ordering
The number associated in the combinatorial number system of degree k to a k-combination C is the number of k-combinations strictly less than C in the given ordering. This number can be computed from C = {ck, ..., c2, c1} with ck &gt; ... &gt; c2 &gt; c1 as follows.

From the definition of the ordering it follows that for each k-combination S strictly less than C, there is a unique index i such that ci is absent from S, while ck, ..., ci+1 are present in S, and no other value larger than ci is. One can therefore group those k-combinations S according to the possible values 1, 2, ..., k of i, and count each group separately. For a given value of i one must include ck, ..., ci+1 in S, and the remaining i elements of S must be chosen from the ci non-negative integers strictly less than ci; moreover any such choice will result in a k-combinations S strictly less than C. The number of possible choices is $$\tbinom{c_i}i$$, which is therefore the number of combinations in group i; the total number of k-combinations strictly less than C then is
 * $$\binom{c_1}1+\binom{c_2}2+\cdots+\binom{c_k}k,$$

and this is the index (starting from 0) of C in the ordered list of k-combinations.

Obviously there is for every N ∈ N exactly one k-combination at index N in the list (supposing k ≥ 1, since the list is then infinite), so the above argument proves that every N can be written in exactly one way as a sum of k binomial coefficients of the given form.

Finding the k-combination for a given number
The given formula allows finding the place in the lexicographic ordering of a given k-combination immediately. The reverse process of finding the k-combination at a given place N requires somewhat more work, but is straightforward nonetheless. By the definition of the lexicographic ordering, two k-combinations that differ in their largest element ck will be ordered according to the comparison of those largest elements, from which it follows that all combinations with a fixed value of their largest element are contiguous in the list. Moreover the smallest combination with ck as the largest element is $$\tbinom{c_k}k$$, and it has ci = i − 1 for all i < k (for this combination all terms in the expression except $$\tbinom{c_k}k$$ are zero). Therefore ck is the largest number such that $$\tbinom{c_k}k\leq N$$. If k &gt; 1 the remaining elements of the k-combination form the k − 1-combination corresponding to the number $$N-\tbinom{c_k}k$$ in the combinatorial number system of degree k − 1, and can therefore be found by continuing in the same way for $$N-\tbinom{c_k}k$$ and k − 1 instead of N and  k.

Example
Suppose one wants to determine the 5-combination at position 72. The successive values of $$\tbinom n5$$ for n = 4, 5, 6, ... are 0, 1, 6, 21, 56, 126, 252, ..., of which the largest one not exceeding 72 is 56, for n = 8. Therefore c5 = 8, and the remaining elements form the 4-combination at position 72 − 56 = 16. The successive values of $$\tbinom n4$$ for n = 3, 4, 5, ... are 0, 1, 5, 15, 35, ..., of which the largest one not exceeding 16 is 15, for n = 6, so c4 = 6. Continuing similarly to search for a 3-combination at position 16 − 15 = 1 one finds c3 = 3, which uses up the final unit; this establishes $$72=\tbinom85+\tbinom64+\tbinom33$$, and the remaining values ci will be the maximal ones with $$\tbinom{c_i}i=0$$, namely. Thus we have found the 5-combination {8, 6, 3, 1, 0}.

National Lottery example
For each of the $$\binom{49}6$$ lottery combinations c1 < c2 < c3 < c4 < c5 < c6, there is a list number N between 0 and $$\binom{49}6 - 1$$ which can be found by adding


 * $$ \binom{49-c_1} 6 + \binom{49-c_2} 5 + \binom{49-c_3} 4 + \binom{49-c_4} 3 + \binom{49-c_5} 2 + \binom{49-c_6} 1. $$