Smith set

The Smith or Schwartz set, sometimes called the top cycle, is a concept from the theory of electoral systems that generalizes the Condorcet winner to cases where no such winner exists. It does so by allowing cycles of candidates to be treated jointly, as if they were a single Condorcet winner.

Named after John H. Smith, the Smith set is the smallest non-empty set of candidates in a particular election, such that each member defeats every candidate outside the set in a pairwise election. The Smith set provides one standard of optimal choice for an election outcome. Voting systems that always elect a candidate from the Smith set pass the Smith criterion.

Definition
The Smith set is formally defined as the smallest non-empty set S of candidates such that every candidate inside S is pairwise unbeaten by every candidate outside S.

Alternatively, it can be defined as the set of all candidates with a (non-strict) beatpath to any candidate who defeats them.
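
The beatpath definition can be turned into a direct, if not the fastest, computation. The following is a minimal sketch in C, not a reference implementation. The function name smith_by_beatpath, the flat row-major layout, and the doubled encoding of results (2 for a win, 1 for a tie, 0 for a loss) are assumptions made for illustration. It marks a candidate as a member when a chain of "beats or ties" steps leads from that candidate to every other candidate.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Assumed encoding: r[i*n + j] is 2 if i beats j, 1 if they tie, and 0 if
 * i loses to j (diagonal entries are ignored). On return, in_smith[i] is
 * true exactly when candidate i has a non-strict beatpath to every other
 * candidate. Returns the size of the set, or -1 if allocation fails.
 * Assumes n > 0. */
int smith_by_beatpath(const int *r, int n, bool *in_smith)
{
    bool *reach = malloc((size_t)n * (size_t)n * sizeof *reach);
    if (reach == NULL)
        return -1;

    /* A direct step i -> j is allowed when j does not beat i. */
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            reach[i * n + j] = (i == j) || (r[i * n + j] >= 1);

    /* Warshall's transitive closure: allow paths through candidate k. */
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            if (reach[i * n + k])
                for (int j = 0; j < n; j++)
                    if (reach[k * n + j])
                        reach[i * n + j] = true;

    /* A candidate is a member if it reaches every candidate. */
    int count = 0;
    for (int i = 0; i < n; i++) {
        in_smith[i] = true;
        for (int j = 0; j < n; j++)
            if (!reach[i * n + j])
                in_smith[i] = false;
        count += in_smith[i];
    }
    free(reach);
    return count;
}
```

This sketch takes time cubic in the number of candidates, so it serves mainly as a cross-check for faster methods.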

A set of candidates each of whose members pairwise defeats every candidate outside the set is known as a dominating set. Thus the Smith set is also called the smallest dominating set.
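
Whether a given set of candidates is dominating can be checked directly from the pairwise results. The sketch below is illustrative only: the name is_dominating, the membership array in_set, and the doubled encoding (2 for a win, 1 for a tie, 0 for a loss) are assumptions rather than part of any standard interface. It rejects the empty set, since the Smith set is the smallest non-empty dominating set.

```c
#include <stdbool.h>

/* Assumed encoding: r[i*n + j] is 2 if i beats j, 1 for a tie, 0 if i loses.
 * in_set[i] is true when candidate i belongs to the set being tested.
 * Returns true if the set is non-empty and every member strictly defeats
 * every candidate outside the set. */
bool is_dominating(const int *r, int n, const bool *in_set)
{
    bool nonempty = false;
    for (int i = 0; i < n; i++) {
        if (!in_set[i])
            continue;
        nonempty = true;
        for (int j = 0; j < n; j++)
            if (!in_set[j] && r[i * n + j] != 2)
                return false;   /* member i fails to defeat outsider j */
    }
    return nonempty;
}
```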

Strict top-cycle (Schwartz set)
The Schwartz set is equivalent to the Smith set, except it ignores tied votes. Formally, the Schwartz set is the set such that any candidate inside the set has a strict beatpath to any candidate who defeats them.

The Smith set can be constructed from the Schwartz set by repeatedly adding two types of candidates until no more such candidates exist outside the set:


 * candidates that are pairwise-tied with candidates in the set,
 * candidates that defeat a candidate in the set.

Note that candidates of the second type can only exist after candidates of the first type have been added.

Properties

 * The Smith set always exists and is non-empty. It is also well-defined (see next section).
 * The Smith set can have more than one candidate, either because of pairwise ties or because of cycles, such as in Condorcet's paradox.
 * The Condorcet winner, if one exists, is the sole member of the Smith set. If weak Condorcet winners exist, they are in the Smith set.
 * The Smith set is always a subset of the mutual majority-preferred set of candidates, if one exists.

Properties of dominating sets
Theorem: Dominating sets are nested; that is, of any two dominating sets in an election, one is a subset of the other.

Proof: Suppose on the contrary that there exist two dominating sets, D and E, neither of which is a subset of the other. Then there must exist candidates d ∈ D, e ∈ E such that d ∉ E and e ∉ D. But by hypothesis d defeats every candidate not in D (including e) while e defeats every candidate not in E (including d), which is a contradiction. ∎

Corollary: It follows that the Smith set is the smallest non-empty dominating set, and that it is well defined.

Theorem: If D is a dominating set, then there is some threshold θ_D such that the elements of D are precisely the candidates whose Copeland scores are at least θ_D. (A candidate's Copeland score is the number of other candidates whom he or she defeats plus half the number of other candidates with whom he or she is tied.)

Proof: Choose d as an element of D with minimum Copeland score, and identify this score with θ_D. Now suppose that some candidate e ∉ D has a Copeland score not less than θ_D. Then since d belongs to D and e doesn't, it follows that d defeats e; and in order for e's Copeland score to be at least equal to d's, there must be some third candidate f against whom e gets a better score than does d. If f ∈ D, then we have an element of D who does not defeat e, and if f ∉ D then we have a candidate outside of D whom d does not defeat, leading to a contradiction either way. ∎
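
The Copeland score used in this theorem is simple to compute from the pairwise results. As a small sketch, assuming a doubled encoding (2 for a win, 1 for a tie, 0 for a loss) so that the half-points for ties stay in integer arithmetic, the hypothetical function copeland_scores_doubled below fills an array with twice each candidate's Copeland score.

```c
/* Assumed encoding: r[i*n + j] is 2 if i beats j, 1 for a tie, 0 if i loses.
 * On return, s[i] holds twice the Copeland score of candidate i:
 * 2 per candidate defeated plus 1 per candidate tied. */
void copeland_scores_doubled(const int *r, int n, int *s)
{
    for (int i = 0; i < n; i++) {
        s[i] = 0;
        for (int j = 0; j < n; j++)
            if (j != i)             /* skip the diagonal */
                s[i] += r[i * n + j];
    }
}
```

Doubling every score preserves the ordering of candidates, so thresholds on doubled scores select the same sets as thresholds on ordinary Copeland scores.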

The Smith criterion
The Smith criterion is satisfied by any voting method whose winner always belongs to the Smith set. Any method that satisfies the Smith criterion must also satisfy the Condorcet criterion; hence any method (such as IRV) that is not Condorcet-consistent must also fail the Smith criterion. Minimax is the best-known Condorcet method that fails the Smith criterion; this failure led to the development of Ranked Pairs and the Schulze method, two methods that closely resemble Minimax while always electing from the Smith set.

Relation to other tournament sets
The Smith set contains the Copeland set as a subset.

It also contains the Banks set and the Bipartisan set.

Computing the Smith set
The logical properties of dominating sets stated above can be used to construct an efficient algorithm for computing the Smith set. We have seen that the dominating sets are nested by Copeland score. It follows that by adjusting the Copeland threshold it is possible to work through the nested sets in increasing order of size until a dominating set is reached; and this set is necessarily the Smith set. Darlington sketches this method.

Testing whether a set is a dominating set at each stage might repeat some calculations, but this can fairly easily be avoided, leading to an algorithm whose work factor is quadratic in the number of candidates.

Detailed algorithm
The algorithm can be presented in detail through an example based on a pairwise results matrix. An entry in the main table is 1 if the first candidate was preferred to the second by more voters than preferred the second to the first; 0 if the opposite relation holds; and 1/2 if there is a tie. The final column gives the Copeland score of the first candidate.

The algorithm to compute the Smith set is agglomerative: it starts with the Copeland set, which is guaranteed to be a subset of the Smith set but will often be smaller, and adds candidates until no more are needed. The first step is to sort the candidates by decreasing score.

We look at the highest score (5) and consider the candidates (Copeland winners) whose score is at least this high, i.e. {A,D}. These certainly belong to the Smith set, and any candidates whom they do not defeat will need to be added. To find such candidates we look at the cells in the table below the top-left 2×2 square containing {A,D} (this square is shown with a broken border): the cells in question are shaded yellow in the table. We need to find the lowest (positionally) non-zero entry among these cells, which is the cell in the G row. All candidates as far down as this row, and any lower rows with the same score, need to be added to the set, which expands to {A,D,G}.

Now we look at any new cells which need to be considered, which are those below the top-left square containing {A,D,G}, but excluding those in the first two columns which we have already accounted for. The cells which need attention are shaded pale blue. As before we locate the positionally lowest non-zero entry among the new cells, adding all rows down to it, and all rows with the same score as it, to the expanded set, which now comprises {A,D,G,C}.

We repeat the operation for the new cells below the four members which are known to belong to the Smith set. These are shaded pink, and allow us to find any candidates whom some member of {A,D,G,C} fails to defeat. Again there is just one, F, whom we add to the set.

The cells which come into consideration are shaded pale green, and since all their entries are zero we do not need to add any new candidates to the set, which is therefore fixed as {A,D,G,C,F}. And by noticing that all the entries in the black box are zero, we have confirmation that all the candidates above it defeat all the candidates within it.

The following C function illustrates the algorithm by returning the cardinality of the Smith set for a given doubled results matrix r and array s of doubled Copeland scores. There are n candidates; r_ij is 2 if more voters prefer i to j than prefer j to i, 1 if the numbers are equal, and 0 if more voters prefer j to i than prefer i to j; s_i is the sum over j of the r_ij. The candidates are assumed to be sorted in decreasing order of Copeland score.
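
A minimal sketch of such a function, written to match the description above rather than taken from any reference implementation, might look as follows. The name smith_set_size and the flat row-major storage of r (entry r_ij at r[i*n + j]) are assumptions made for illustration. The variable rhs counts the candidates currently in the set, and lhs counts the columns that have already been examined.

```c
/* r: doubled results matrix in row-major order, with rows and columns
 *    sorted by decreasing Copeland score (assumed layout: r[i*n + j]).
 * s: doubled Copeland scores in the same order.
 * Returns the number of candidates in the Smith set, which are then the
 * first entries in the sorted order. */
int smith_set_size(const int *r, const int *s, int n)
{
    if (n <= 0)
        return 0;

    int lhs = 0;    /* columns [0, lhs) have already been examined */
    int rhs = 1;    /* candidates [0, rhs) are currently in the set */

    while (lhs < rhs) {
        /* Candidates level on score with the last member join with it. */
        while (rhs < n && s[rhs] == s[rhs - 1])
            rhs++;

        /* In the new columns [lhs, rhs), scan the rows below the set from
         * the bottom up for the lowest row holding a non-zero entry. */
        int lowest = -1;
        for (int row = n - 1; row >= rhs && lowest < 0; row--)
            for (int col = lhs; col < rhs; col++)
                if (r[row * n + col] != 0) {
                    lowest = row;
                    break;
                }

        lhs = rhs;
        if (lowest >= 0)
            rhs = lowest + 1;   /* add every row down to that one */
    }
    return rhs;
}
```

Each cell below the diagonal is examined at most once, which is in keeping with the quadratic work factor mentioned earlier. Applied to a matrix consistent with the walk-through, with candidates ordered A, D, G, C, F ahead of the remaining candidates, the function should return 5, the size of {A,D,G,C,F}.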