Exact cover

In the mathematical field of combinatorics, given a collection $$\mathcal{S}$$ of subsets of a set $$X$$, an exact cover is a subcollection $$\mathcal{S}^{*}$$ of $$\mathcal{S}$$ such that each element in $$X$$ is contained in exactly one subset in $$\mathcal{S}^{*}$$. One says that each element in $$X$$ is covered by exactly one subset in $$\mathcal{S}^{*}$$. An exact cover is a kind of cover. It is non-deterministic polynomial time (NP) complete and has a variety of applications, ranging from the optimization of airline flight schedules, cloud computing, and electronic circuit design.

In other words, $$\mathcal{S}^{*}$$ is a partition of $$X$$ consisting of subsets contained in $$\mathcal{S}$$.

The exact cover problem to find an exact cover is a kind of constraint satisfaction problem. The elements of $$\mathcal{S}$$ represent choices and the elements of $$X$$ represent constraints.

An exact cover problem involves the relation contains between subsets and elements. But an exact cover problem can be represented by any heterogeneous relation between a set of choices and a set of constraints. For example, an exact cover problem is equivalent to an exact hitting set problem, an incidence matrix, or a bipartite graph.

In computer science, the exact cover problem is a decision problem to determine if an exact cover exists. The exact cover problem is NP-complete and is one of Karp's 21 NP-complete problems. It is NP-complete even when each subset in $\mathcal{S}$ contains exactly three elements; this restricted problem is known as exact cover by 3-sets, often abbreviated X3C.

Knuth's Algorithm X is an algorithm that finds all solutions to an exact cover problem. DLX is the name given to Algorithm X when it is implemented efficiently using Donald Knuth's Dancing Links technique on a computer.

The exact cover problem can be generalized slightly to involve not only exactly-once constraints but also at-most-once constraints. Finding Pentomino tilings and solving Sudoku are noteworthy examples of exact cover problems. The n queens problem is a generalized exact cover problem.

Formal definition
Given a collection $$\mathcal{S}$$ of subsets of a set $$X$$, an exact cover of $$X$$ is a subcollection $$\mathcal{S}^{*}$$ of $$\mathcal{S}$$ that satisfies two conditions:
 * The intersection of any two distinct subsets in $$\mathcal{S}^{*}$$ is empty, i.e., the subsets in $$\mathcal{S}^{*}$$ are pairwise disjoint. In other words, each element in $$X$$ is contained in at most one subset in $$\mathcal{S}^{*}$$.
 * The union of the subsets in $$\mathcal{S}^{*}$$ is $$X$$, i.e., the subsets in $$\mathcal{S}^{*}$$ cover $$X$$. In other words, each element in $$X$$ is contained in at least one subset in $$\mathcal{S}^{*}$$.

In short, an exact cover is exact in the sense that each element in $$X$$ is contained in exactly one subset in $$\mathcal{S}^{*}$$.

Equivalently, an exact cover of $$X$$ is a subcollection $$\mathcal{S}^{*}$$ of $$\mathcal{S}$$ that partitions $$X$$.

For an exact cover of $$X$$ to exist, it is necessary that:
 * The union of the subsets in $$\mathcal{S}$$ is $$X$$. In other words, each element in $$X$$ is contained in at least one subset in $$\mathcal{S}$$.

If the empty set $$\O$$ is contained in $$\mathcal{S}$$, then it makes no difference whether or not it is in any exact cover. Thus it is typical to assume that:
 * The empty set is not in $$\mathcal{S}^{*}$$. In other words, each subset in $$\mathcal{S}^{*}$$ contains at least one element.

Basic examples
Let $$\mathcal{S} = \{N, O, P, E\}$$ be a collection of subsets of a set $$X = \{1, 2, 3, 4\}$$ such that:
 * $$N = \{\}$$,
 * $$O = \{1, 3\}$$,
 * $$P = \{1, 2, 3\}$$, and
 * $$E = \{2, 4\}$$.

The subcollection $$\{O, E\}$$ is an exact cover of $$X$$, since the subsets $$O = \{1, 3\}$$ and $$E = \{2, 4\}$$ are disjoint and their union is $$X = \{1, 2, 3, 4\}$$.

The subcollection $$\{N, O, E\}$$ is also an exact cover of $$X$$. Including the empty set $$N = \{\}$$ makes no difference, as it is disjoint with all subsets and does not change the union.

The subcollection $$\{E, P\}$$ is not an exact cover of $$X$$. Even though the union of the subsets $$E$$ and $$P$$ is $$\{1, 2, 3, 4\} = X$$, the intersection of the subsets $$E$$ and $$P$$, $$\{2\}$$, is not empty. Therefore the subsets $$E$$ and $$P$$ do not meet the disjoint requirement of an exact cover.

The subcollection $$\{N, P\}$$ is also not an exact cover of $$X$$. Even though $$N$$ and $$P$$ are disjoint, their union is not $$X$$, so they fail the cover requirement.

On the other hand, there is no exact cover—indeed, not even a cover—of $$Y = \{1, 2, 3, 4, 5\}$$ because $$\bigcup \mathcal{S} = \{1, 2, 3, 4\}$$ is a proper subset of $$Y$$: None of the subsets in $$\mathcal{S}$$ contains the element 5.

Detailed example
Let $\mathcal{S}$ = {$A$, $B$, $C$, $D$, $E$, $F$} be a collection of subsets of a set $X$ = {1, 2, 3, 4, 5, 6, 7} such that:
 * $A$ = {1, 4, 7};
 * $B$ = { 1, 4 };
 * $C$ = {4, 5, 7};
 * $D$ = { 3, 5 , 6 };
 * $E$ = {2, 3, 6, 7}; and
 * $F$ = { 2, 7 }.

The subcollection $\mathcal{S}$ = {$B$, $D$, $F$} is an exact cover, since each element is covered by (contained in) exactly one selected subset, as the highlighting makes clear.

Moreover, {$B$, $D$, $F$} is the only exact cover, as the following argument demonstrates: Because $A$ and $B$ are the only subsets containing the element 1, an exact cover must contain $A$ or $B$, but not both. If an exact cover contains $A$, then it doesn't contain $B$, $C$, $E$, or $F$, as each of these subsets has the element 1, 4, or 7 in common with $A$. Then $D$ is the only remaining subset, but the subcollection {$A$, $D$} doesn't cover the element 2. In conclusion, there is no exact cover containing $A$. On the other hand, if an exact cover contains $B$, then it doesn't contain $A$ or $C$, as each of these subsets has the element 1 or 4 in common with $B$. Because $D$ is the only remaining subset containing the element 5, $D$ must be part of the exact cover. If an exact cover contains $D$, then it doesn't contain $E$, as $E$ has the elements 3 and 6 in common with $D$. Then $F$ is the only remaining subset, and the subcollection {$B$, $D$, $F$} is indeed an exact cover. See the example in the article on Knuth's Algorithm X for a matrix-based version of this argument.

Representations
An exact cover problem is defined by the heterogeneous relation contains between a collection $\mathcal{S}$ of subsets and a set $X$ of elements. But there is nothing fundamental about subsets and elements.

A representation of an exact cover problem arises whenever there is a heterogeneous relation $R$ ⊆ $S$ × $X$ between a set $S$ of choices and a set $X$ of constraints and the goal is to select a subset $S$ of $S$ such that each element in $X$ is $R$-related to exactly one element in $S$. Here $R$ is the converse of $R$.

In general, $R$ restricted to $X$ × $S$ is a function from $X$ to $S$, which maps each element in $X$ to the unique element in $S$ that is $R$-related to that element in $X$. This function is onto, unless $S$ contains an element (akin to the empty set) that isn't $R$-related to any element in $X$.

Representations of an exact cover problem include an exact hitting set problem, an incidence matrix, and a bipartite graph.

Exact hitting set
In mathematics, given a set $S$ and a collection $\mathcal{X}$ of subsets of $S$, an exact hitting set $S$ is a subset of $S$ such that each subset in $\mathcal{X}$ contains exactly one element in $S$. One says that each subset in $\mathcal{X}$ is hit by exactly one element in $S$.

The exact hitting set problem is a representation of an exact cover problem involving the relation is contained in rather than contains.

For example, let $S$ = {$a$, $b$, $c$, $d$, $e$, $f$} be a set and $\mathcal{X}$ = {$I$, $II$, $III$, $IV$, $V$, $VI$, $VII$} be a collection of subsets of $S$ such that:
 * $I$ = {$a$, $b$ }
 * $II$ = {$e$, $f$ }
 * $III$ = { $d$, $e$}
 * $IV$ = {$a$, b, $c$}
 * $V$ = {$c$, $d$ }
 * $VI$ = { $d$, $e$}
 * $VII$ = {$a$, $c$, $e$, $f$ }

Then $S$ = {$b$, $d$, $f$} is an exact hitting set, since each subset in $\mathcal{X}$ is hit by (contains) exactly one element in $S$, as the highlighting makes clear.

This exact hitting set example is essentially the same as the detailed example above. Displaying the relation is contained in (∈) from elements to subsets makes clear that we have simply replaced lettered subsets with elements and numbered elements with subsets:


 * $a$ ∈ $I$, $IV$, $VII$, $b$; and
 * $I$ ∈ $IV$, $c$, $IV$, $V$; and
 * $VII$ ∈ $d$, $III$, $V$, $VI$; and
 * $e$ ∈ $II$, $III$, $VI$, $VII$; and
 * $f$ ∈ $II$, $VII$, $\mathcal{S}$, $X$; and

Incidence matrix
The relation contains can be represented by an incidence matrix.

The matrix includes one row for each subset in $A$ and one column for each element in $B$. The entry in a particular row and column is 1 if the corresponding subset contains the corresponding element, and is 0 otherwise.

In the matrix representation, an exact cover is a selection of rows such that each column contains a 1 in exactly one selected row. Each row represents a choice and each column represents a constraint.

For example, the relation contains in the detailed example above can be represented by a 6×7 incidence matrix:


 * {| class="wikitable" style="text-align:center;width:17.0em;"

! !! 1 !! 2 !! 3 !! 4 !! 5 !! 6 !! 7 ! $C$ ! $D$ ! $E$ ! $F$ ! $\mathcal{S}$ ! $B$
 * 1 || 0 || 0 || 1 || 0 || 0 || 1
 * 1 || 0 || 0 || 1 || 0 || 0 || 0
 * 0 || 0 || 0 || 1 || 1 || 0 || 1
 * 0 || 0 || 1 || 0 || 1 || 1 || 0
 * 0 || 1 || 1 || 0 || 0 || 1 || 1
 * 0 || 1 || 0 || 0 || 0 || 0 || 1
 * }

Again, the subcollection $D$ = {$F$, $X$, $\mathcal{S}$} is an exact cover, since each column contains a 1 in exactly one selected row, as the highlighting makes clear.

See the example in the article on Knuth's Algorithm X for a matrix-based solution to the detailed example above.

Hypergraph
In turn, the incidence matrix can be seen also as describing a hypergraph. The hypergraph includes one node for each element in $\mathcal{S}$ and one edge for each subset in $X$; each node is included in exactly one of the edges forming the cover.

Bipartite graph
The relation contains can be represented by a bipartite graph.

The vertices of the graph are divided into two disjoint sets, one representing the subsets in $\mathcal{S}$ and another representing the elements in $X$. If a subset contains an element, an edge connects the corresponding vertices in the graph.

In the graph representation, an exact cover is a selection of vertices corresponding to subsets such that each vertex corresponding to an element is connected to exactly one selected vertex.

For example, the relation contains in the detailed example above can be represented by a bipartite graph with 6+7 = 13 vertices:



Again, the subcollection ⇭⇭⇭ = {B, D, F} is an exact cover, since the vertex corresponding to each element in ⇭⇭⇭ is connected to exactly one selected vertex, as the highlighting makes clear.

Finding solutions
Algorithm X is the name Donald Knuth gave for "the most obvious trial-and-error approach" for finding all solutions to the exact cover problem. Technically, Algorithm X is a recursive, nondeterministic, depth-first, backtracking algorithm.

When Algorithm X is implemented efficiently using Donald Knuth's Dancing Links technique on a computer, Knuth calls it DLX. It uses the matrix representation of the problem, implemented as a series of doubly linked lists of the 1s of the matrix: each 1 element has a link to the next 1 above, below, to the left, and to the right of itself. Because exact cover problems tend to be sparse, this representation is usually much more efficient in both size and processing time required. DLX then uses the Dancing Links technique to quickly select permutations of rows as possible solutions and to efficiently backtrack (undo) mistaken guesses.

Generalized exact cover
In a standard exact cover problem, each constraint must be satisfied exactly once. It is a simple generalization to relax this requirement slightly and allow for the possibility that some primary constraints must be satisfied by exactly one choice but other secondary constraints can be satisfied by at most one choice.

As Knuth explains, a generalized exact cover problem can be converted to an equivalent exact cover problem by simply appending one row for each secondary column, containing a single 1 in that column. If in a particular candidate solution a particular secondary column is satisfied, then the added row isn't needed. But if the secondary column isn't satisfied, as is allowed in the generalized problem but not the standard problem, then the added row can be selected to ensure the column is satisfied.

But Knuth goes on to explain that it is better working with the generalized problem directly, because the generalized algorithm is simpler and faster: A simple change to his Algorithm X allows secondary columns to be handled directly.

The N queens problem is an example of a generalized exact cover problem, as the constraints corresponding to the diagonals of the chessboard have a maximum rather than an exact queen count.

Noteworthy examples
Due to its NP-completeness, any problem in NP can be reduced to exact cover problems, which then can be solved with techniques such as Dancing Links. However, for some well known problems, the reduction is particularly direct. For instance, the problem of tiling a board with pentominoes, and solving Sudoku can both be viewed as exact cover problems.

Pentomino tiling
The problem of tiling a 60-square board with the 12 different free pentominoes is an example of an exact cover problem, as Donald Knuth explains in his paper "Dancing links."

For example, consider the problem of tiling with pentominoes an 8×8 chessboard with the 4 central squares removed:


 * {| border="1" cellpadding="5" cellspacing="0"


 * 11 || 12 || 13 || 14 || 15 || 16 || 17 || 18
 * 21 || 22 || 23 || 24 || 25 || 26 || 27 || 28
 * 31 || 32 || 33 || 34 || 35 || 36 || 37 || 38
 * 41 || 42 || 43 ||   ||    || 46 || 47 || 48
 * 51 || 52 || 53 ||   ||    || 56 || 57 || 58
 * 61 || 62 || 63 || 64 || 65 || 66 || 67 || 68
 * 71 || 72 || 73 || 74 || 75 || 76 || 77 || 78
 * 81 || 82 || 83 || 84 || 85 || 86 || 87 || 88
 * }
 * 61 || 62 || 63 || 64 || 65 || 66 || 67 || 68
 * 71 || 72 || 73 || 74 || 75 || 76 || 77 || 78
 * 81 || 82 || 83 || 84 || 85 || 86 || 87 || 88
 * }
 * 81 || 82 || 83 || 84 || 85 || 86 || 87 || 88
 * }
 * }

The problem involves two kinds of constraints:
 * Pentomino: For each of the 12 pentominoes, there is the constraint that it must be placed exactly once. Name these constraints after the corresponding pentominoes: F I L P N T U V W X Y Z.
 * Square: For each of the 60 squares, there is the constraint that it must be covered by a pentomino exactly once. Name these constraints after the corresponding squares in the board: ij, where i is the rank and j is the file.

Thus there are 12+60 = 72 constraints in all.

As both kinds of constraints are exactly-once constraints, the problem is an exact cover problem.

The problem involves many choices, one for each way to place a pentomino on the board. It is convenient to consider each choice as satisfying a set of 6 constraints: 1 constraint for the pentomino being placed and 5 constraints for the five squares where it is placed.

In the case of an 8×8 chessboard with the 4 central squares removed, there are 1568 such choices, for example:
 * {F, 12, 13, 21, 22, 32}
 * {F, 13, 14, 22, 23, 33}
 * {I, 11, 12, 13, 14, 15}
 * {I, 12, 13, 14, 15, 16}
 * {L, 11, 21, 31, 41, 42}
 * {L, 12, 22, 32, 42, 43}
 * {L, 11, 21, 31, 41, 42}
 * {L, 12, 22, 32, 42, 43}

One of many solutions of this exact cover problem is the following set of 12 choices:
 * {I, 11, 12, 13, 14, 15}
 * {N, 16, 26, 27, 37, 47}
 * {L, 17, 18, 28, 38, 48}
 * {U, 21, 22, 31, 41, 42}
 * {X, 23, 32, 33, 34, 43}
 * {W, 24, 25, 35, 36, 46}
 * {P, 51, 52, 53, 62, 63}
 * {F, 56, 64, 65, 66, 75}
 * {Z, 57, 58, 67, 76, 77}
 * {T, 61, 71, 72, 73, 81}
 * {V, 68, 78, 86, 87, 88}
 * {Y, 74, 82, 83, 84, 85}

This set of choices corresponds to the following solution to the pentomino tiling problem:



A pentomino tiling problem is more naturally viewed as an exact cover problem than an exact hitting set problem, because it is more natural to view each choice as a set of constraints than each constraint as a set of choices.

Each choice relates to just 6 constraints, which are easy to enumerate. On the other hand, each constraint relates to many choices, which are harder to enumerate.

Whether viewed as an exact cover problem or an exact hitting set problem, the matrix representation is the same, having 1568 rows corresponding to choices and 72 columns corresponding to constraints. Each row contains a single 1 in the column identifying the pentomino and five 1s in the columns identifying the squares covered by the pentomino.

Using the matrix, a computer can find all solutions relatively quickly, for example, using Dancing Links.

Sudoku
Main articles: Sudoku, Mathematics of Sudoku, Sudoku solving algorithms

The problem in Sudoku is to assign numbers (or digits, values, symbols) to cells (or squares) in a grid so as to satisfy certain constraints.

In the standard 9×9 Sudoku variant, there are four kinds of constraints:
 * Row-Column: Each intersection of a row and column, i.e, each cell, must contain exactly one number.
 * Row-Number: Each row must contain each number exactly once
 * Column-Number: Each column must contain each number exactly once.
 * Box-Number: Each box must contain each number exactly once.

While the first constraint might seem trivial, it is nevertheless needed to ensure there is only one number per cell. Naturally, placing a number into a cell prohibits placing any other number into the now occupied cell.

Solving Sudoku is an exact cover problem. More precisely, solving Sudoku is an exact hitting set problem, which is equivalent to an exact cover problem, when viewed as a problem to select possibilities such that each constraint set contains (i.e., is hit by) exactly one selected possibility.

Each possible assignment of a particular number to a particular cell is a possibility (or candidate). When Sudoku is played with pencil and paper, possibilities are often called pencil marks.

In the standard 9×9 Sudoku variant, in which each of 9×9 cells is assigned one of 9 numbers, there are 9×9×9=729 possibilities. Using obvious notation for rows, columns and numbers, the possibilities can be labeled
 * R1C1#1, R1C1#2, …, R9C9#9.

The fact that each kind of constraint involves exactly one of something is what makes Sudoku an exact hitting set problem. The constraints can be represented by constraint sets. The problem is to select possibilities such that each constraint set contains (i.e., is hit by) exactly one selected possibility.

In the standard 9×9 Sudoku variant, there are four kinds of constraints sets corresponding to the four kinds of constraints:


 * Row-Column: A row-column constraint set contains all the possibilities for the intersection of a particular row and column, i.e., for a cell. For example, the constraint set for row 1 and column 1, which can be labeled R1C1, contains the 9 possibilities for row 1 and column 1 but different numbers:
 * R1C1 = { R1C1#1, R1C1#2, R1C1#3, R1C1#4, R1C1#5, R1C1#6, R1C1#7, R1C1#8, R1C1#9 }.


 * Row-Number: A row-number constraint set contains all the possibilities for a particular row and number. For example, the constraint set for row 1 and number 1, which can be labeled R1#1, contains the 9 possibilities for row 1 and number 1 but different columns:
 * R1#1 = { R1C1#1, R1C2#1, R1C3#1, R1C4#1, R1C5#1, R1C6#1, R1C7#1, R1C8#1, R1C9#1 }.


 * Column-Number: A column-number constraint set contains all the possibilities for a particular column and number. For example, the constraint set for column 1 and number 1, which can be labeled C1#1, contains the 9 possibilities for column 1 and number 1 but different rows:
 * C1#1 = { R1C1#1, R2C1#1, R3C1#1, R4C1#1, R5C1#1, R6C1#1, R7C1#1, R8C1#1, R9C1#1 }.


 * Box-Number: A box-number constraint set contains all the possibilities for a particular box and number. For example, the constraint set for box 1 (in the upper lefthand corner) and number 1, which can be labeled B1#1, contains the 9 possibilities for the cells in box 1 and number 1:
 * B1#1 = { R1C1#1, R1C2#1, R1C3#1, R2C1#1, R2C2#1, R2C3#1, R3C1#1, R3C2#1, R3C3#1 }.

Since there are 9 rows, 9 columns, 9 boxes and 9 numbers, there are 9×9=81 row-column constraint sets, 9×9=81 row-number constraint sets, 9×9=81 column-number constraint sets,  and 9×9=81 box-number constraint sets: 81+81+81+81=324 constraint sets in all.

In brief, the standard 9×9 Sudoku variant is an exact hitting set problem with 729 possibilities and 324 constraint sets. Thus the problem can be represented by a 729×324 matrix.

Although it is difficult to present the entire 729×324 matrix, the general nature of the matrix can be seen from several snapshots:

The complete 729×324 matrix is available from Robert Hanson.

Note that the set of possibilities RxCy#z can be arranged as a 9×9×9 cube in a 3-dimensional space with coordinates x, y, and z. Then each row Rx, column Cy, or number #z is a 9×9×1 "slice" of possibilities; each box Bw is a 9x3×3 "tube" of possibilities; each row-column constraint set RxCy, row-number constraint set Rx#z, or column-number constraint set Cy#z is a 9x1×1 "strip" of possibilities; each box-number constraint set Bw#z is a 3x3×1 "square" of possibilities; and each possibility RxCy#z is a 1x1×1 "cubie" consisting of a single possibility. Moreover, each constraint set or possibility is the intersection of the component sets. For example, R1C2#3 = R1 ∩ C2 ∩ #3, where ∩ denotes set intersection.

Although other Sudoku variations have different numbers of rows, columns, numbers and/or different kinds of constraints, they all involve possibilities and constraint sets, and thus can be seen as exact hitting set problems.

N queens problem
The N queens problem is the problem of placing n chess queens on an n×n chessboard so that no two queens threaten each other. A solution requires that no two queens share the same row, column, or diagonal. It is an example of a generalized exact cover problem.

The problem involves four kinds of constraints:
 * Rank: For each of the N ranks, there must be exactly one queen.
 * File: For each of the N files, there must be exactly one queen.
 * Diagonals: For each of the 2N − 1 diagonals, there must be at most one queen.
 * Reverse diagonals: For each of the 2N − 1 reverse diagonals, there must be at most one queen.

Note that the 2N ranks and files form the primary constraints, while the 4N − 2 diagonal and reverse diagonals form the secondary constraints. Further, because each of first and last diagonals and reverse diagonals involves only one square on the chessboard, these can be omitted and thus one can reduce the number of secondary constraints to 4N − 6. The matrix for the N queens problem then has N2 rows and 6N − 6 columns, each row for a possible queen placement on each square on the chessboard, and each column for each constraint.