Matroid parity problem

In combinatorial optimization, the matroid parity problem is the problem of finding the largest independent set of paired elements in a matroid. The problem was formulated as a common generalization of graph matching and matroid intersection. It is also known as polymatroid matching, or the matchoid problem.

Matroid parity can be solved in polynomial time for linear matroids. However, it is NP-hard for certain compactly-represented matroids, and requires more than a polynomial number of steps in the matroid oracle model.

Applications of matroid parity algorithms include finding large planar subgraphs and finding graph embeddings of maximum genus. These algorithms can also be used to find connected dominating sets and feedback vertex sets in graphs of maximum degree three.

Formulation
A matroid can be defined from a finite set of elements and from a notion of what it means for subsets of elements to be independent, subject to the following constraints:
 * Every subset of an independent set should be independent.
 * If $$S$$ and $$T$$ are independent sets, with $$|T|>|S|$$, then there exists an element $$t\in T$$ such that $$S\cup\{t\}$$ is independent.

Examples of matroids include the linear matroids (in which the elements are vectors in a vector space, with linear independence), the graphic matroids (in which the elements are edges in an undirected graph, independent when they contain no cycle), and the partition matroids (in which the elements belong to a family of disjoint sets, and are independent when they contain at most one element in each set). Graphic matroids and partition matroids are special cases of linear matroids.
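
The following is a minimal sketch of independence tests for two of these examples, together with a brute-force check of the two constraints listed above. The function names (`partition_independent`, `graphic_independent`, `satisfies_axioms`) and the representation of edges as `(u, v)` tuples are assumptions made for illustration. The exhaustive check takes exponential time and is only meant for very small ground sets.

```python
from itertools import combinations

def partition_independent(elements, block_of):
    """Partition matroid: at most one element from each block."""
    blocks = [block_of[x] for x in elements]
    return len(blocks) == len(set(blocks))

def graphic_independent(edges):
    """Graphic matroid: edges (u, v) are independent when they form no cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:  # endpoints already connected: the edge closes a cycle
            return False
        parent[root_u] = root_v
    return True

def satisfies_axioms(ground, independent):
    """Brute-force check of the two constraints (exponential; small inputs only)."""
    subsets = [frozenset(c) for k in range(len(ground) + 1)
               for c in combinations(ground, k)]
    family = {s for s in subsets if independent(s)}
    # Removing any single element must preserve independence.
    hereditary = all(s - {x} in family for s in family for x in s)
    # A larger independent set must contain an element that extends a smaller one.
    exchange = all(any(s | {x} in family for x in t - s)
                   for s in family for t in family if len(t) > len(s))
    return hereditary and exchange
```

For instance, `satisfies_axioms` accepts the three edges of a triangle with `graphic_independent`, since every set of at most two edges is acyclic. Representing edges as tuples means that parallel edges would need distinct labels, which this sketch does not handle.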

In the matroid parity problem, the input consists of a matroid together with a pairing on its elements, so that each element belongs to one pair. The goal is to find a subset of the pairs, as large as possible, so that the union of the pairs in the chosen subset is independent. Another seemingly more general variation, in which the allowable pairs form a graph rather than having only one pair per element, is equivalent: an element appearing in more than one pair could be replaced by multiple copies of the element, one per pair.
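
To make the problem statement concrete, the sketch below solves tiny instances by exhaustive search, given the pairs and an independence test. It is not one of the polynomial-time algorithms discussed in this article; the names `max_parity_set` and `acyclic` are hypothetical, and the example uses a graphic matroid whose elements are edges written as `(u, v)` tuples.

```python
from itertools import combinations

def acyclic(edges):
    """Independence test for a graphic matroid (union-find cycle check)."""
    parent = {}

    def find(x):
        while parent.setdefault(x, x) != x:
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False
        parent[root_u] = root_v
    return True

def max_parity_set(pairs, independent):
    """Return a largest list of pairs whose union is independent.

    Tries every subset of pairs, largest first, so the running time is
    exponential in the number of pairs.
    """
    for k in range(len(pairs), 0, -1):
        for chosen in combinations(pairs, k):
            union = [x for pair in chosen for x in pair]
            if independent(union):
                return list(chosen)
    return []

# Two pairs of edges; taking both would include the cycle 0-1-2.
pairs = [((0, 1), (1, 2)), ((0, 2), (2, 3))]
best = max_parity_set(pairs, acyclic)
```

In this example only one of the two pairs can be chosen, because the four edges together contain a triangle.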

Algorithms
The matroid parity problem for linear matroids can be solved by a randomized algorithm in time $$O(nr^{\omega-1})$$, where $$n$$ is the number of elements of the matroid, $$r$$ is its rank (the size of the largest independent set), and $$\omega$$ is the exponent in the time bounds for fast matrix multiplication. In particular, using a matrix multiplication algorithm of Le Gall, it can be solved in time $$O(nr^{1.3729})$$. Without using fast matrix multiplication, the linear matroid parity problem can be solved in time $$O(nr^2)$$.

These algorithms are based on a linear algebra formulation of the problem. Suppose that an input to the problem consists of $$m$$ pairs of $$r$$-dimensional vectors (arranged as column vectors in a matrix $$M$$ of size $$r\times 2m$$). Then the number of pairs in the optimal solution is


 * $$\frac{1}{2}\operatorname{rank}\begin{pmatrix}0&M\\M^T&T\end{pmatrix} -m,$$

where $$T$$ is a block diagonal matrix whose blocks are $$2\times 2$$ submatrices of the form


 * $$\begin{pmatrix}0&t_i\\-t_i&0\end{pmatrix}$$

for a sequence of variables $$t_1,\dots,t_m$$. The Schwartz–Zippel lemma can be used to test whether this matrix has full rank or not (that is, whether the optimal solution consists of $$r/2$$ pairs or not), by assigning random values to the variables $$t_i$$ and testing whether the resulting matrix has determinant zero. By applying a greedy algorithm that removes pairs one at a time by setting their indeterminates to zero as long as the matrix remains of full rank (maintaining the inverse matrix using the Sherman–Morrison formula to check the rank after each removal), one can find a solution whenever this test shows that it exists. Additional methods extend this algorithm to the case that the optimal solution to the matroid parity problem has fewer than $$r/2$$ pairs.
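
The rank formula can be tried out numerically with the following sketch, which substitutes random values for the variables $$t_i$$ and computes the rank by Gaussian elimination. Several choices here are assumptions rather than part of the formulation: the vectors have integer coordinates and are interpreted modulo a fixed prime `P`, the names `rank_mod_p` and `parity_optimum` are hypothetical, and plain cubic-time elimination is used instead of the faster methods described above. A random substitution can only lower the rank, so the returned value may underestimate the optimum, with small probability, but never overestimates it. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; an arbitrary choice of field size

def rank_mod_p(rows, p=P):
    """Rank of an integer matrix over the integers modulo p."""
    a = [[x % p for x in row] for row in rows]
    rank = 0
    n_cols = len(a[0]) if a else 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(a)) if a[i][col]), None)
        if pivot is None:
            continue
        a[rank], a[pivot] = a[pivot], a[rank]
        inv = pow(a[rank][col], -1, p)
        a[rank] = [x * inv % p for x in a[rank]]
        for i in range(len(a)):
            if i != rank and a[i][col]:
                f = a[i][col]
                a[i] = [(x - f * y) % p for x, y in zip(a[i], a[rank])]
        rank += 1
    return rank

def parity_optimum(pairs, rng=random):
    """Evaluate rank/2 - m for the block matrix with random values t_i.

    pairs: list of (u, v), where u and v are length-r integer sequences.
    """
    m, r = len(pairs), len(pairs[0][0])
    size = r + 2 * m
    k = [[0] * size for _ in range(size)]
    for i, (u, v) in enumerate(pairs):
        for j, vec in enumerate((u, v)):
            col = r + 2 * i + j
            for row in range(r):
                k[row][col] = vec[row]  # block M
                k[col][row] = vec[row]  # block M^T
        t = rng.randrange(1, P)         # random nonzero value for t_i
        k[r + 2 * i][r + 2 * i + 1] = t
        k[r + 2 * i + 1][r + 2 * i] = -t
    return rank_mod_p(k) // 2 - m
```

For example, with the standard basis vectors of a four-dimensional space, the pairs $$(e_1,e_2)$$ and $$(e_3,e_4)$$ can both be chosen, whereas of the pairs $$(e_1,e_2)$$ and $$(e_1,e_3)$$ only one can be chosen because $$e_1$$ would be repeated.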

For graphic matroids, more efficient algorithms are known, with running time $$O(mn\log^6 n)$$ on graphs with $$n$$ vertices and $$m$$ edges. For simple graphs, $$m$$ is $$O(n^2)$$, but for multigraphs it may be larger, so it is also of interest to have algorithms with smaller or no dependence on $$m$$, at the cost of a worse dependence on $$n$$. In these cases, it is also possible to solve the graphic matroid parity problem in randomized expected time $$O(n^4)$$, or in time $$O(n^3)$$ when each pair of edges forms a path.

Although the matroid parity problem is NP-hard for arbitrary matroids, it can still be approximated efficiently. Simple local search algorithms provide a polynomial-time approximation scheme for this problem: they find solutions whose size, as a fraction of the optimal solution size, is arbitrarily close to one. The algorithm starts with the empty set as its solution, and repeatedly attempts to increase the solution size by one by removing at most a constant number $$C$$ of pairs from the solution and replacing them by a different set with one more pair. When no more such moves are possible, the resulting solution is returned as the approximation to the optimal solution. To achieve an approximation ratio of $$1-\epsilon$$, it suffices to choose $$C$$ to be approximately $$5^{\lceil 1/(2\epsilon)\rceil}$$.
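
The local search described above can be sketched as follows. This is an illustrative, unoptimized version: the function names and the parameter `max_removed` (playing the role of $$C$$) are assumptions, and the example independence test is a partition matroid on hypothetical `(vertex, edge_index)` elements rather than anything prescribed by the approximation scheme itself. Requires Python 3.8 or later.

```python
from itertools import combinations

def local_search_parity(pairs, independent, max_removed=1):
    """Repeatedly swap out at most max_removed pairs for one more pair."""
    def union(indices):
        return [x for i in indices for x in pairs[i]]

    def improve(chosen):
        outside = [i for i in range(len(pairs)) if i not in chosen]
        for j in range(min(max_removed, len(chosen)) + 1):
            for removed in combinations(sorted(chosen), j):
                kept = chosen - set(removed)
                for added in combinations(outside, j + 1):
                    candidate = kept | set(added)
                    if independent(union(candidate)):
                        return candidate  # one pair larger than before
        return None

    chosen = set()  # start from the empty solution
    while (better := improve(chosen)) is not None:
        chosen = better
    return [pairs[i] for i in sorted(chosen)]

def distinct_vertices(elements):
    """Example test: elements are (vertex, edge_index); no vertex may repeat."""
    return len({v for v, _ in elements}) == len(elements)

# Incidence pairs for the path a-b-c-d, listed with the middle edge first.
pairs = [(('b', 0), ('c', 0)), (('a', 1), ('b', 1)), (('c', 2), ('d', 2))]
greedy = local_search_parity(pairs, distinct_vertices, max_removed=0)
swapped = local_search_parity(pairs, distinct_vertices, max_removed=1)
```

With `max_removed=0` the search gets stuck after picking the middle edge, while allowing a single removal lets it exchange that pair for the two outer ones.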

Applications
Many other optimization problems can be formulated as linear matroid parity problems, and solved in polynomial time using this formulation.

Graph matching: A maximum matching in a graph is a subset of edges, no two sharing an endpoint, that is as large as possible. It can be formulated as a matroid parity problem in a partition matroid that has an element for each vertex-edge incidence in the graph. In this matroid, two elements are paired if they are the two incidences of the same edge. A subset of elements is independent if it has at most one incidence for each vertex of the graph. The pairs of elements in a solution to the matroid parity problem for this matroid are the incidences between edges in a maximum matching and their endpoints.
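
As an illustration of this reduction, the sketch below builds the incidence pairs and the partition matroid test for a small graph, then solves the resulting parity instance by exhaustive search. The function names are hypothetical, incidences are represented as `(vertex, edge_index)` tuples by assumption, and the exhaustive search is used only to keep the example short.

```python
from itertools import combinations

def matching_as_parity(edges):
    """One pair of vertex-edge incidences per edge of the graph."""
    return [((u, i), (v, i)) for i, (u, v) in enumerate(edges)]

def incidences_independent(elements):
    """Partition matroid: at most one incidence per vertex."""
    vertices = [v for v, _ in elements]
    return len(vertices) == len(set(vertices))

def max_matching_via_parity(edges):
    """Exhaustive search over the parity instance (exponential time)."""
    pairs = matching_as_parity(edges)
    for k in range(len(pairs), 0, -1):
        for chosen in combinations(range(len(pairs)), k):
            union = [x for i in chosen for x in pairs[i]]
            if incidences_independent(union):
                return [edges[i] for i in chosen]
    return []

# A cycle on four vertices has a perfect matching of two edges.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
matching = max_matching_via_parity(cycle)
```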

Matroid intersection: An instance of the matroid intersection problem consists of two matroids on the same set of elements; the goal is to find a subset of the elements that is as large as possible and is independent in both matroids. To formulate matroid intersection as a matroid parity problem, construct a new matroid whose elements are the disjoint union of two copies of the given elements, one for each input matroid. In the new matroid, a subset is independent if its restriction to each of the two copies is independent in each of the two matroids, respectively. Pair the elements of the new matroid so that each element is paired with its copy. The pairs of elements in a solution to the matroid parity problem for this matroid are the two copies of each element in a solution to the matroid intersection problem.
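
The construction can be sketched as below, where the two copies of an element `x` are written `(x, 1)` and `(x, 2)`. This tagging scheme and the function names are assumptions for illustration, and the solver is again an exhaustive search rather than an efficient algorithm. The example encodes bipartite matching as the intersection of two partition matroids.

```python
from itertools import combinations

def intersection_as_parity(ground, indep1, indep2):
    """Build the pairs and independence test of the combined matroid."""
    pairs = [((x, 1), (x, 2)) for x in ground]

    def independent(elements):
        first = [x for x, copy in elements if copy == 1]
        second = [x for x, copy in elements if copy == 2]
        return indep1(first) and indep2(second)

    return pairs, independent

def max_common_independent(ground, indep1, indep2):
    """Exhaustive search over the parity instance (exponential time)."""
    pairs, independent = intersection_as_parity(ground, indep1, indep2)
    for k in range(len(pairs), 0, -1):
        for chosen in combinations(pairs, k):
            if independent([x for pair in chosen for x in pair]):
                return [pair[0][0] for pair in chosen]
    return []

# Bipartite edges (left, right); each side may be used at most once.
edges = [('a', 'x'), ('a', 'y'), ('b', 'x')]
left_distinct = lambda es: len({l for l, _ in es}) == len(es)
right_distinct = lambda es: len({r for _, r in es}) == len(es)
common = max_common_independent(edges, left_distinct, right_distinct)
```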

Large planar subgraphs: The problem of finding the largest set of triangles in a given graph, with no cycles other than the chosen triangles, can be formulated as a matroid parity problem on a graphic matroid whose elements are edges of the graph, with one pair of edges per triangle (duplicating edges if necessary when they belong to more than one triangle). The pairs of elements in a solution to the matroid parity problem for this matroid are the two edges in each triangle of an optimal set of triangles. The same problem can also be described as one of finding the largest Berge-acyclic sub-hypergraph of a 3-uniform hypergraph. In the hypergraph version of the problem, the hyper-edges are the triangles of the given graph.

A cactus graph is a graph in which each two cycles have at most one vertex in common. As a special case, the graphs in which each cycle is a triangle are necessarily cactus graphs. The largest triangular cactus in the given graph can then be found by adding additional edges from a spanning tree, without creating any new cycles, so that the resulting subgraph has the same connected components as the original graph. Cactus graphs are automatically planar graphs, and the problem of finding triangular cactus graphs forms the basis for the best known approximation algorithm to the problem of finding the largest planar subgraph of a given graph, an important step in planarization. The largest triangular cactus always has at least 4/9 the number of edges of the largest planar subgraph, improving the 1/3 approximation ratio obtained by using an arbitrary spanning tree.

Combinatorial rigidity: A framework of rigid bars in the Euclidean plane, connected at their endpoints at flexible joints, can be fixed into a single position in the plane by pinning some of its joints to points of the plane. The minimum number of joints that need to be pinned to fix the framework is called its pinning number. It can be computed from a solution to an associated matroid parity problem.

Maximum-genus embeddings: A cellular embedding of a given graph onto a surface of the maximum possible genus can be obtained from a Xuong tree of the graph. This is a spanning tree with the property that, in the subgraph of edges not in the tree, the number of connected components with an odd number of edges is as small as possible.

To formulate the problem of finding a Xuong tree as a matroid parity problem, first subdivide each edge $$e$$ of the given graph into a path, with the length of the path equal to the number of other edges incident to $$e$$. Then, pair the edges of the subdivided graph, so that each pair of adjacent edges in the original graph is represented by a single pair of edges in the subdivided graph, and each edge in the subdivided graph is paired exactly once. Solve a matroid parity problem with the paired edges of the subdivided graph, using its cographic matroid (a linear matroid in which a subset of edges is independent if its removal does not separate the graph). Any spanning tree of the original graph that avoids the edges used in the matroid parity solution is necessarily a Xuong tree.

Connected domination: A connected dominating set in a graph is a subset of vertices that induces a connected subgraph and is adjacent to all other vertices. It is NP-hard to find the smallest connected dominating set in arbitrary graphs, but one can be found in polynomial time for graphs of maximum degree three. In a cubic graph, one can replace each vertex by a two-edge path connected to the ends of its three incident edges, and formulate a matroid parity problem on the pairs of edges generated in this way, using the cographic matroid of the expanded graph. The vertices whose paths are not used in the solution form a minimum connected dominating set. In a graph of maximum degree three, some simple additional transformations reduce the problem to one on a cubic graph.

Feedback vertex set: A feedback vertex set in a graph is a subset of vertices that touches all cycles. In cubic graphs, there is a linear equation relating the number of vertices, cyclomatic number, number of connected components, size of a minimum connected dominating set, and size of a minimum feedback vertex set. It follows that the same matroid parity problem used to find connected dominating sets can also be used to find feedback vertex sets in graphs of maximum degree three.

Hardness
The clique problem, of finding a $$k$$-vertex complete subgraph in a given $$n$$-vertex graph $$G$$, can be transformed into an instance of matroid parity as follows. Construct a paving matroid on $$2n$$ elements, paired up so that there is one pair of elements per vertex. Define a subset $$S$$ of these elements to be independent if it satisfies any one of the following three conditions:
 * $$S$$ has fewer than $$2k$$ elements.
 * $$S$$ has exactly $$2k$$ elements, but is not the union of $$k$$ pairs of elements.
 * $$S$$ is the union of $$k$$ pairs of elements whose vertices form a clique in $$G$$.

Then there is a solution to the matroid parity problem for this matroid with $$2k$$ elements (that is, $$k$$ pairs) if and only if $$G$$ has a clique of size $$k$$. Since finding cliques of a given size is NP-complete, it follows that determining whether this type of matroid parity problem has a solution of size $$2k$$ is also NP-complete.
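
The independence test given by the three conditions can be sketched as a short function. The representation is an assumption made for illustration: the two elements paired for a vertex `v` are written `(v, 0)` and `(v, 1)`, and `clique_paving_oracle` is a hypothetical name. Sets with more than $$2k$$ elements satisfy none of the conditions and are therefore reported as dependent.

```python
from itertools import combinations

def clique_paving_oracle(edges, k):
    """Return the independence test of the paving matroid built from a graph.

    edges: iterable of (u, v) vertex pairs of the graph G.
    """
    adjacent = {frozenset(e) for e in edges}

    def independent(subset):
        s = set(subset)
        if len(s) < 2 * k:
            return True               # fewer than 2k elements
        if len(s) > 2 * k:
            return False              # none of the three conditions applies
        vertices = {v for v, _ in s}
        if len(vertices) != k:
            return True               # 2k elements, but not a union of k pairs
        # Union of k pairs: independent exactly when the k vertices form a clique.
        return all(frozenset((u, w)) in adjacent
                   for u, w in combinations(vertices, 2))

    return independent

def both_copies(vertices):
    """Union of the pairs belonging to the given vertices."""
    return [(v, c) for v in vertices for c in (0, 1)]

# A triangle 0-1-2 with an extra vertex 3 attached to vertex 0.
oracle = clique_paving_oracle([(0, 1), (1, 2), (0, 2), (0, 3)], k=3)
```

Here `oracle(both_copies([0, 1, 2]))` holds because the three vertices form a triangle, while `oracle(both_copies([0, 1, 3]))` fails because vertices 1 and 3 are not adjacent.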

This problem transformation does not depend on the structure of the clique problem in any deep way, and would work for any other problem of finding size-$$k$$ subsets of a larger set that satisfy a computable test. By applying it to a randomly-permuted graph that contains exactly one clique of size $$k$$, one can show that any deterministic or randomized algorithm for matroid parity that accesses its matroid only by independence tests needs to make an exponential number of tests.