Twin-width

The twin-width of an undirected graph is a natural number associated with the graph, used to study the parameterized complexity of graph algorithms. Intuitively, it measures how similar the graph is to a cograph, a type of graph that can be reduced to a single vertex by repeatedly merging together twins, vertices that have the same neighbors. The twin-width is defined from a sequence of repeated mergers where the vertices are not required to be twins, but have nearly equal sets of neighbors.

Definition
Twin-width is defined for finite simple undirected graphs. These have a finite set of vertices, and a set of edges that are unordered pairs of vertices. The open neighborhood of any vertex is the set of other vertices that it is paired with in edges of the graph; the closed neighborhood is formed from the open neighborhood by including the vertex itself. Two vertices are true twins when they have the same closed neighborhood, and false twins when they have the same open neighborhood; more generally, both true twins and false twins can be called twins, without qualification.

The cographs have many equivalent definitions, but one of them is that these are the graphs that can be reduced to a single vertex by a process of repeatedly finding any two twin vertices and merging them into a single vertex. For a cograph, this reduction process will always succeed, no matter which choice of twins to merge is made at each step. For a graph that is not a cograph, it will always get stuck in a subgraph with more than two vertices that has no twins.
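The reduction process described above can be sketched in a few lines of Python. This is only an illustration, not an efficient cograph recognition algorithm. The names `find_twins` and `is_cograph`, and the input format (a dictionary mapping each vertex to the set of its neighbors), are assumptions made for this example. The sketch relies on the fact that, for a cograph, any choice of twins succeeds, so it simply takes the first pair it finds.

```python
def find_twins(adj):
    """Return some pair of twins (true or false), or None if there is none."""
    vertices = list(adj)
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            # Ignoring u and v themselves, twins see exactly the same vertices.
            # This covers both true twins (adjacent) and false twins (non-adjacent).
            if adj[u] - {v} == adj[v] - {u}:
                return u, v
    return None


def is_cograph(adj):
    """Return True if repeated twin merging reduces the graph to one vertex."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    while len(adj) > 1:
        pair = find_twins(adj)
        if pair is None:
            return False  # stuck: more than one vertex left, but no twins
        _, v = pair
        # Merging two twins has the same effect as deleting one of them.
        for w in adj.pop(v):
            adj[w].discard(v)
    return True
```

For instance, under these assumptions a three-vertex path is reduced to a single vertex, while a four-vertex path gets stuck immediately because it has no twins.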

The definition of twin-width mimics this reduction process. A contraction sequence, in this context, is a sequence of steps, beginning with the given graph, in which each step replaces a pair of vertices by a single vertex. This produces a sequence of graphs, with edges colored red and black; in the given graph, all edges are assumed to be black. When two vertices are replaced by a single vertex, the neighborhood of the new vertex is the union of the neighborhoods of the replaced vertices. In this new neighborhood, an edge that comes from black edges in the neighborhoods of both vertices remains black; all other edges are colored red.

A contraction sequence is called a $$d$$-sequence if, throughout the sequence, every vertex touches at most $$d$$ red edges. The twin-width of a graph is the smallest value of $$d$$ for which it has a $$d$$-sequence.
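As a minimal sketch of this definition, the following Python code applies a given contraction sequence and reports the largest red degree that occurs, so a sequence is a $$d$$-sequence exactly when the returned value is at most $$d$$. The function names `contract` and `sequence_width`, the adjacency-dictionary input, and the convention that a merge `(u, v)` keeps the name `u` for the merged vertex are all assumptions made for this illustration.

```python
def contract(black, red, u, v):
    """Merge v into u in place, recoloring edges as in the definition."""
    nb_u = (black[u] | red[u]) - {v}
    nb_v = (black[v] | red[v]) - {u}
    # An edge stays black only if it was black at both merged vertices.
    both_black = black[u] & black[v]
    # Detach u and v from all of their neighbors.
    for w in nb_u | nb_v:
        for side in (black, red):
            side[w].discard(u)
            side[w].discard(v)
    del black[v], red[v]
    # Reattach the merged vertex u with the new colors.
    black[u] = set(both_black)
    red[u] = (nb_u | nb_v) - both_black
    for w in black[u]:
        black[w].add(u)
    for w in red[u]:
        red[w].add(u)


def sequence_width(adj, merges):
    """Return the largest red degree seen while applying `merges` in order."""
    black = {v: set(ns) for v, ns in adj.items()}  # all edges start black
    red = {v: set() for v in adj}
    width = 0
    for u, v in merges:
        contract(black, red, u, v)
        width = max(width, max(len(r) for r in red.values()))
    return width


# A four-vertex path 0-1-2-3, contracted from one end.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(sequence_width(path, [(2, 3), (1, 2), (0, 1)]))  # prints 1
```

The value returned is the width of one particular sequence, which is only an upper bound on the twin-width; the twin-width itself is the minimum over all contraction sequences.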

A dense graph may still have bounded twin-width; for instance, the cographs include all complete graphs. A variation of twin-width, sparse twin-width, applies to families of graphs rather than to individual graphs. For a family of graphs that is closed under taking induced subgraphs and has bounded twin-width, the following properties are equivalent, and a family satisfying them is said to have bounded sparse twin-width:
 * The graphs in the family are sparse, meaning that they have a number of edges bounded by a linear function of their number of vertices.
 * The graphs in the family exclude some fixed complete bipartite graph as a subgraph.
 * The family of all subgraphs of graphs in the given family has bounded twin-width.
 * The family has bounded expansion, meaning that all its shallow minors are sparse.

The concept of twin-width can be generalized from graphs to various totally ordered structures (including graphs equipped with a total ordering on their vertices), and is in many ways simpler for ordered structures than for unordered graphs. It is also possible to formulate equivalent definitions for other notions of graph width using contraction sequences with different requirements than having bounded degree.

Graphs of bounded twin-width
Cographs have twin-width zero. In the reduction process for cographs, there will be no red edges: when two vertices are merged, their neighborhoods are equal, so there are no edges coming from only one of the two neighborhoods to be colored red. In any other graph, any contraction sequence will produce some red edges, and the twin-width will be greater than zero.

The path graphs with at most three vertices are cographs, but every larger path graph has twin-width one. For a contraction sequence that repeatedly merges the last two vertices of the path, only the edge incident to the single merged vertex will be red, so this is a 1-sequence. Trees have twin-width at most two, and for some trees this is tight. A 2-sequence for any tree may be found by choosing a root, and then repeatedly merging two leaves that have the same parent or, if this is not possible, merging the deepest leaf into its parent. The only red edges connect leaves to their parents, and when there are two at the same parent they can be merged, keeping the red degree at most two.

More generally, the following classes of graphs have bounded twin-width, and a contraction sequence of bounded width can be found for them in polynomial time:
 * Every graph of bounded clique-width, or of bounded rank-width, also has bounded twin-width. The twin-width is at most exponential in the clique-width, and at most doubly exponential in the rank-width. These graphs include, for instance, the distance-hereditary graphs, the $$k$$-leaf powers for bounded values of $$k$$, and the graphs of bounded treewidth.
 * Indifference graphs (equivalently, unit interval graphs or proper interval graphs) have twin-width at most two.
 * Unit disk graphs defined from sets of unit disks that cover each point of the plane a bounded number of times have bounded twin-width. The same is true for unit ball graphs in higher dimensions.
 * The permutation graphs coming from permutations with a forbidden permutation pattern have bounded twin-width. This allows twin-width to be applied to algorithmic problems on permutations with forbidden patterns.
 * Every family of graphs defined by forbidden minors has bounded twin-width. For instance, by Wagner's theorem, the forbidden minors for planar graphs are the two graphs $$K_5$$ and $$K_{3,3}$$, so the planar graphs have bounded twin-width.
 * Every graph of bounded stack number or bounded queue number also has bounded twin-width. There exist families of graphs of bounded sparse twin-width that do not have bounded stack number, but the corresponding question for queue number remains open.
 * The strong product of any two graphs of bounded twin-width, one of which has bounded degree, again has bounded twin-width. This can be used to prove the bounded twin-width of classes of graphs that have decompositions into strong products of paths and bounded-treewidth graphs, such as the $$k$$-planar graphs. For the lexicographic product of graphs, the twin-width is exactly the maximum of the widths of the two factor graphs. Twin-width also behaves well under several other standard graph products, but not the modular product of graphs.

In every hereditary family of graphs of bounded twin-width, it is possible to find a family of total orders for the vertices of its graphs so that the inherited ordering on an induced subgraph is also an ordering in the family, and so that the family is small with respect to these orders. This means that, for a total order on $$n$$ vertices, the number of graphs in the family consistent with that order is at most singly exponential in $$n$$. Conversely, every hereditary family of ordered graphs that is small in this sense has bounded twin-width. It was originally conjectured that every hereditary family of labeled graphs that is small, in the sense that the number of graphs is at most a singly exponential factor times $$n!$$, has bounded twin-width. However, this conjecture was disproved using a family of induced subgraphs of an infinite Cayley graph that are small as labeled graphs but do not have bounded twin-width.

By a counting argument, each of the following families of graphs contains graphs of unbounded twin-width, because there are more graphs of the given type than there can be graphs of bounded twin-width:
 * Graphs of bounded degree.
 * Interval graphs.
 * Unit disk graphs.

Properties
If a graph has bounded twin-width, then it is possible to find a versatile tree of contractions. This is a large family of contraction sequences, all of some (larger) bounded width, so that at each step in each sequence there are linearly many disjoint pairs of vertices each of which could be contracted at the next step in the sequence. It follows from this that the number of graphs of bounded twin-width on any set of $$n$$ given vertices is larger than $$n!$$ by only a singly exponential factor, that the graphs of bounded twin-width have an adjacency labelling scheme with only a logarithmic number of bits per vertex, and that they have universal graphs of polynomial size in which each $$n$$-vertex graph of bounded twin-width can be found as an induced subgraph.

Algorithms
The graphs of twin-width at most one can be recognized in polynomial time. However, it is NP-complete to determine whether a given graph has twin-width at most four, and NP-hard to approximate the twin-width with an approximation ratio better than 5/4. Under the exponential time hypothesis, computing the twin-width requires time at least exponential in $$n/\log n$$, on $$n$$-vertex graphs. In practice, it is possible to compute the twin-width of graphs of moderate size using SAT solvers. For most of the known families of graphs of bounded twin-width, it is possible to construct a contraction sequence of bounded width in polynomial time.
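For very small graphs, the twin-width can also be found by simply trying every contraction sequence. The Python sketch below does this by exhaustive search with pruning. It is not the SAT-based approach mentioned above, and its running time grows far too quickly for anything beyond a handful of vertices. The function names `merge`, `max_red_degree`, and `twin_width`, and the representation of edges as two-element frozensets, are assumptions made for this illustration.

```python
from itertools import combinations


def merge(black, red, u, v):
    """Merge v into u; return the new (black, red) sets of edges."""
    def neighbors(edge_set, x):
        return {next(iter(e - {x})) for e in edge_set if x in e}

    black_u, black_v = neighbors(black, u), neighbors(black, v)
    all_nb = (black_u | black_v | neighbors(red, u) | neighbors(red, v)) - {u, v}
    keep_black = black_u & black_v  # black at both merged vertices
    new_black = {e for e in black if u not in e and v not in e}
    new_red = {e for e in red if u not in e and v not in e}
    for w in all_nb:
        (new_black if w in keep_black else new_red).add(frozenset((u, w)))
    return frozenset(new_black), frozenset(new_red)


def max_red_degree(red):
    """Largest number of red edges at any single vertex."""
    degree = {}
    for edge in red:
        for x in edge:
            degree[x] = degree.get(x, 0) + 1
    return max(degree.values(), default=0)


def twin_width(vertices, edges):
    """Exact twin-width by exhaustive search; only practical for tiny graphs."""
    verts = frozenset(vertices)
    best = max(len(verts) - 1, 0)  # the red degree can never exceed this

    def search(verts, black, red, width):
        nonlocal best
        if width >= best:
            return  # cannot improve on the best sequence found so far
        if len(verts) == 1:
            best = width
            return
        for u, v in combinations(verts, 2):
            new_black, new_red = merge(black, red, u, v)
            search(verts - {v}, new_black, new_red,
                   max(width, max_red_degree(new_red)))

    search(verts, frozenset(frozenset(e) for e in edges), frozenset(), 0)
    return best


print(twin_width(range(4), [(0, 1), (1, 2), (2, 3)]))          # prints 1
print(twin_width(range(5), [(i, (i + 1) % 5) for i in range(5)]))  # prints 2
```

The two printed values are for a four-vertex path and a five-vertex cycle. In the cycle, every possible first contraction already creates a vertex with two red edges, which is why the search cannot do better than two.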

Once a contraction sequence has been given or constructed, many different algorithmic problems can be solved using it, in many cases more efficiently than is possible for graphs that do not have bounded twin-width. As detailed below, these include exact parameterized algorithms and approximation algorithms for NP-hard problems, as well as some problems that have classical polynomial time algorithms but can nevertheless be sped up using the assumption of bounded twin-width.

Parameterized algorithms
An algorithmic problem on graphs having an associated parameter is called fixed-parameter tractable if it has an algorithm that, on graphs with $$n$$ vertices and parameter value $$k$$, runs in time $$O(n^c\, f(k))$$ for some constant $$c$$ and computable function $$f$$. For instance, a running time of $$O(n2^k)$$ would be fixed-parameter tractable in this sense. This style of analysis is generally applied to problems that do not have a known polynomial-time algorithm, because otherwise fixed-parameter tractability would be trivial. Many such problems have been shown to be fixed-parameter tractable with twin-width as a parameter, when a contraction sequence of bounded width is given as part of the input. This applies, in particular, to the graph families of bounded twin-width listed above, for which a contraction sequence can be constructed efficiently. However, it is not known how to find a good contraction sequence for an arbitrary graph of low twin-width, when no other structure in the graph is known.

The fixed-parameter tractable problems for graphs of bounded twin-width with given contraction sequences include:
 * Testing whether the given graph models any given property in the first-order logic of graphs. Here, both the twin-width and the description length of the property are parameters of the analysis. Problems of this type include subgraph isomorphism for subgraphs of bounded size, and the vertex cover and dominating set problems for covers or dominating sets of bounded size. The dependence of these general methods on the length of the logical formula describing the property is tetrational, but for independent set, dominating set, and related problems it can be reduced to exponential in the size of the independent or dominating set, and for subgraph isomorphism it can be reduced to factorial in the number of vertices of the subgraph. For instance, the time to find a $$k$$-vertex independent set, for an $$n$$-vertex graph with a given $$d$$-sequence, is $$O(k^2d^{2k}n)$$, by a dynamic programming algorithm that considers small connected subgraphs of the red graphs in the forward direction of the contraction sequence. These time bounds are optimal, up to logarithmic factors in the exponent, under the exponential time hypothesis. For an extension of the first-order logic of graphs to graphs with totally ordered vertices, and logical predicates that can test this ordering, model checking is still fixed-parameter tractable for hereditary graph families of bounded twin-width, but not (under standard complexity-theoretic assumptions) for hereditary families of unbounded twin-width.
 * Coloring graphs of bounded twin-width, using a number of colors that is bounded by a function of their twin-width and of the size of their largest clique. For instance, triangle-free graphs of twin-width $$d$$ can be $$(d+2)$$-colored by a greedy coloring algorithm that colors vertices in the reverse of the order they were contracted away. This result shows that the graphs of bounded twin-width are χ-bounded. For graph families of bounded sparse twin-width, the generalized coloring numbers are bounded. Here, the generalized coloring number $$\operatorname{col}_r(G)$$ is at most $$k$$ if the vertices can be linearly ordered in such a way that each vertex can reach at most $$k-1$$ earlier vertices in the ordering, through paths of length at most $$r$$ whose interior vertices all come later in the ordering than the starting vertex.
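The definition of the generalized coloring number in the last item can be made concrete with a short Python sketch. It evaluates one given vertex ordering, so the value it returns is only an upper bound on $$\operatorname{col}_r(G)$$, which is the minimum over all orderings. The sketch assumes the usual reading of the definition: paths have length at most $$r$$, and their interior vertices must come later in the ordering than the starting vertex. The function name `coloring_number_of_order` and the adjacency-dictionary input are hypothetical choices for this example.

```python
from collections import deque


def coloring_number_of_order(adj, order, r):
    """Return 1 + the largest number of earlier vertices any vertex can reach."""
    pos = {v: i for i, v in enumerate(order)}
    worst = 0
    for v in order:
        reached = set()          # earlier vertices reachable from v
        seen = {v}
        queue = deque([(v, 0)])  # breadth-first search, tracking path length
        while queue:
            x, dist = queue.popleft()
            if dist == r:
                continue         # paths may not be longer than r
            for y in adj[x]:
                if y in seen:
                    continue
                seen.add(y)
                if pos[y] < pos[v]:
                    reached.add(y)  # an earlier vertex ends the path here
                else:
                    queue.append((y, dist + 1))  # later vertex: may be interior
        worst = max(worst, len(reached))
    return worst + 1
```

For a star with center `c` and three leaves, placing `c` first in the ordering gives the value 2 for any $$r \ge 1$$, because each leaf can reach only `c`. Placing `c` last gives 4, because `c` reaches all three leaves directly.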

Speedups of classical algorithms
In graphs of bounded twin-width, it is possible to perform a breadth-first search, on a graph with $$n$$ vertices, in time $$O(n\log n)$$, even when the graph is dense and has more edges than this time bound.

Approximation algorithms
Twin-width has also been applied in approximation algorithms. In particular, in the graphs of bounded twin-width, it is possible to find an approximation to the minimum dominating set with bounded approximation ratio. This is in contrast to more general graphs, for which it is NP-hard to obtain an approximation ratio that is better than logarithmic.

The maximum independent set and graph coloring problems can be approximated to within an approximation ratio of $$n^{\varepsilon}$$, for every $$\varepsilon>0$$, in polynomial time on graphs of bounded twin-width. In contrast, without the assumption of bounded twin-width, it is NP-hard to achieve any approximation ratio of this form with $$\varepsilon<1$$.