Pathwidth

In graph theory, a path decomposition of a graph $G$ is, informally, a representation of $G$ as a "thickened" path graph, and the pathwidth of $G$ is a number that measures how much the path was thickened to form $G$. More formally, a path-decomposition is a sequence of subsets of vertices of $G$ such that the endpoints of each edge appear in one of the subsets and such that each vertex appears in a contiguous subsequence of the subsets, and the pathwidth is one less than the size of the largest set in such a decomposition. Pathwidth is also known as interval thickness (one less than the maximum clique size in an interval supergraph of $G$), vertex separation number, or node searching number.

Pathwidth and path-decompositions are closely analogous to treewidth and tree decompositions. They play a key role in the theory of graph minors: the families of graphs that are closed under graph minors and do not include all forests may be characterized as having bounded pathwidth, and the "vortices" appearing in the general structure theory for minor-closed graph families have bounded pathwidth. Pathwidth, and graphs of bounded pathwidth, also have applications in VLSI design, graph drawing, and computational linguistics.

It is NP-hard to find the pathwidth of arbitrary graphs, or even to approximate it accurately. However, the problem is fixed-parameter tractable: testing whether a graph has pathwidth $k$ can be solved in an amount of time that depends linearly on the size of the graph but superexponentially on $k$. Additionally, for several special classes of graphs, such as trees, the pathwidth may be computed in polynomial time without dependence on $k$. Many problems in graph algorithms may be solved efficiently on graphs of bounded pathwidth, by using dynamic programming on a path-decomposition of the graph. Path decomposition may also be used to measure the space complexity of dynamic programming algorithms on graphs of bounded treewidth.

Definition


In the first of their famous series of papers on graph minors, Robertson and Seymour define a path-decomposition of a graph $G$ to be a sequence of subsets $X_i$ of vertices of $G$, with two properties:
 * 1) For each edge of $G$, there exists an $i$ such that both endpoints of the edge belong to subset $X_i$, and
 * 2) For every three indices $i \le j \le k$, $$X_i \cap X_k \subseteq X_j.$$
The second of these two properties is equivalent to requiring that the subsets containing any particular vertex form a contiguous subsequence of the whole sequence. In the language of the later papers in Robertson and Seymour's graph minor series, a path-decomposition is a tree decomposition $(X,T)$ in which the underlying tree $T$ of the decomposition is a path graph.

The width of a path-decomposition is defined in the same way as for tree-decompositions, as $\max_i |X_i| - 1$, and the pathwidth of $G$ is the minimum width of any path-decomposition of $G$. The subtraction of one from the size of $X_i$ in this definition makes little difference in most applications of pathwidth, but is used to make the pathwidth of a path graph be equal to one.
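The two defining properties and the width can be checked mechanically. The following is a minimal sketch, not a reference implementation: the function name `decomposition_width` and the representation of edges as pairs and bags as sets are assumptions made for illustration. It tests only the two properties stated above, so it does not check that isolated vertices appear in some bag.

```python
def decomposition_width(edges, bags):
    """Return the width of `bags` if it satisfies both properties, else raise ValueError."""
    bags = [set(b) for b in bags]
    # Property 1: every edge has both endpoints in some common bag.
    for u, v in edges:
        if not any(u in b and v in b for b in bags):
            raise ValueError(f"edge ({u}, {v}) is not covered by any bag")
    # Property 2 (equivalent form): the bags containing a vertex are contiguous.
    for v in set().union(*bags):
        idx = [i for i, b in enumerate(bags) if v in b]
        if idx[-1] - idx[0] + 1 != len(idx):
            raise ValueError(f"bags containing {v} are not contiguous")
    # Width: one less than the size of the largest bag.
    return max(len(b) for b in bags) - 1


# A path a-b-c-d decomposed into its edges has width 1.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(decomposition_width(path_edges, [{"a", "b"}, {"b", "c"}, {"c", "d"}]))  # 1
```

Reordering the bags as `{a,b}, {c,d}, {b,c}` would make the bags containing `b` non-contiguous, so the same function raises `ValueError` for that sequence.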

Alternative characterizations
Pathwidth can be characterized in many equivalent ways.

Gluing sequences
A path decomposition can be described as a sequence of graphs $G_i$ that are glued together by identifying pairs of vertices from consecutive graphs in the sequence, such that the result of performing all of these gluings is $G$. The graphs $G_i$ may be taken as the induced subgraphs of the sets $X_i$ in the first definition of path decompositions, with two vertices in successive induced subgraphs being glued together when they are induced by the same vertex in $G$, and in the other direction one may recover the sets $X_i$ as the vertex sets of the graphs $G_i$. The width of the path decomposition is then one less than the maximum number of vertices in one of the graphs $G_i$.

Interval thickness
The pathwidth of any graph $G$ is equal to one less than the smallest clique number of an interval graph that contains $G$ as a subgraph. That is, for every path decomposition of $G$ one can find an interval supergraph of $G$, and for every interval supergraph of $G$ one can find a path decomposition of $G$, such that the width of the decomposition is one less than the clique number of the interval graph.

In one direction, suppose a path decomposition of $G$ is given. Then one may represent the nodes of the decomposition as points on a line (in path order) and represent each vertex $v$ as a closed interval having these points as endpoints. In this way, the path decomposition nodes containing $v$ correspond to the representative points in the interval for $v$. The intersection graph of the intervals formed from the vertices of $G$ is an interval graph that contains $G$ as a subgraph. Its maximal cliques are given by the sets of intervals containing the representative points, and its maximum clique size is one plus the width of the given path decomposition, which for an optimal decomposition is one plus the pathwidth of $G$.

In the other direction, if $G$ is a subgraph of an interval graph with clique number $p + 1$, then $G$ has a path decomposition of width $p$ whose nodes are given by the maximal cliques of the interval graph. For instance, the interval graph shown with its interval representation in the figure has a path decomposition with five nodes, corresponding to its five maximal cliques $ABC$, $ACD$, $CDE$, $CDF$, and $FG$; the maximum clique size is three and the width of this path decomposition is two.
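The correspondence between decompositions and interval representations can be illustrated on the five cliques named above. This is a sketch under stated assumptions: the helper names `intervals_from_decomposition` and `max_overlap` are hypothetical, bag positions are used as the points on the line, and the input is assumed to be a valid path decomposition.

```python
def intervals_from_decomposition(bags):
    """Map each vertex to (first, last), the indices of the first and last bag containing it."""
    intervals = {}
    for i, bag in enumerate(bags):
        for v in bag:
            first, _ = intervals.get(v, (i, i))
            intervals[v] = (first, i)
    return intervals


def max_overlap(intervals):
    """Largest number of closed intervals that share a common point."""
    # For closed intervals the maximum is always attained at some endpoint.
    points = {p for lo, hi in intervals.values() for p in (lo, hi)}
    return max(sum(lo <= p <= hi for lo, hi in intervals.values()) for p in points)


bags = ["ABC", "ACD", "CDE", "CDF", "FG"]
intervals = intervals_from_decomposition(bags)
print(intervals["C"])              # (0, 3)
print(max_overlap(intervals) - 1)  # 2, the width of the decomposition
```

The maximum overlap of three matches the maximum clique size stated in the text, and subtracting one recovers the width of two.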

This equivalence between pathwidth and interval thickness is closely analogous to the equivalence between treewidth and the minimum clique number (minus one) of a chordal graph of which the given graph is a subgraph. Interval graphs are a special case of chordal graphs, and chordal graphs can be represented as intersection graphs of subtrees of a common tree generalizing the way that interval graphs are intersection graphs of subpaths of a path.

Vertex separation number
The vertex separation number of $G$ with respect to a linear ordering of the vertices of $G$ is the smallest number $s$ such that, for each vertex $v$, at most $s$ vertices are earlier than $v$ in the ordering but have $v$ or a later vertex as a neighbor. The vertex separation number of $G$ is the smallest vertex separation number of $G$ with respect to any linear ordering of $G$. The vertex separation number is equal to the pathwidth of $G$. This follows from the earlier equivalence with interval graph clique numbers: if $G$ is a subgraph of an interval graph $I$, represented (as in the figure) in such a way that all interval endpoints are distinct, then the ordering of the left endpoints of the intervals of $I$ has vertex separation number one less than the clique number of $I$. And in the other direction, from a linear ordering of $G$ one may derive an interval representation in which the left endpoint of the interval for a vertex $v$ is its position in the ordering and the right endpoint is the position of the neighbor of $v$ that comes last in the ordering.
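The definition translates directly into code. The sketch below is illustrative only: the names `vertex_separation` and `vertex_separation_number` and the adjacency-dictionary input are assumptions, and the second function tries every ordering, so it is usable only on very small graphs.

```python
from itertools import permutations


def vertex_separation(adj, order):
    """Vertex separation number of one linear ordering (a list or tuple of vertices)."""
    pos = {v: i for i, v in enumerate(order)}
    worst = 0
    for i in range(len(order)):
        # Count earlier vertices that have order[i] or a later vertex as a neighbor.
        count = sum(1 for u in order[:i] if any(pos[w] >= i for w in adj[u]))
        worst = max(worst, count)
    return worst


def vertex_separation_number(adj):
    """Minimum over all orderings; brute force, factorial time."""
    return min(vertex_separation(adj, p) for p in permutations(adj))


cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(vertex_separation_number(cycle4))  # 2
print(vertex_separation_number(path4))   # 1
```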

Node searching number
The node searching game on a graph is a form of pursuit–evasion in which a set of searchers collaborate to track down a fugitive hiding in a graph. The searchers are placed on vertices of the graph while the fugitive may be in any edge of the graph, and the fugitive's location and moves are hidden from the searchers. In each turn, some or all of the searchers may move (arbitrarily, not necessarily along edges) from one vertex to another, and then the fugitive may move along any path in the graph that does not pass through a searcher-occupied vertex. The fugitive is caught when both endpoints of his edge are occupied by searchers. The node searching number of a graph is the minimum number of searchers needed to ensure that the fugitive can be guaranteed to be caught, no matter how he moves. The node searching number of a graph equals its interval thickness. The optimal strategy for the searchers is to move the searchers so that in successive turns they form the separating sets of a linear ordering with minimal vertex separation number.

Bounds


Every $n$-vertex graph with pathwidth $k$ has at most $k(n − k + (k − 1)/2)$ edges, and the maximal pathwidth-$k$ graphs (graphs to which no more edges can be added without increasing the pathwidth) have exactly this many edges. A maximal pathwidth-$k$ graph must be either a $k$-path or a $k$-caterpillar, two special kinds of $k$-tree. A $k$-tree is a chordal graph with exactly $n − k$ maximal cliques, each containing $k + 1$ vertices; in a $k$-tree that is not itself a $(k + 1)$-clique, each maximal clique either separates the graph into two or more components, or it contains a single leaf vertex, a vertex that belongs to only a single maximal clique. A $k$-path is a $k$-tree with at most two $k$-leaves, and a $k$-caterpillar is a $k$-tree that can be partitioned into a $k$-path and a set of $k$-leaves each adjacent to a separator $k$-clique of the $k$-path. In particular the maximal graphs of pathwidth one are exactly the caterpillar trees.
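As a quick sanity check of the edge bound, the formula can be evaluated for a few familiar cases. The function name `max_edges` is an assumption for illustration, and the formula is only meaningful when $n \ge k + 1$.

```python
def max_edges(n, k):
    """Evaluate k(n - k + (k - 1)/2) exactly, using integer arithmetic."""
    # k * (k - 1) is always even, so the integer division loses nothing.
    return k * (n - k) + k * (k - 1) // 2


print(max_edges(7, 1))  # 6: a caterpillar tree on 7 vertices has n - 1 edges
print(max_edges(4, 3))  # 6: the complete graph on k + 1 = 4 vertices
print(max_edges(5, 2))  # 7: a 2-tree on 5 vertices has 2n - 3 edges
```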

Since path-decompositions are a special case of tree-decompositions, the pathwidth of any graph is greater than or equal to its treewidth. The pathwidth is also less than or equal to the cutwidth, the minimum number of edges that cross any cut between lower-numbered and higher-numbered vertices in an optimal linear arrangement of the vertices of a graph; this follows because the vertex separation number, the number of lower-numbered vertices with higher-numbered neighbors, can at most equal the number of cut edges. For similar reasons, the cutwidth is at most the pathwidth times the maximum degree of the vertices in a given graph.
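The two inequalities relating vertex separation and cut size can be observed on a single ordering. The sketch below uses a hypothetical helper, `cut_and_separation`, and measures both quantities for one given ordering rather than for an optimal arrangement.

```python
def cut_and_separation(adj, order):
    """Return (largest edge cut, vertex separation) of one linear ordering."""
    pos = {v: i for i, v in enumerate(order)}
    cut = sep = 0
    for i in range(1, len(order)):
        # Edges running from a position before i to position i or later.
        crossing = [(u, w) for u in order[:i] for w in adj[u] if pos[w] >= i]
        cut = max(cut, len(crossing))
        # Distinct earlier endpoints of those edges.
        sep = max(sep, len({u for u, _ in crossing}))
    return cut, sep


# A star with center "c" and four leaves, with the center placed first.
star = {"c": ["l1", "l2", "l3", "l4"],
        "l1": ["c"], "l2": ["c"], "l3": ["c"], "l4": ["c"]}
cut, sep = cut_and_separation(star, ["c", "l1", "l2", "l3", "l4"])
print(cut, sep)  # 4 1
```

Here the separation is 1 and the maximum degree is 4, so the cut of 4 lies between the separation and the separation times the maximum degree, as the argument in the text predicts.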

Any $n$-vertex forest has pathwidth $O(\log n)$. For, in a forest, one can always find a constant number of vertices the removal of which leaves a forest that can be partitioned into two smaller subforests with at most $2n/3$ vertices each. A linear arrangement formed by recursively partitioning each of these two subforests, placing the separating vertices between them, has logarithmic vertex separation number. The same technique, applied to a tree-decomposition of a graph, shows that, if the treewidth of an $n$-vertex graph $G$ is $t$, then the pathwidth of $G$ is $O(t \log n)$. Since outerplanar graphs, series–parallel graphs, and Halin graphs all have bounded treewidth, they all also have at most logarithmic pathwidth.

As well as its relations to treewidth, pathwidth is also related to clique-width and cutwidth, via line graphs; the line graph $L(G)$ of a graph $G$ has a vertex for each edge of $G$ and two vertices in $L(G)$ are adjacent when the corresponding two edges of $G$ share an endpoint. Any family of graphs has bounded pathwidth if and only if its line graphs have bounded linear clique-width, where linear clique-width replaces the disjoint union operation from clique-width with the operation of adjoining a single new vertex. If a connected graph with three or more vertices has maximum degree three, then its cutwidth equals the vertex separation number of its line graph.

In any planar graph, the pathwidth is at most proportional to the square root of the number of vertices. One way to find a path-decomposition with this width is (similarly to the logarithmic-width path-decomposition of forests described above) to use the planar separator theorem to find a set of $O(\sqrt{n})$ vertices the removal of which separates the graph into two subgraphs of at most $2n/3$ vertices each, and concatenate recursively-constructed path decompositions for each of these two subgraphs. The same technique applies to any class of graphs for which a similar separator theorem holds. Since, like planar graphs, the graphs in any fixed minor-closed graph family have separators of size $O(\sqrt{n})$, it follows that the pathwidth of the graphs in any fixed minor-closed family is again $O(\sqrt{n})$. For some classes of planar graphs, the pathwidth of the graph and the pathwidth of its dual graph must be within a constant factor of each other: bounds of this form are known for biconnected outerplanar graphs and for polyhedral graphs. For 2-connected planar graphs, the pathwidth of the dual graph is less than the pathwidth of the line graph. It remains open whether the pathwidth of a planar graph and its dual are always within a constant factor of each other in the remaining cases.

In some classes of graphs, it has been proven that the pathwidth and treewidth are always equal to each other: this is true for cographs, permutation graphs, the complements of comparability graphs, and the comparability graphs of interval orders.

In any cubic graph, or more generally any graph with maximum vertex degree three, the pathwidth is at most $n/6 + o(n)$, where $n$ is the number of vertices in the graph. There exist cubic graphs with pathwidth $0.082n$, but it is not known how to reduce the gap between this lower bound and the $n/6$ upper bound.

Computing path-decompositions
It is NP-complete to determine whether the pathwidth of a given graph is at most $k$, when $k$ is a variable given as part of the input. The best known worst-case time bounds for computing the pathwidth of arbitrary $n$-vertex graphs are of the form $O(2^n n^c)$ for some constant $c$. Nevertheless, several algorithms are known to compute path-decompositions more efficiently when the pathwidth is small, when the class of input graphs is limited, or approximately.
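One simple way to reach a running time of this exponential form is a dynamic program over vertex subsets, using the equivalence with the vertex separation number. The sketch below is an illustration under stated assumptions, not necessarily the algorithm behind the cited bound: the function name `pathwidth_exact` is hypothetical, vertices are assumed to be numbered `0..n-1`, and the table has $2^n$ entries, so it is practical only for small $n$.

```python
def pathwidth_exact(n, edges):
    """Pathwidth (computed as vertex separation number) of a graph on vertices 0..n-1."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    full = (1 << n) - 1
    # best[S]: over all orderings that start with the vertices of S, the smallest
    # possible value of the largest boundary among the prefixes contained in S.
    best = [0] * (1 << n)
    for S in range(1, 1 << n):
        members = [v for v in range(n) if (S >> v) & 1]
        # Boundary of S: vertices of S that still have a neighbor outside S.
        boundary = sum(1 for v in members if adj[v] & ~S & full)
        best[S] = max(boundary, min(best[S ^ (1 << v)] for v in members))
    return best[full]


print(pathwidth_exact(4, [(0, 1), (1, 2), (2, 3)]))                          # 1 (path)
print(pathwidth_exact(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))                  # 2 (cycle)
print(pathwidth_exact(7, [(0, 1), (1, 2), (0, 3), (3, 4), (0, 5), (5, 6)]))  # 2 (three-legged tree)
```

Each subset is processed once with $O(n)$ work, giving $O(2^n n)$ time and $O(2^n)$ space.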

Fixed-parameter tractability
Pathwidth is fixed-parameter tractable: for any constant $k$, it is possible to test whether the pathwidth is at most $k$, and if so to find a path-decomposition of width $k$, in linear time. In general, these algorithms operate in two phases. In the first phase, the assumption that the graph has pathwidth $k$ is used to find a path-decomposition or tree-decomposition that is not optimal, but whose width can be bounded as a function of $k$. In the second phase, a dynamic programming algorithm is applied to this decomposition in order to find the optimal decomposition. However, the time bounds for known algorithms of this type are exponential in $k^2$, impractical except for the smallest values of $k$. For the case $k = 2$ an explicit linear-time algorithm is known, based on a structural decomposition of pathwidth-2 graphs.

Special classes of graphs
The complexity of computing the pathwidth has been surveyed for various special classes of graphs. Determining whether the pathwidth of a graph $G$ is at most $k$ remains NP-complete when $G$ is restricted to bounded-degree graphs, planar graphs, planar graphs of bounded degree, chordal graphs, chordal dominoes, the complements of comparability graphs, and bipartite distance-hereditary graphs. It follows immediately that it is also NP-complete for the graph families that contain the bipartite distance-hereditary graphs, including the bipartite graphs, chordal bipartite graphs, distance-hereditary graphs, and circle graphs.

However, the pathwidth may be computed in linear time for trees and forests. It may also be computed in polynomial time for graphs of bounded treewidth including series–parallel graphs, outerplanar graphs, and Halin graphs, as well as for split graphs, for the complements of chordal graphs, for permutation graphs, for cographs, for circular-arc graphs, for the comparability graphs of interval orders, and of course for interval graphs themselves, since in that case the pathwidth is just one less than the maximum number of intervals covering any point in an interval representation of the graph.
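For interval graphs, the final remark gives a very short algorithm once an interval representation is available. The following sketch assumes closed intervals given as `(left, right)` pairs; the function name `interval_graph_pathwidth` is hypothetical.

```python
def interval_graph_pathwidth(intervals):
    """One less than the maximum number of closed intervals covering any point."""
    events = []
    for lo, hi in intervals:
        events.append((lo, 0))  # kind 0 (open) sorts before kind 1 (close) at equal coordinates
        events.append((hi, 1))
    depth = best = 0
    for _, kind in sorted(events):
        if kind == 0:
            depth += 1
            best = max(best, depth)
        else:
            depth -= 1
    return best - 1


# Seven intervals whose maximum overlap is three.
print(interval_graph_pathwidth(
    [(0, 1), (0, 0), (0, 3), (1, 3), (2, 2), (3, 4), (4, 4)]))  # 2
```

Sorting dominates the running time, so the sweep takes $O(m \log m)$ time for $m$ intervals.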

Approximation algorithms
It is NP-hard to approximate the pathwidth of a graph to within an additive constant. The best known approximation ratio of a polynomial time approximation algorithm for pathwidth is $O((\log n)^{3/2})$. Earlier approximation algorithms for pathwidth, and approximations on restricted classes of graphs, have also been studied.

Graph minors
A minor of a graph $G$ is another graph formed from $G$ by contracting edges, removing edges, and removing vertices. Graph minors have a deep theory in which several important results involve pathwidth.

Excluding a forest
If a family $F$ of graphs is closed under taking minors (every minor of a member of $F$ is also in $F$), then by the Robertson–Seymour theorem $F$ can be characterized as the graphs that do not have any minor in $X$, where $X$ is a finite set of forbidden minors. For instance, Wagner's theorem states that the planar graphs are the graphs that have neither the complete graph $K_5$ nor the complete bipartite graph $K_{3,3}$ as minors. In many cases, the properties of $F$ and the properties of $X$ are closely related. The first result of this type relates bounded pathwidth with the existence of a forest in the family of forbidden minors. Specifically, define a family $F$ of graphs to have bounded pathwidth if there exists a constant $p$ such that every graph in $F$ has pathwidth at most $p$. Then, a minor-closed family $F$ has bounded pathwidth if and only if the set $X$ of forbidden minors for $F$ includes at least one forest.

In one direction, this result is straightforward to prove: if $X$ does not include at least one forest, then the $X$-minor-free graphs do not have bounded pathwidth. For, in this case, the $X$-minor-free graphs include all forests, and in particular they include the perfect binary trees. But a perfect binary tree with $2k + 1$ levels has pathwidth $k$, so in this case the $X$-minor-free graphs have unbounded pathwidth. In the other direction, if $X$ contains an $n$-vertex forest, then the $X$-minor-free graphs have pathwidth at most $n − 2$.

Obstructions to bounded pathwidth
The property of having pathwidth at most $p$ is, itself, closed under taking minors: if $G$ has a path-decomposition with width at most $p$, then the same path-decomposition remains valid if any edge is removed from $G$, and any vertex can be removed from $G$ and from its path-decomposition without increasing the width. Contraction of an edge, also, can be accomplished without increasing the width of the decomposition, by merging the sub-paths representing the two endpoints of the contracted edge. Therefore, the graphs of pathwidth at most $p$ can be characterized by a set $X_p$ of excluded minors.
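The contraction step can be made concrete: every bag that contains either endpoint of the contracted edge receives the merged vertex in place of both. The sketch below uses hypothetical helper names (`contract_edge`, `width`) and assumes the input is a valid path decomposition in which the two endpoints share at least one bag, as they must when they are adjacent.

```python
def contract_edge(bags, u, v):
    """Contract edge (u, v), keeping the name u for the merged vertex."""
    # The bags holding u and those holding v each form a contiguous run, and the
    # runs overlap because the edge is covered, so their union is contiguous too.
    return [(set(b) - {v}) | {u} if (u in b or v in b) else set(b) for b in bags]


def width(bags):
    """One less than the size of the largest bag."""
    return max(len(b) for b in bags) - 1


bags = [set("ABC"), set("ACD"), set("CDE"), set("CDF"), set("FG")]
merged = contract_edge(bags, "C", "D")
print([sorted(b) for b in merged])
print(width(bags), width(merged))  # 2 2
```

A bag that held both endpoints shrinks by one and every other bag keeps its size, so the width cannot increase.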

Although $X_p$ necessarily includes at least one forest, it is not true that all graphs in $X_p$ are forests: for instance, $X_1$ consists of two graphs, a seven-vertex tree and the triangle $K_3$. However, the set of trees in $X_p$ may be precisely characterized: these trees are exactly the trees that can be formed from three trees in $X_{p-1}$ by connecting a new root vertex by an edge to an arbitrarily chosen vertex in each of the three smaller trees. For instance, the seven-vertex tree in $X_1$ is formed in this way from the two-vertex tree (a single edge) in $X_0$. Based on this construction, the number of forbidden minors in $X_p$ can be shown to be at least $(p!)^2$. The complete set $X_2$ of forbidden minors for pathwidth-2 graphs has been computed; it contains 110 different graphs.

Structure theory
The graph structure theorem for minor-closed graph families states that, for any such family $F$, the graphs in $F$ can be decomposed into clique-sums of graphs that can be embedded onto surfaces of bounded genus, together with a bounded number of apexes and vortices for each component of the clique-sum. An apex is a vertex that may be adjacent to any other vertex in its component, while a vortex is a graph of bounded pathwidth that is glued into one of the faces of the bounded-genus embedding of a component. The cyclic ordering of the vertices around the face into which a vortex is embedded must be compatible with the path decomposition of the vortex, in the sense that breaking the cycle to form a linear ordering must lead to an ordering with bounded vertex separation number. This theory, in which pathwidth is intimately connected to arbitrary minor-closed graph families, has important algorithmic applications.

VLSI
In VLSI design, the vertex separation problem was originally studied as a way to partition circuits into smaller subsystems, with a small number of components on the boundary between the subsystems.

Interval thickness has been used to model the number of tracks needed in a one-dimensional layout of a VLSI circuit, formed by a set of modules that need to be interconnected by a system of nets. In this model, one forms a graph in which the vertices represent nets, and in which two vertices are connected by an edge if their nets both connect to the same module; that is, if the modules and nets are interpreted as forming the nodes and hyperedges of a hypergraph then the graph formed from them is its line graph. An interval representation of a supergraph of this line graph, together with a coloring of the supergraph, describes an arrangement of the nets along a system of horizontal tracks (one track per color) in such a way that the modules can be placed along the tracks in a linear order and connect to the appropriate nets. The fact that interval graphs are perfect graphs implies that the number of colors needed, in an optimal arrangement of this type, is the same as the clique number of the interval completion of the net graph.

Gate matrix layout is a specific style of CMOS VLSI layout for Boolean logic circuits. In gate matrix layouts, signals are propagated along "lines" (vertical line segments) while each gate of the circuit is formed by a sequence of device features that lie along a horizontal line segment. Thus, the horizontal line segment for each gate must cross the vertical segments for each of the lines that form inputs or outputs of the gate. As in the one-dimensional layouts described above, a layout of this type that minimizes the number of vertical tracks on which the lines are to be arranged can be found by computing the pathwidth of a graph that has the lines as its vertices and pairs of lines sharing a gate as its edges. The same algorithmic approach can also be used to model folding problems in programmable logic arrays.

Graph drawing
Pathwidth has several applications to graph drawing:
 * The minimal graphs that have a given crossing number have pathwidth that is bounded by a function of their crossing number.
 * The number of parallel lines on which the vertices of a tree can be drawn with no edge crossings (under various natural restrictions on the ways that adjacent vertices can be placed with respect to the sequence of lines) is proportional to the pathwidth of the tree.
 * A k-crossing h-layer drawing of a graph G is a placement of the vertices of G onto h distinct horizontal lines, with edges routed as monotonic polygonal paths between these lines, in such a way that there are at most k crossings. The graphs with such drawings have pathwidth that is bounded by a function of h and k. Therefore, when h and k are both constant, it is possible in linear time to determine whether a graph has a k-crossing h-layer drawing.
 * A graph with n vertices and pathwidth p can be embedded into a three-dimensional grid of size p × p × n in such a way that no two edges (represented as straight line segments between grid points) intersect each other. Thus, graphs of bounded pathwidth have embeddings of this type with linear volume.

Compiler design
In the compilation of high-level programming languages, pathwidth arises in the problem of reordering sequences of straight-line code (that is, code with no control flow branches or loops) in such a way that all the values computed in the code can be placed in machine registers instead of having to be spilled into main memory. In this application, one represents the code to be compiled as a directed acyclic graph in which the nodes represent the input values to the code and the values computed by the operations within the code. An edge from node x to node y in this DAG represents the fact that value x is one of the inputs to operation y. A topological ordering of the vertices of this DAG represents a valid reordering of the code, and the number of registers needed to evaluate the code in a given ordering is given by the vertex separation number of the ordering.

For any fixed number w of machine registers, it is possible to determine in linear time whether a piece of straight-line code can be reordered in such a way that it can be evaluated with at most w registers. For, if the vertex separation number of a topological ordering is at most w, the minimum vertex separation among all orderings can be no larger, so the undirected graph formed by ignoring the orientations of the DAG described above must have pathwidth at most w. It is possible to test whether this is the case, using the known fixed-parameter-tractable algorithms for pathwidth, and if so to find a path-decomposition for the undirected graph, in linear time given the assumption that w is a constant. Once a path decomposition has been found, a topological ordering of width w (if one exists) can be found using dynamic programming, again in linear time.

Linguistics
Kornai and Tuza describe an application of pathwidth in natural language processing. In this application, sentences are modeled as graphs, in which the vertices represent words and the edges represent relationships between words; for instance if an adjective modifies a noun in the sentence then the graph would have an edge between those two words. Due to the limited capacity of human short-term memory, Kornai and Tuza argue that this graph must have bounded pathwidth (more specifically, they argue, pathwidth at most six), for otherwise humans would not be able to parse speech correctly.

Exponential algorithms
Many problems in graph algorithms may be solved efficiently on graphs of low pathwidth, by using dynamic programming on a path-decomposition of the graph. For instance, if a linear ordering of the vertices of an $n$-vertex graph $G$ is given, with vertex separation number $w$, then it is possible to find the maximum independent set of $G$ in time $O(2^w n)$. On graphs of bounded pathwidth, this approach leads to fixed-parameter tractable algorithms, parametrized by the pathwidth. Such results are not frequently found in the literature because they are subsumed by similar algorithms parametrized by the treewidth; however, pathwidth arises even in treewidth-based dynamic programming algorithms in measuring the space complexity of these algorithms.
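The independent set computation can be sketched as a sweep along the ordering that remembers, for each way of choosing among the "frontier" vertices (processed vertices that still have an unprocessed neighbor), the best independent set found so far. This is an illustrative sketch, not a tuned implementation: the function name `max_independent_set` and the adjacency format (a dictionary mapping each vertex to a set of neighbors) are assumptions, and storing the sets themselves adds polynomial overhead beyond the stated bound.

```python
def max_independent_set(adj, order):
    """Maximum independent set via dynamic programming along a linear ordering."""
    pos = {v: i for i, v in enumerate(order)}
    # last[v]: position of the latest neighbor of v (its own position if none is later).
    last = {v: max([pos[u] for u in adj[v]] + [pos[v]]) for v in order}
    # states: chosen frontier vertices -> largest independent set consistent with that choice
    states = {frozenset(): frozenset()}
    for i, v in enumerate(order):
        new_states = {}
        for chosen, best in states.items():
            options = [(chosen, best)]              # skip v
            if not (adj[v] & chosen):               # take v if no chosen neighbor
                options.append((chosen | {v}, best | {v}))
            for c, b in options:
                # Forget vertices whose neighbors have all been processed.
                key = frozenset(u for u in c if last[u] > i)
                if key not in new_states or len(b) > len(new_states[key]):
                    new_states[key] = b
        states = new_states
    return max(states.values(), key=len)


cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(len(max_independent_set(cycle5, list(range(5)))))  # 2
```

Every earlier neighbor of the current vertex is still on the frontier, so checking the chosen frontier vertices is enough to keep the set independent. The number of states at any step is at most $2^{w+1}$ for an ordering with vertex separation number $w$.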

The same dynamic programming method also can be applied to graphs with unbounded pathwidth, leading to algorithms that solve unparametrized graph problems in exponential time. For instance, combining this dynamic programming approach with the fact that cubic graphs have pathwidth $n/6 + o(n)$ shows that, in a cubic graph, the maximum independent set can be constructed in time $O(2^{n/6 + o(n)})$, faster than previous known methods. A similar approach leads to improved exponential-time algorithms for the maximum cut and minimum dominating set problems in cubic graphs, and for several other NP-hard optimization problems.