Hypergraph

In mathematics, a hypergraph is a generalization of a graph in which an edge can join any number of vertices. In contrast, in an ordinary graph, an edge connects exactly two vertices.

Formally, a directed hypergraph is a pair $$(X,E)$$, where $$X$$ is a set of elements called nodes, vertices, points, or elements and $$E$$ is a set of pairs of subsets of $$X$$. Each of these pairs $$(D,C)\in E$$ is called an edge or hyperedge; the vertex subset $$D$$ is known as its tail or domain, and  $$C$$ as its  head or codomain.

The order of a hypergraph $$(X,E)$$ is the number of vertices in $$X$$. The size of the hypergraph is the number of edges in $$E$$. The order of an edge $$e=(D,C)$$ in a directed hypergraph is $$|e| = (|D|,|C|)$$: that is, the number of vertices in its tail followed by the number of vertices in its head.

The definition above generalizes from a directed graph to a directed hypergraph by defining the head or tail of each edge as a set of vertices ($$C\subseteq X$$ or $$D\subseteq X$$) rather than as a single vertex. A graph is then the special case where each of these sets contains only one element. Hence any standard graph theoretic concept that is independent of the edge orders $$|e|$$ will generalize to hypergraph theory.

Under one definition, an undirected hypergraph $$(X, E)$$ is a directed hypergraph which has a symmetric edge set: If $$(D,C) \in E$$ then $$(C,D) \in E$$. For notational simplicity one can remove the "duplicate" hyperedges since the modifier "undirected" is precisely informing us that they exist: If $$(D,C) \in E$$ then $$(C,D) \vec{\in} E$$ where $$\vec{\in}$$ means implicitly in.

While graph edges connect only 2 nodes, hyperedges connect an arbitrary number of nodes. However, it is often desirable to study hypergraphs where all hyperedges have the same cardinality; a k-uniform hypergraph is a hypergraph such that all its hyperedges have size k. (In other words, one such hypergraph is a collection of sets, each such set a hyperedge connecting k nodes.) So a 2-uniform hypergraph is a graph, a 3-uniform hypergraph is a collection of unordered triples, and so on. An undirected hypergraph is also called a set system or a family of sets drawn from the universal set.

Hypergraphs can be viewed as incidence structures. In particular, there is a bipartite "incidence graph" or "Levi graph" corresponding to every hypergraph, and conversely, every bipartite graph can be regarded as the incidence graph of a hypergraph when it is 2-colored and it is indicated which color class corresponds to hypergraph vertices and which to hypergraph edges.

Hypergraphs have many other names. In computational geometry, an undirected hypergraph may sometimes be called a range space and then the hyperedges are called ranges. In cooperative game theory, hypergraphs are called simple games (voting games); this notion is applied to solve problems in social choice theory. In some literature edges are referred to as hyperlinks or connectors.

The collection of hypergraphs is a category with hypergraph homomorphisms as morphisms.

Applications
Undirected hypergraphs are useful in modelling such things as satisfiability problems, databases, machine learning, and Steiner tree problems. They have been extensively used in machine learning tasks as the data model and classifier regularization (mathematics). The applications include recommender system (communities as hyperedges), image retrieval (correlations as hyperedges), and bioinformatics (biochemical interactions as hyperedges). Representative hypergraph learning techniques include hypergraph spectral clustering that extends the spectral graph theory with hypergraph Laplacian, and hypergraph semi-supervised learning that introduces extra hypergraph structural cost to restrict the learning results. For large scale hypergraphs, a distributed framework built using Apache Spark is also available.

Directed hypergraphs can be used to model things including telephony applications, detecting money laundering, operations research, and transportation planning. They can also be used to model Horn-satisfiability.

Generalizations of concepts from graphs
Many theorems and concepts involving graphs also hold for hypergraphs, in particular:


 * Matching in hypergraphs;
 * Vertex cover in hypergraphs (also known as: transversal);
 * Line graph of a hypergraph;
 * Hypergraph grammar - created by augmenting a class of hypergraphs with a set of replacement rules;
 * Ramsey's theorem;
 * Erdős–Ko–Rado theorem;
 * Kruskal–Katona theorem on uniform hypergraphs;
 * Hall-type theorems for hypergraphs.

In directed hypergraphs: transitive closure, and shortest path problems.

Hypergraph drawing
Although hypergraphs are more difficult to draw on paper than graphs, several researchers have studied methods for the visualization of hypergraphs.

In one possible visual representation for hypergraphs, similar to the standard graph drawing style in which curves in the plane are used to depict graph edges, a hypergraph's vertices are depicted as points, disks, or boxes, and its hyperedges are depicted as trees that have the vertices as their leaves. If the vertices are represented as points, the hyperedges may also be shown as smooth curves that connect sets of points, or as simple closed curves that enclose sets of points.

In another style of hypergraph visualization, the subdivision model of hypergraph drawing, the plane is subdivided into regions, each of which represents a single vertex of the hypergraph. The hyperedges of the hypergraph are represented by contiguous subsets of these regions, which may be indicated by coloring, by drawing outlines around them, or both. An order-n Venn diagram, for instance, may be viewed as a subdivision drawing of a hypergraph with n hyperedges (the curves defining the diagram) and 2n − 1 vertices (represented by the regions into which these curves subdivide the plane). In contrast with the polynomial-time recognition of planar graphs, it is NP-complete to determine whether a hypergraph has a planar subdivision drawing, but the existence of a drawing of this type may be tested efficiently when the adjacency pattern of the regions is constrained to be a path, cycle, or tree.

An alternative representation of the hypergraph called PAOH is shown in the figure on top of this article. Edges are vertical lines connecting vertices. Vertices are aligned on the left. The legend on the right shows the names of the edges. It has been designed for dynamic hypergraphs but can be used for simple hypergraphs as well.

Hypergraph coloring
Classic hypergraph coloring is assigning one of the colors from set $$\{1,2,3,...,\lambda\}$$ to every vertex of a hypergraph in such a way that each hyperedge contains at least two vertices of distinct colors. In other words, there must be no monochromatic hyperedge with cardinality at least 2. In this sense it is a direct generalization of graph coloring. Minimum number of used distinct colors over all colorings is called the chromatic number of a hypergraph.

Hypergraphs for which there exists a coloring using up to k colors are referred to as k-colorable. The 2-colorable hypergraphs are exactly the bipartite ones.

There are many generalizations of classic hypergraph coloring. One of them is the so-called mixed hypergraph coloring, when monochromatic edges are allowed. Some mixed hypergraphs are uncolorable for any number of colors. A general criterion for uncolorability is unknown. When a mixed hypergraph is colorable, then the minimum and maximum number of used colors are called the lower and upper chromatic numbers respectively.

Properties of hypergraphs
A hypergraph can have various properties, such as:
 * Empty - has no edges.
 * Non-simple (or multiple) - has loops (hyperedges with a single vertex) or repeated edges, which means there can be two or more edges containing the same set of vertices.
 * Simple - has no loops and no repeated edges.
 * $$d $$-regular - every vertex has degree $$d $$, i.e., contained in exactly $$d $$ hyperedges.
 * 2-colorable - its vertices can be partitioned into two classes U and V in such a way that each hyperedge with cardinality at least 2 contains at least one vertex from both classes. An alternative term is Property B.
 * Two stronger properties are bipartite and balanced.
 * $$k $$-uniform - each hyperedge contains precisely $$k$$ vertices.
 * $$k $$-partite - the vertices are partitioned into $$k$$ parts, and each hyperedge contains precisely one vertex of each type.
 * Every $$k $$-partite hypergraph (for $$k\geq 2$$) is both  $$k $$-uniform and bipartite (and 2-colorable).
 * Reduced: no hyperedge is a strict subset of another hyperedge; equivalently, every hyperedge is maximal for inclusion. The reduction of a hypergraph is the reduced hypergraph obtained by removing every hyperedge which is included in another hyperedge.
 * Downward-closed - every subset of an undirected hypergraph's edges is a hyperedge too. A downward-closed hypergraph is usually called an abstract simplicial complex. It is generally not reduced, unless all hyperedges have cardinality 1.
 * An abstract simplicial complex with the augmentation property is called a matroid.
 * Laminar: for any two hyperedges, either they are disjoint, or one is included in the other. In other words, the set of hyperedges forms a laminar set family.

Related hypergraphs
Because hypergraph links can have any cardinality, there are several notions of the concept of a subgraph, called subhypergraphs, partial hypergraphs and section hypergraphs.

Let $$H=(X,E)$$ be the hypergraph consisting of vertices


 * $$X = \lbrace x_i \mid i \in I_v \rbrace,$$

and having edge set


 * $$E = \lbrace e_i \mid i\in I_e, e_i \subseteq X, e_i \neq \emptyset \rbrace,$$

where $$I_v$$ and $$I_e$$ are the index sets of the vertices and edges respectively.

A subhypergraph is a hypergraph with some vertices removed. Formally, the subhypergraph $$H_A$$ induced by $$A \subseteq X $$ is defined as


 * $$H_A=\left(A, \lbrace e \cap A \mid e \in E,

e \cap A \neq \emptyset \rbrace \right).$$

An alternative term is the restriction of H to A.

An extension of a subhypergraph is a hypergraph where each hyperedge of $$H$$ which is partially contained in the subhypergraph $$H_A$$ is fully contained in the extension $$Ex(H_A)$$. Formally
 * $$Ex(H_A) = (A \cup A', E' )$$ with $$A' = \bigcup_{e \in E} e \setminus A$$ and $$E' = \lbrace e \in E \mid e \subseteq (A \cup A') \rbrace$$.

The partial hypergraph is a hypergraph with some edges removed. Given a subset $$J \subset I_e$$ of the edge index set, the partial hypergraph generated by $$J$$ is the hypergraph


 * $$\left(X, \lbrace e_i \mid i\in J \rbrace \right).$$

Given a subset $$A\subseteq X$$, the section hypergraph is the partial hypergraph


 * $$H \times A = \left(A, \lbrace e_i \mid

i\in I_e, e_i \subseteq A  \rbrace \right).$$ The dual $$H^*$$ of $$H$$ is a hypergraph whose vertices and edges are interchanged, so that the vertices are given by $$\lbrace e_i \rbrace$$ and whose edges are given by $$\lbrace X_m \rbrace$$ where


 * $$X_m = \lbrace e_i \mid x_m \in e_i \rbrace. $$

When a notion of equality is properly defined, as done below, the operation of taking the dual of a hypergraph is an involution, i.e.,


 * $$\left(H^*\right)^* = H.$$

A connected graph G with the same vertex set as a connected hypergraph H is a host graph for H if every hyperedge of H induces a connected subgraph in G. For a disconnected hypergraph H, G is a host graph if there is a bijection between the connected components of G and of H, such that each connected component G '  of G is a host of the corresponding H ' .

The 2-section (or clique graph, representing graph, primal graph, Gaifman graph) of a hypergraph is the graph with the same vertices of the hypergraph, and edges between all pairs of vertices contained in the same hyperedge.

Incidence matrix
Let $$V = \{v_1, v_2, ~\ldots, ~ v_n\}$$ and $$E = \{e_1, e_2, ~ \ldots ~ e_m\}$$. Every hypergraph has an $$n \times m$$ incidence matrix.

For an undirected hypergraph, $$I = (b_{ij})$$ where
 * $$b_{ij} = \left\{ \begin{matrix} 1 & \mathrm{if} ~ v_i \in e_j \\ 0 & \mathrm{otherwise}. \end{matrix} \right.$$

The transpose $$I^t$$ of the incidence matrix defines a hypergraph $$H^* = (V^*,\ E^*)$$ called the dual of $$H$$, where $$V^*$$ is an m-element set and $$E^*$$ is an n-element set of subsets of $$V^*$$. For $$v^*_j \in V^*$$ and $$e^*_i \in E^*, ~ v^*_j \in e^*_i$$ if and only if $$b_{ij} = 1$$.

For a directed hypergraph, the heads and tails of each hyperedge $$e_j$$ are denoted by $$H(e_j)$$ and $$T(e_j)$$ respectively. $$I = (b_{ij})$$ where
 * $$b_{ij} = \left\{ \begin{matrix} -1 & \mathrm{if} ~ v_i \in T(e_j) \\ 1 & \mathrm{if} ~ v_i \in H(e_j) \\ 0 & \mathrm{otherwise}. \end{matrix} \right.$$

Incidence graph
A hypergraph H may be represented by a bipartite graph BG as follows: the sets X and E are the parts of BG, and (x1, e1) are connected with an edge if and only if vertex x1 is contained in edge e1 in H.

Conversely, any bipartite graph with fixed parts and no unconnected nodes in the second part represents some hypergraph in the manner described above. This bipartite graph is also called incidence graph.

Adjacency matrix
A parallel for the adjacency matrix of a hypergraph can be drawn from the adjacency matrix of a graph. In the case of a graph, the adjacency matrix is a square matrix which indicates whether pairs of vertices are adjacent. Likewise, we can define the adjacency matrix $$A = (a_{ij})$$ for a hypergraph in general where the hyperedges $$e_{k \leq m}$$have real weights $$w_{e_{k}} \in \R$$ with

$$a_{ij} = \left\{ \begin{matrix} w_{e_{k}} & \mathrm{if} ~ (v_i, v_j) \in E \\ 0 & \mathrm{otherwise}. \end{matrix} \right.$$

Cycles
In contrast with ordinary undirected graphs for which there is a single natural notion of cycles and acyclic graphs, there are multiple natural non-equivalent definitions of acyclicity for hypergraphs which collapse to ordinary graph acyclicity for the special case of ordinary graphs.

A first definition of acyclicity for hypergraphs was given by Claude Berge: a hypergraph is Berge-acyclic if its incidence graph (the bipartite graph defined above) is acyclic. This definition is very restrictive: for instance, if a hypergraph has some pair $$v \neq v'$$ of vertices and some pair $$f \neq f'$$ of hyperedges such that $$v, v' \in f$$ and $$v, v' \in f'$$, then it is Berge-cyclic. Berge-cyclicity can obviously be tested in linear time by an exploration of the incidence graph.

We can define a weaker notion of hypergraph acyclicity, later termed α-acyclicity. This notion of acyclicity is equivalent to the hypergraph being conformal (every clique of the primal graph is covered by some hyperedge) and its primal graph being chordal; it is also equivalent to reducibility to the empty graph through the GYO algorithm (also known as Graham's algorithm), a confluent iterative process which removes hyperedges using a generalized definition of ears. In the domain of database theory, it is known that a database schema enjoys certain desirable properties if its underlying hypergraph is α-acyclic. Besides, α-acyclicity is also related to the expressiveness of the guarded fragment of first-order logic.

We can test in linear time if a hypergraph is α-acyclic.

Note that α-acyclicity has the counter-intuitive property that adding hyperedges to an α-cyclic hypergraph may make it α-acyclic (for instance, adding a hyperedge containing all vertices of the hypergraph will always make it α-acyclic). Motivated in part by this perceived shortcoming, Ronald Fagin defined the stronger notions of β-acyclicity and γ-acyclicity. We can state β-acyclicity as the requirement that all subhypergraphs of the hypergraph are α-acyclic, which is equivalent to an earlier definition by Graham. The notion of γ-acyclicity is a more restrictive condition which is equivalent to several desirable properties of database schemas and is related to Bachman diagrams. Both β-acyclicity and γ-acyclicity can be tested in polynomial time.

Those four notions of acyclicity are comparable: Berge-acyclicity implies γ-acyclicity which implies β-acyclicity which implies α-acyclicity. However, none of the reverse implications hold, so those four notions are different.

Isomorphism, symmetry, and equality
A hypergraph homomorphism is a map from the vertex set of one hypergraph to another such that each edge maps to one other edge.

A hypergraph $$H=(X,E)$$ is isomorphic to a hypergraph $$G=(Y,F)$$, written as $$H \simeq G$$ if there exists a bijection


 * $$\phi:X \to Y$$

and a permutation $$\pi$$ of $$I$$ such that


 * $$\phi(e_i) = f_{\pi(i)}$$

The bijection $$\phi$$ is then called the isomorphism of the graphs. Note that


 * $$H \simeq G$$ if and only if $$H^* \simeq G^*$$.

When the edges of a hypergraph are explicitly labeled, one has the additional notion of strong isomorphism. One says that $$H$$ is strongly isomorphic to $$G$$ if the permutation is the identity. One then writes $$H \cong G$$. Note that all strongly isomorphic graphs are isomorphic, but not vice versa.

When the vertices of a hypergraph are explicitly labeled, one has the notions of equivalence, and also of equality. One says that $$H$$ is equivalent to $$G$$, and writes $$H\equiv G$$ if the isomorphism $$\phi$$ has


 * $$\phi(x_n) = y_n$$

and


 * $$\phi(e_i) = f_{\pi(i)}$$

Note that


 * $$H\equiv G$$ if and only if $$H^* \cong G^*$$

If, in addition, the permutation $$\pi$$ is the identity, one says that $$H$$ equals $$G$$, and writes $$H=G$$. Note that, with this definition of equality, graphs are self-dual:


 * $$\left(H^*\right) ^* = H$$

A hypergraph automorphism is an isomorphism from a vertex set into itself, that is a relabeling of vertices. The set of automorphisms of a hypergraph H (= (X, E)) is a group under composition, called the automorphism group of the hypergraph and written Aut(H).

Examples
Consider the hypergraph $$H$$ with edges
 * $$H = \lbrace

e_1 = \lbrace a,b \rbrace, e_2 = \lbrace b,c \rbrace, e_3 = \lbrace c,d \rbrace, e_4 = \lbrace d,a \rbrace, e_5 = \lbrace b,d \rbrace, e_6 = \lbrace a,c \rbrace \rbrace$$ and
 * $$G = \lbrace

f_1 = \lbrace \alpha,\beta \rbrace, f_2 = \lbrace \beta,\gamma \rbrace, f_3 = \lbrace \gamma,\delta \rbrace, f_4 = \lbrace \delta,\alpha \rbrace, f_5 = \lbrace \alpha,\gamma \rbrace, f_6 = \lbrace \beta,\delta \rbrace \rbrace$$

Then clearly $$H$$ and $$G$$ are isomorphic (with $$\phi(a)=\alpha$$, etc.), but they are not strongly isomorphic. So, for example, in $$H$$, vertex $$a$$ meets edges 1, 4 and 6, so that,


 * $$e_1 \cap e_4 \cap e_6 = \lbrace a\rbrace$$

In graph $$G$$, there does not exist any vertex that meets edges 1, 4 and 6:


 * $$f_1 \cap f_4 \cap f_6 = \varnothing$$

In this example, $$H$$ and $$G$$ are equivalent, $$H\equiv G$$, and the duals are strongly isomorphic: $$H^*\cong G^*$$.

Symmetry
The  $$r(H)$$ of a hypergraph $$H$$ is the maximum cardinality of any of the edges in the hypergraph. If all edges have the same cardinality k, the hypergraph is said to be uniform or k-uniform, or is called a k-hypergraph. A graph is just a 2-uniform hypergraph.

The degree d(v) of a vertex v is the number of edges that contain it. H is k-regular if every vertex has degree k.

The dual of a uniform hypergraph is regular and vice versa.

Two vertices x and y of H are called symmetric if there exists an automorphism such that $$\phi(x)=y$$. Two edges $$e_i$$ and $$e_j$$ are said to be symmetric if there exists an automorphism such that $$\phi(e_i)=e_j$$.

A hypergraph is said to be vertex-transitive (or vertex-symmetric) if all of its vertices are symmetric. Similarly, a hypergraph is edge-transitive if all edges are symmetric. If a hypergraph is both edge- and vertex-symmetric, then the hypergraph is simply transitive.

Because of hypergraph duality, the study of edge-transitivity is identical to the study of vertex-transitivity.

Partitions
A partition theorem due to E. Dauber states that, for an edge-transitive hypergraph $$H=(X,E)$$, there exists a partition


 * $$(X_1, X_2,\cdots,X_K)$$

of the vertex set $$X$$ such that the subhypergraph $$H_{X_k}$$ generated by $$X_k$$ is transitive for each $$1\le k \le K$$, and such that


 * $$\sum_{k=1}^K r\left(H_{X_k} \right) = r(H)$$

where $$r(H)$$ is the rank of H.

As a corollary, an edge-transitive hypergraph that is not vertex-transitive is bicolorable.

Graph partitioning (and in particular, hypergraph partitioning) has many applications to IC design and parallel computing. Efficient and scalable hypergraph partitioning algorithms are also important for processing large scale hypergraphs in machine learning tasks.

Further generalizations
One possible generalization of a hypergraph is to allow edges to point at other edges. There are two variations of this generalization. In one, the edges consist not only of a set of vertices, but may also contain subsets of vertices, subsets of subsets of vertices and so on ad infinitum. In essence, every edge is just an internal node of a tree or directed acyclic graph, and vertices are the leaf nodes. A hypergraph is then just a collection of trees with common, shared nodes (that is, a given internal node or leaf may occur in several different trees). Conversely, every collection of trees can be understood as this generalized hypergraph. Since trees are widely used throughout computer science and many other branches of mathematics, one could say that hypergraphs appear naturally as well. So, for example, this generalization arises naturally as a model of term algebra; edges correspond to terms and vertices correspond to constants or variables.

For such a hypergraph, set membership then provides an ordering, but the ordering is neither a partial order nor a preorder, since it is not transitive. The graph corresponding to the Levi graph of this generalization is a directed acyclic graph. Consider, for example, the generalized hypergraph whose vertex set is $$V= \{a,b\}$$ and whose edges are $$e_1=\{a,b\}$$ and $$e_2=\{a,e_1\}$$. Then, although $$b\in e_1$$ and $$e_1\in e_2$$, it is not true that $$b\in e_2$$. However, the transitive closure of set membership for such hypergraphs does induce a partial order, and "flattens" the hypergraph into a partially ordered set.

Alternately, edges can be allowed to point at other edges, irrespective of the requirement that the edges be ordered as directed, acyclic graphs. This allows graphs with edge-loops, which need not contain vertices at all. For example, consider the generalized hypergraph consisting of two edges $$e_1$$ and $$e_2$$, and zero vertices, so that $$e_1 = \{e_2\}$$ and $$e_2 = \{e_1\}$$. As this loop is infinitely recursive, sets that are the edges violate the axiom of foundation. In particular, there is no transitive closure of set membership for such hypergraphs. Although such structures may seem strange at first, they can be readily understood by noting that the equivalent generalization of their Levi graph is no longer bipartite, but is rather just some general directed graph.

The generalized incidence matrix for such hypergraphs is, by definition, a square matrix, of a rank equal to the total number of vertices plus edges. Thus, for the above example, the incidence matrix is simply


 * $$\left[ \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right].$$