Giant component

In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.

More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.

Giant component in Erdős–Rényi model
Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of $n$ vertices is present, independently of the other edges, with probability $p$. In this model, if $$p \le \frac{1-\epsilon}{n}$$ for any constant $$\epsilon>0$$, then with high probability (in the limit as $$n$$ goes to infinity) all connected components of the graph have size $O(log n)$, and there is no giant component. However, for $$p \ge \frac{1 + \epsilon}{n}$$ there is with high probability a single giant component, with all other components having size $O(log n)$. For $$p=p_c = \frac{1}{n}$$, intermediate between these two possibilities, the number of vertices in the largest component of the graph, $$P_{\inf}$$ is with high probability proportional to $$n^{2/3}$$.

Giant component is also important in percolation theory. When a fraction of nodes, $$q=1-p$$, is removed randomly from an ER network of degree $$\langle k \rangle$$, there exists a critical threshold, $$p_c= \frac{1}{\langle k \rangle}$$. Above $$p_c$$ there exists a giant component (largest cluster) of size, $$P_{\inf}$$. $$P_{\inf}$$ fulfills, $$P_{\inf}=p(1-\exp(-\langle k \rangle P_{\inf}))$$. For $$p<p_c$$ the solution of this equation is $$P_{\inf}=0$$, i.e., there is no giant component.

At $$p_c$$, the distribution of cluster sizes behaves as a power law, $$n(s)$$~$$s^{-5/2}$$ which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately $$n/2$$ edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when $t$ edges have been added, for values of $t$ close to but larger than $$n/2$$, the size of the giant component is approximately $$4t-2n$$. However, according to the coupon collector's problem, $$\Theta(n\log n)$$ edges are needed in order to have high probability that the whole random graph is connected.

Graphs with arbitrary degree distributions
A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in random graphs with non-uniform degree distributions. The degree distribution does not define a graph uniquely. However under assumption that in all respects other than their degree distribution, the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree $$k $$, then the giant component exists if and only if$$\mathbb E [k^2] - 2 \mathbb E [k]>0.$$$$ \mathbb E [k] $$ which is also written in the form of $$ {\langle k \rangle} $$ is the mean degree of the network. Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional. There are three types of connected components in directed graphs. For a randomly chosen vertex:


 * 1) out-component is a set of vertices that can be reached by recursively following all out-edges forward;
 * 2) in-component is a set of vertices that can be reached by recursively following all in-edges backward;
 * 3) weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

Criteria for giant component existence in directed and undirected configuration graphs
Let a randomly chosen vertex has $$k_\text{in} $$ in-edges and $$k_\text{out} $$ out edges. By definition, the average number of in- and out-edges coincides so that $$c = \mathbb E [k_\text{in}] =\mathbb E [k_\text{out}] $$. If $$ G_0(x) = \textstyle \sum_{k} \displaystyle P(k)x^k $$ is the generating function of the degree distribution $$ P(k) $$ for an undirected network, then $$ G_1(x)

$$ can be defined as $$ G_1(x) = \textstyle \sum_{k} \displaystyle \frac{k}{\langle k \rangle}P(k)x^{k-1} $$. For directed networks, generating function assigned to the joint probability distribution $$ P(k_{in}, k_{out}) $$ can be written with two valuables $$ x $$ and $$ y $$ as: $$ \mathcal{G}(x,y) = \sum_{k_{in},k_{out}} \displaystyle P({k_{in},k_{out}})x^{k_{in}}y^{k_{out}} $$, then one can define $$ g(x) = \frac{1}{c} {\partial \mathcal{G}\over\partial x}\vert _{y=1} $$ and $$ f(y) = \frac{1}{c} {\partial \mathcal{G}\over\partial y}\vert _{x=1} $$. The criteria for giant component existence in directed and undirected random graphs are given in the table below: