Hierarchical network model

Hierarchical network models are iterative algorithms for creating networks which are able to reproduce the unique properties of the scale-free topology and the high clustering of the nodes at the same time. These characteristics are widely observed in nature, from biology to language to some social networks.

Concept
The hierarchical network model is part of the scale-free model family sharing their main property of having proportionally more hubs among the nodes than by random generation; however, it significantly differs from the other similar models (Barabási–Albert, Watts–Strogatz) in the distribution of the nodes' clustering coefficients: as other models would predict a constant clustering coefficient as a function of the degree of the node, in hierarchical models nodes with more links are expected to have a lower clustering coefficient. Moreover, while the Barabási-Albert model predicts a decreasing average clustering coefficient as the number of nodes increases, in the case of the hierarchical models there is no relationship between the size of the network and its average clustering coefficient.

The development of hierarchical network models was mainly motivated by the failure of the other scale-free models in incorporating the scale-free topology and high clustering into one single model. Since several real-life networks (metabolic networks, the protein interaction network, the World Wide Web or some social networks) exhibit such properties, different hierarchical topologies were introduced in order to account for these various characteristics.

Algorithm
Hierarchical network models are usually derived in an iterative way by replicating the initial cluster of the network according to a certain rule. For instance, consider an initial network of five fully interconnected nodes (N=5). As a next step, create four replicas of this cluster and connect the peripheral nodes of each replica to the central node of the original cluster (N=25). This step can be repeated indefinitely, thereby for any k steps the number of nodes in the system can be derived by N=5k+1.

Of course there have been several different ways for creating hierarchical systems proposed in the literature. These systems generally differ in the structure of the initial cluster as well as in the degree of expansion which is often referred to as the replication factor of the model.



Degree distribution
Being part of the scale-free model family, the degree distribution of the hierarchical network model follows the power law meaning that a randomly selected node in the network has k edges with a probability


 * $$P\left(k\right)\sim ck^{-\gamma} \, $$

where c is a constant and &gamma; is the degree exponent. In most real world networks exhibiting scale-free properties &gamma; lies in the interval [2,3].

As a specific result for hierarchical models it has been shown that the degree exponent of the distribution function can be calculated as


 * $$\gamma=1+\frac{\ln M}{\ln (M-1)} $$

where M represents the replication factor of the model.

Clustering coefficient
In contrast to the other scale-free models (Erdős–Rényi, Barabási–Albert, Watts–Strogatz) where the clustering coefficient is independent of the degree of a specific node, in hierarchical networks the clustering coefficient can be expressed as a function of the degree in the following way:


 * $$C\left(k\right)\sim k^{-\beta} \, $$

It has been analytically shown that in deterministic scale-free networks the exponent β takes the value of 1.

Actor network
Based on the actor database available at www.IMDb.com the network is defined by Hollywood actors who are connected to each other if they both appeared in the same movie, resulting in a data set of 392,340 nodes and 15,347,957 edges. As earlier studies have shown, this network exhibits scale-free properties at least for high values of k. Moreover, the clustering coefficients seem to follow the required scaling law with the parameter -1 providing evidence for the hierarchical topology of the network. Intuitively, one-performance actors have by definition a clustering coefficient of one while actors starring in several movies are highly unlikely to work with the same crew which in general results in a decreasing clustering coefficient as the number of co-stars grows.

Language network
Words can be regarded as network if one specifies the linkage criteria between them. Defining links as appearance as a synonym in the Merriam-Webster dictionary a semantic web of 182,853 nodes with 317,658 edges was constructed. As it turned out, the obtained network of words indeed follows a power law in its degree distribution while the distribution of the clustering coefficient indicates that the underlying web follows a hierarchical structure with γ=3.25 and β=1.

Network of webpages
By mapping the www.nd.edu domain a network of 325,729 nodes and 1,497,135 edges was obtained whose degree distribution followed a power law with γout=2.45 and γin=2.1 for the out- and in-degrees, respectively. The evidence for the scaling law distribution of the clustering coefficients is significantly weaker than in the previous cases although there is a clearly visible declining pattern in the distribution of C(k) indicating that the more links a domain has the less interconnected the linked/linking web pages are.

Domain network
The domain network, i.e. the internet at the autonomous system (AS) level where the administrative domains are said to be connected in case there is a router which connects them, was found to comprise 65,520 nodes and 24,412 links between them and exhibit the properties of a scale-free network. The sample distribution of the clustering coefficients was fitted by the scaling function C(k)~k−0.75 whose exponent is (in absolute terms) somewhat smaller than the theoretical parameter for deterministic scale-free networks.