Phenotypic disease network (PDN)

The first phenotypic disease network was constructed by Hidalgo et al. (2009) to help understand the origins of many diseases and the links between them. Hidalgo et al. (2009) defined diseases as specific sets of phenotypes that affect one or several physiological systems, and compiled data on pairwise comorbidity correlations for more than 10,000 diseases reconstructed from over 30 million medical records. Hidalgo et al. (2009) presented their data in the form of a network with diseases as the nodes and comorbidity correlations as the links. Intuitively, the phenotypic disease network (PDN) can be seen as a map of the phenotypic space whose structure can contribute to the understanding of disease progression.

History
During the last decade, several papers were published that aim at understanding the origins and interrelatedness of diseases using the analytical tools of network science. Interactions between disease-associated genes, proteins, and gene expressions have been explored. However, phenotypic information was essentially overlooked, despite the fact that there exist extensive, high-quality data on it in the form of clinical histories, until the seminal paper of Hidalgo et al. (2009) introducing the human phenotypic disease network.

Source data
Hidalgo et al. (2009) used Medicare hospital claims based on the MedPAR records on hospitalizations for the period 1990-1993. For the 32 million elderly Americans aged 65 or older enrolled in Medicare and alive for the entire study period, there were approximately 32 million inpatient claims, belonging to about 13 million individuals. The dataset consisted of mainly white patients over 65 years old living in an industrialized country which imposed some limitations on the study; for example, many infectious diseases or pregnancy related conditions did not appear in the data at all.

Comorbidity correlation measures
Comorbidity measures are used to measure the "distance" between two diseases. The relative risk (RR) of observing disease i and j affecting the same patient is given by $$ RR_{ij}=\frac{C_{ij}N}{P_iP_j} $$ where $$C_{ij}$$ denotes the number of patients affected by both diseases, N is the total number of patients in the population, and $$P_i$$ and $$P_j$$ are the prevalences of diseases i and j, respectively. The $$\phi$$-correlation (Pearson correlation for binary measures) can be expressed as $$ \phi_{ij}=\frac{C_{ij}N-P_iP_j}{\sqrt{P_i P_j(N-P_i)(N-P_j)}}. $$ Both measures have inherent biases: RR overestimates relationships involving rare diseases and underestimates the comorbidity between highly prevalent diseases, while the $$\phi$$-correlation is accurate in describing comorbidity between diseases with similar prevalence but underestimates the comorbidity between rare and common diseases. In the PDN, nodes are disease phenotypes and links connect those phenotypes that have significant comorbidity correlation according to the RR and $$\phi$$-correlation. Considering the complementary biases of these two measures, Hidalgo et al. (2009) constructed a separate PDN for each.

Disease progression
Comparing the average correlation between illnesses diagnosed in the first two visits and those diagnosed later (during the third and fourth visits for patients with a total of four visits) to the average correlation in a randomized control case, inter-visit correlations were found to be significantly larger than those that would occur by chance alone, pointing to the fact that patients develop diseases that are close in the PDN to those they already have.

Connectedness and mortality
Hidalgo et al. (2009) also established a connection between that the mortality associated with a given disease and its connectivity in the PDN. Diseases that are preceded by others are usually more connected than those that precede others, and they tend to be more lethal. That is to say, patients that have a disease that is more connected in the network face higher mortality rates that those patients who have less connected conditions.

Demographic differences
In the Hidalgo et al. (2009) study, disease progression was found to be different across genders and ethnicities. For example, hypertension, diabetes, and renal disorders tend to be more comorbid in black males, while ischemic heart disease, infarctions, pulmonary complications and hypercholesterolemia are more comorbid in white males.