H5N1 genetic structure

Influenza A virus subtype H5N1 (A/H5N1) is a subtype of the influenza A virus, which causes influenza (flu), predominantly in birds. It is enzootic (maintained in the population) in many bird populations, and also panzootic (affecting animals of many species over a wide area). A/H5N1 virus can also infect mammals (including humans) that have been exposed to infected birds; in these cases, symptoms are frequently severe or fatal.

A/H5N1 virus is shed in the saliva, mucus, and feces of infected birds; other infected animals may shed bird flu viruses in respiratory secretions and other body fluids (such as milk). The virus can spread rapidly through poultry flocks and among wild birds. An estimated half a billion farmed birds have been slaughtered in efforts to contain the virus.

Symptoms of A/H5N1 influenza vary according to both the strain of virus underlying the infection and the species of bird or mammal affected. Classification as either Low Pathogenic Avian Influenza (LPAI) or High Pathogenic Avian Influenza (HPAI) is based on the severity of symptoms in domestic chickens and does not predict the severity of symptoms in other species. Chickens infected with LPAI A/H5N1 virus display mild symptoms or are asymptomatic, whereas HPAI A/H5N1 causes serious breathing difficulties, a significant drop in egg production, and sudden death.

In mammals, including humans, A/H5N1 influenza (whether LPAI or HPAI) is rare. Symptoms of infection vary from mild to severe, including fever, diarrhoea, and cough. Human infections with A/H5N1 virus have been reported in 23 countries since 1997, resulting in severe pneumonia and death in about 50% of cases.

A/H5N1 influenza virus was first identified in farmed birds in southern China in 1996. Between 1996 and 2018, A/H5N1 coexisted in bird populations with other subtypes of the virus, but since then, the highly pathogenic subtype HPAI A(H5N1) has become the dominant strain in bird populations worldwide. Some strains of A/H5N1 which are highly pathogenic to chickens have adapted to cause mild symptoms in ducks and geese, and are able to spread rapidly through bird migration. Mammal species that have been recorded with H5N1 infection include cows, seals, goats, and skunks.

Due to the high lethality and virulence of HPAI A(H5N1), its worldwide presence, its increasingly diverse host reservoir, and its significant ongoing mutations, the H5N1 virus is regarded as the world's largest pandemic threat. Domestic poultry may be protected from specific strains of the virus by vaccination. In the event of a serious outbreak of H5N1 flu among humans, health agencies have prepared "candidate" vaccines that may be used to prevent infection and control the outbreak; however, it could take several months to ramp up mass production.

Terminology
The Orthomyxovirus family consists of 7 genera:
 * Alphainfluenzavirus
 * Betainfluenzavirus
 * Gammainfluenzavirus
 * Deltainfluenzavirus
 * Isavirus
 * Quaranjavirus
 * Thogotovirus

The RNA viruses include the negative-sense ssRNA viruses, which in turn include the family Orthomyxoviridae. The influenza genera within this family are distinguished by variations in their nucleoprotein (NP) and matrix protein (M) antigens. One of these is the genus Alphainfluenzavirus (formerly Influenzavirus A), which consists of a single species, Influenza A virus; one of its subtypes is H5N1.

Context
A virus is one type of microscopic parasite that infects cells in biological organisms.
 * Virus

The Orthomyxoviridae are a family of RNA viruses which infect vertebrates. It includes those viruses which cause influenza. Viruses of this family contain 7 to 8 segments of linear negative-sense single-stranded RNA.
 * Orthomyxoviridae

"Influenza virus" refers to a subset of Orthomyxoviridae that cause influenza. This taxonomic category is not based on phylogenetics.
 * Influenza virus

Influenza A viruses have 10 genes on eight separate RNA molecules, which are named PB2, PB1, PA, HA, NP, NA, M, and NS after the proteins they encode. HA, NA, and M specify the structure of proteins that are most medically relevant as targets for antiviral drugs and antibodies. (An eleventh, more recently discovered gene called PB1-F2 sometimes produces a protein but is absent from some influenza virus isolates.) This segmentation of the influenza genome facilitates genetic recombination by segment reassortment in hosts who are infected with two different influenza viruses at the same time. Influenza A virus is the only species in the Influenzavirus A genus of the family Orthomyxoviridae; its members are negative-sense, single-stranded, segmented RNA viruses.
 * Influenza A virus

"The influenza virus RNA polymerase is a multifunctional complex composed of the three viral proteins PB1, PB2 and PA, which, together with the viral nucleoprotein NP, form the minimum complement required for viral mRNA synthesis and replication."

Genome
Influenza A viruses have 11 genes on eight separate RNA segments:
 * PB2 (polymerase basic 2)
 * PB1 (polymerase basic 1)
 * PB1-F2 (alternate open reading frame near the 5' end of the PB1 gene)
 * PA (polymerase acidic)
 * HA (hemagglutinin)
 * NP (nucleoprotein)
 * NA (neuraminidase)
 * M1 and M2 (matrix)
 * NS1 (non-structural)
 * NEP/NS2 (nuclear export of vRNPs)

Two of the most important RNA molecules are HA and PB1. HA encodes a surface antigen that is especially important in transmissibility. PB1 encodes a component of the viral polymerase that is especially important in virulence.

The HA RNA molecule contains the HA gene, which codes for hemagglutinin, which is an antigenic glycoprotein found on the surface of the influenza viruses and is responsible for binding the virus to the cell that is being infected. Hemagglutinin forms spikes at the surface of flu viruses that function to attach viruses to cells. This attachment is required for efficient transfer of flu virus genes into cells, a process that can be blocked by antibodies that bind to the hemagglutinin proteins.

One genetic factor in distinguishing between human flu viruses and avian flu viruses is that avian influenza HA binds alpha 2-3 sialic acid receptors, while human influenza HA binds alpha 2-6 sialic acid receptors. Swine influenza viruses have the ability to bind both types of sialic acid receptors. Humans have avian-type receptors at very low densities and chickens have human-type receptors at very low densities. Some isolates taken from H5N1-infected humans have been observed to have HA mutations at positions 182, 192, 223, 226, or 228, and these mutations have been shown to influence the selective binding of the virus to the avian and/or human sialic acid cell surface receptors mentioned above. Mutations of this type could help change a bird flu virus into a pandemic flu virus.
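
Screening an isolate for changes at the HA positions listed above amounts to comparing it with a reference sequence at those positions. The sketch below illustrates the idea under stated assumptions: both sequences are already aligned and numbered in the same scheme as the quoted positions, and the sequences used here are toy placeholders, not real HA sequences. The function name `substitutions_at` is hypothetical.

```python
# HA positions named in the paragraph above (1-based).
WATCHED_HA_POSITIONS = (182, 192, 223, 226, 228)


def substitutions_at(reference, isolate, positions):
    """Return {position: (reference_residue, isolate_residue)} for watched
    positions where the two pre-aligned sequences differ (1-based positions)."""
    if len(reference) != len(isolate):
        raise ValueError("sequences must be pre-aligned to the same length")
    changes = {}
    for pos in positions:
        if not 1 <= pos <= len(reference):
            raise ValueError(f"position {pos} is outside the sequence")
        ref_aa, iso_aa = reference[pos - 1], isolate[pos - 1]
        if ref_aa != iso_aa:
            changes[pos] = (ref_aa, iso_aa)
    return changes


# Toy data only: a placeholder "reference" of 240 alanines and an "isolate"
# that differs at one watched position (223) and one unwatched position (100).
reference = "A" * 240
residues = list(reference)
residues[223 - 1] = "N"
residues[100 - 1] = "G"
isolate = "".join(residues)

print(substitutions_at(reference, isolate, WATCHED_HA_POSITIONS))
```

With the toy data, only the change at position 223 is reported; the change at position 100 is ignored because it is not on the watch list.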

A 2008 virulence study crossed, in the laboratory, an avian H5N1 virus that circulated in Thailand in 2004 with a human H3N2 virus recovered in Wyoming in 2003, producing 63 viruses representing various potential combinations of human and avian influenza A virus genes. One in five was lethal to mice at low doses. The virus that most closely matched H5N1 for virulence carried the hemagglutinin (HA), neuraminidase (NA), and PB1 RNA molecules of the avian flu virus, combined with the remaining five RNA molecules (PB2, PA, NP, M, and NS) of the human flu virus. The viruses of both the 1957 pandemic and the 1968 pandemic carried an avian flu virus PB1 gene. The authors suggest that picking up an avian flu virus PB1 gene may be a critical step in a potential flu pandemic virus arising through reassortment.

PB1 codes for the PB1 protein and the PB1-F2 protein. The PB1 protein is a critical component of the viral polymerase. The PB1-F2 protein is encoded by an alternative open reading frame of the PB1 RNA segment and "interacts with 2 components of the mitochondrial permeability transition pore complex, ANT3 and VDAC1, [sensitizing] cells to apoptosis. [...] PB1-F2 likely contributes to viral pathogenicity and might have an important role in determining the severity of pandemic influenza." This was discovered by Chen et al. and reported in Nature. "After comparing viruses from the Hong Kong 1997 H5N1 outbreak, one amino acid change (N66S) was found in the PB1-F2 sequence at position 66 that correlated with pathogenicity. This same amino acid change (N66S) was also found in the PB1-F2 protein of the 1918 pandemic A/Brevig Mission/18 virus."

Surface encoding gene segments

 * Surface antigen encoding gene segments (RNA molecule): (HA, NA)

HA
HA codes for hemagglutinin, which is an antigenic glycoprotein found on the surface of the influenza viruses and is responsible for binding the virus to the cell that is being infected. Hemagglutinin forms spikes at the surface of flu viruses that function to attach viruses to cells. This attachment is required for efficient transfer of flu virus genes into cells, a process that can be blocked by antibodies that bind to the hemagglutinin proteins. One genetic factor in distinguishing between human flu viruses and avian flu viruses is that "avian influenza HA bind alpha 2-3 sialic acid receptors while human influenza HA bind alpha 2-6 sialic acid receptors. Swine influenza viruses have the ability to bind both types of sialic acid receptors."

A mutation found in Turkey in 2006 "involves a substitution in one sample of an amino acid at position 223 of the haemoagglutinin receptor protein. This protein allows the flu virus to bind to the receptors on the surface of its host's cells. This mutation has been observed twice before — in a father and son in Hong Kong in 2003, and in one fatal case in Vietnam last year. It increases the virus's ability to bind to human receptors, and decreases its affinity for poultry receptors, making strains with this mutation better adapted to infecting humans." Another mutation in the same sample at position 153 has as yet unknown effects.

Research has shown that humans have avian-type receptors at very low densities and chickens have human-type receptors at very low densities. Researchers "found that the mutations at two places in the gene, identified as 182 and 192, allow the virus to bind to both bird and human receptors." For further details, see the research articles "Host Range Restriction and Pathogenicity in the Context of Influenza Pandemic" by Gabriele Neumann and Yoshihiro Kawaoka (Centers for Disease Control and Prevention, 2006) and "Structure and Receptor Specificity of the Hemagglutinin from an H5N1 Influenza Virus" by James Stevens, Ola Blixt, Terrence M. Tumpey, Jeffery K. Taubenberger, James C. Paulson, and Ian A. Wilson (American Association for the Advancement of Science, 2006).

NA
NA codes for neuraminidase, which is an antigenic glycoprotein enzyme found on the surface of the influenza viruses. It aids the release of progeny viruses from infected cells. The flu drugs Tamiflu and Relenza work by inhibiting some strains of neuraminidase. They were developed based on N2 and N9. "In the N1 form of the protein, a small segment called the 150-loop is inverted, creating a hollow pocket that does not exist in the N2 and N9 proteins. [...] When the researchers looked at how existing drugs interacted with the N1 protein, they found that, in the presence of neuraminidase inhibitors, the loop changed its conformation to one similar to that in the N2 and N9 proteins."

Internal encoding gene segments

 * Internal viral protein encoding gene segments (RNA molecule): (M, NP, NS, PA, PB1, PB2)

Matrix encoding gene segments

 * M codes for the matrix proteins (M1 and M2) that, along with the two surface proteins (hemagglutinin and neuraminidase), make up the capsid (protective coat) of the virus. The two proteins are encoded in different reading frames of the same RNA segment.
 * M1 is a protein that binds to the viral RNA.
 * M2 is a protein that uncoats the virus, thereby exposing its contents (the eight RNA segments) to the cytoplasm of the host cell. The M2 transmembrane protein is an ion channel required for efficient infection. The amino acid substitution Ser31Asn in the M2 protein of some H5N1 genotypes is associated with amantadine resistance.

Nucleoprotein encoding gene segments

 * NP codes for nucleoprotein.
 * NS: NS codes for two nonstructural proteins (NS1 and NEP, formerly called NS2). "[T]he pathogenicity of influenza virus was related to the nonstructural (NS) gene of the H5N1/97 virus".
 * NS1: Non-structural: nucleus; effects on cellular RNA transport, splicing, translation. Anti-interferon protein. The "NS1 of the highly pathogenic avian H5N1 viruses circulating in poultry and waterfowl in Southeast Asia might be responsible for an enhanced proinflammatory cytokine response (especially TNFa) induced by these viruses in human macrophages". H5N1 NS1 is characterized by a single amino acid change at position 92. By changing the amino acid from glutamic acid to aspartic acid, the researchers were able to abrogate the effect of the H5N1 NS1. "[This] single amino acid change in the NS1 gene greatly increased the pathogenicity of the H5N1 influenza virus."
 * NEP: The "nuclear export protein (NEP, formerly referred to as the NS2 protein) mediates the export of vRNPs".

Polymerase encoding gene segments

 * PA codes for the PA protein which is a critical component of the viral polymerase.
 * PB1 codes for the PB1 protein and the PB1-F2 protein.
 * The PB1 protein is a critical component of the viral polymerase.
 * The PB1-F2 protein is encoded by an alternative open reading frame of the PB1 RNA segment and "interacts with 2 components of the mitochondrial permeability transition pore complex, ANT3 and VDAC1, [sensitizing] cells to apoptosis. [...] PB1-F2 likely contributes to viral pathogenicity and might have an important role in determining the severity of pandemic influenza." This was discovered by Chen et al. and reported in Nature. "After comparing viruses from the Hong Kong 1997 H5N1 outbreak, one amino acid change (N66S) was found in the PB1-F2 sequence at position 66 that correlated with pathogenicity. This same amino acid change (N66S) was also found in the PB1-F2 protein of the 1918 pandemic A/Brevig Mission/18 virus."
 * PB2 codes for the PB2 protein, which is a critical component of the viral polymerase. As of 2005, 75% of H5N1 human virus isolates from Vietnam had a mutation consisting of lysine (Lys) at residue 627 in the PB2 protein, which is believed to cause high levels of virulence. Until H5N1, all known avian influenza viruses had glutamic acid (Glu) at position 627, while all human influenza viruses had lysine. As of 2007, "The emergence of 3 (or more) substrains from the EMA [EMA=Europe, Middle East, Africa] clade represents multiple new opportunities for avian influenza (H5N1) to evolve into a human pandemic strain. In contrast to strains circulating in Southeast Asia, EMA viruses are derived from a progenitor that has the PB2 627K mutation. These viruses are expected to have enhanced replication characteristics in mammals, and indeed the spread of EMA has coincided with the rapid appearance of cases in mammals, including humans in Turkey, Egypt, Iraq, and Djibouti, and cats in Germany, Austria, and Iraq. Unfortunately, the EMA-type viruses appear to be as virulent as the exclusively Asian strains: of 34 human infections outside of Asia through mid-2006, 15 have been fatal." Lys at PB2–627 is believed to confer to avian H5N1 viruses the advantage of efficient growth in the upper and lower respiratory tracts of mammals.
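
Amino acid substitutions in this article are written in two equivalent styles: one-letter codes (N66S) and three-letter codes (Ser31Asn). Both mean "original residue, position, new residue". The following is a minimal sketch of a parser for that notation; the function names are hypothetical, and it does not handle the shortened form "627K", which gives only the new residue.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = set(THREE_TO_ONE.values())

# <residue><position><residue>, each residue in one- or three-letter form.
_PATTERN = re.compile(r"^([A-Z][a-z]{2}|[A-Z])(\d+)([A-Z][a-z]{2}|[A-Z])$")


def _one_letter(code):
    """Normalise a one- or three-letter residue code to one letter."""
    if len(code) == 1 and code in ONE_LETTER:
        return code
    if code in THREE_TO_ONE:
        return THREE_TO_ONE[code]
    raise ValueError(f"unknown amino acid code: {code!r}")


def parse_substitution(text):
    """Parse 'N66S' or 'Ser31Asn' into (old_residue, position, new_residue)."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a substitution: {text!r}")
    old, pos, new = match.groups()
    return _one_letter(old), int(pos), _one_letter(new)


def apply_substitution(sequence, text):
    """Apply a substitution to a protein sequence (1-based positions)."""
    old, pos, new = parse_substitution(text)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != old:
        raise ValueError(f"expected {old} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + new + sequence[pos:]


print(parse_substitution("N66S"))       # PB1-F2 change discussed above
print(parse_substitution("Ser31Asn"))   # M2 change discussed above
print(apply_substitution("MKTNE", "N4S"))  # toy five-residue peptide, not a real protein
```

In this notation, the PB2 change from glutamic acid to lysine at position 627 would be written E627K or Glu627Lys.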

Mutation
Influenza viruses have a relatively high mutation rate that is characteristic of RNA viruses. The segmentation of the influenza genome facilitates genetic recombination by segment reassortment in hosts who are infected with two different influenza viruses at the same time. H5N1 viruses can reassort genes with other strains that co-infect a host organism, such as a pig, bird, or human, and mutate into a form that can pass easily among humans. This is one of many possible paths to a pandemic.
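
To give a sense of scale for reassortment, the sketch below enumerates every way eight segments could be drawn from two co-infecting parent viruses. It rests on simplifying assumptions that are not claims from the sources: each segment is inherited whole and independently, and every combination is counted regardless of whether the resulting virus would be viable. The parent labels and function name are illustrative.

```python
from itertools import product

SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def enumerate_reassortants(parent_a, parent_b, segments=SEGMENTS):
    """Yield one dict per combination, mapping each segment to its parent of origin."""
    for choice in product((parent_a, parent_b), repeat=len(segments)):
        yield dict(zip(segments, choice))


all_combos = list(enumerate_reassortants("avian", "human"))

# Combinations that mix both parents, i.e. exclude the two unchanged parental genomes.
novel = [genome for genome in all_combos if len(set(genome.values())) == 2]

print(len(all_combos), "combinations in total,", len(novel), "of them mixed")

# The constellation described in the Genome section: avian HA, NA and PB1
# with the other five segments from the human virus.
target = {seg: ("avian" if seg in {"HA", "NA", "PB1"} else "human") for seg in SEGMENTS}
print(target in all_combos)
```

Two parents and eight segments give 2^8 = 256 combinations, 254 of which mix segments from both viruses.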

The ability of various influenza strains to show species-selectivity is largely due to variation in the hemagglutinin genes. Genetic mutations in the hemagglutinin gene that cause single amino acid substitutions can significantly alter the ability of viral hemagglutinin proteins to bind to receptors on the surface of host cells. Such mutations in avian H5N1 viruses can change virus strains from being inefficient at infecting human cells to being as efficient in causing human infections as more common human influenza virus types. This does not mean that one amino acid substitution can cause a pandemic, but it does mean that one amino acid substitution can cause an avian flu virus that is not pathogenic in humans to become pathogenic in humans.
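
A single amino acid substitution can result from a change in just one nucleotide. The sketch below uses the standard genetic code to list which amino acids are one base change away from a given codon. Codons are written in the coding (mRNA) sense with T in place of U, which is a presentational assumption here; the function name is hypothetical, and the codons shown are examples rather than the codons actually present in any H5N1 gene.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbours(codon):
    """Map each different amino acid (or stop, '*') reachable by changing
    exactly one base of `codon` to the list of mutant codons that encode it."""
    original = CODON_TABLE[codon]
    reachable = {}
    for i, base in enumerate(codon):
        for alt in BASES:
            if alt == base:
                continue
            mutant = codon[:i] + alt + codon[i + 1:]
            aa = CODON_TABLE[mutant]
            if aa != original:  # skip synonymous changes
                reachable.setdefault(aa, []).append(mutant)
    return reachable


# GAA encodes glutamic acid (E). One base change can give lysine (K) or
# aspartic acid (D), the kinds of substitution discussed for PB2 627 and NS1 92.
neighbours = single_nucleotide_neighbours("GAA")
print("Glu -> Lys via", neighbours["K"])
print("Glu -> Asp via", neighbours["D"])
```

The same check applied to the asparagine codon AAT shows that serine (AGT) is also one base change away, matching the form of the N66S substitution.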

H3N2 ("swine flu") is endemic in pigs in China, and has been detected in pigs in Vietnam, increasing fears of the emergence of new variant strains. The dominant strain of annual flu virus in January 2006 was H3N2, which had become resistant to the standard antiviral drugs amantadine and rimantadine. The possibility of H5N1 and H3N2 exchanging genes through reassortment is a major concern. If a reassortment in H5N1 occurs, it might remain an H5N1 subtype, or it could shift subtypes, as H2N2 did when it evolved into the Hong Kong Flu strain of H3N2.

Both the H2N2 and H3N2 pandemic strains contained avian influenza virus RNA segments. "While the pandemic human influenza viruses of 1957 (H2N2) and 1968 (H3N2) clearly arose through reassortment between human and avian viruses, the influenza virus causing the 'Spanish flu' in 1918 appears to be entirely derived from an avian source".

In July 2004, researchers led by H. Deng of the Harbin Veterinary Research Institute, Harbin, China and Professor Robert G. Webster of the St. Jude Children's Research Hospital, Memphis, Tennessee, reported results of experiments in which mice had been exposed to 21 isolates of confirmed H5N1 strains obtained from ducks in China between 1999 and 2002. They found "a clear temporal pattern of progressively increasing pathogenicity". Results reported by Dr. Webster in July 2005 reveal further progression toward pathogenicity in mice and longer virus shedding by ducks.

Asian lineage HPAI A(H5N1) is divided into two antigenic clades. "Clade 1 includes human and bird isolates from Vietnam, Thailand, and Cambodia and bird isolates from Laos and Malaysia. Clade 2 viruses were first identified in bird isolates from China, Indonesia, Japan, and South Korea before spreading westward to the Middle East, Europe, and Africa. The clade 2 viruses have been primarily responsible for human H5N1 infections that have occurred during late 2005 and 2006, according to WHO. Genetic analysis has identified six subclades of clade 2, three of which have a distinct geographic distribution and have been implicated in human infections:
 * Subclade 1, Indonesia
 * Subclade 2, Europe, Middle East, and Africa (called EMA)
 * Subclade 3, China"

A 2007 study focused on the EMA subclade has shed further light on the EMA mutations. "The 36 new isolates reported here greatly expand the amount of whole-genome sequence data available from recent avian influenza (H5N1) isolates. Before our project, GenBank contained only 5 other complete genomes from Europe for the 2004–2006 period, and it contained no whole genomes from the Middle East or northern Africa. Our analysis showed several new findings. First, all European, Middle Eastern, and African samples fall into a clade that is distinct from other contemporary Asian clades, all of which share common ancestry with the original 1997 Hong Kong strain. Phylogenetic trees built on each of the 8 segments show a consistent picture of 3 lineages, as illustrated by the HA tree shown in Figure 1. Two of the clades contain exclusively Vietnamese isolates; the smaller of these, with 5 isolates, we label V1; the larger clade, with 9 isolates, is V2. The remaining 22 isolates all fall into a third, clearly distinct clade, labeled EMA, which comprises samples from Europe, the Middle East, and Africa. Trees for the other 7 segments display a similar topology, with clades V1, V2, and EMA clearly separated in each case. Analyses of all available complete influenza (H5N1) genomes and of 589 HA sequences placed the EMA clade as distinct from the major clades circulating in People's Republic of China, Indonesia, and Southeast Asia."

See https://web.archive.org/web/20090709040039/http://who.int/csr/disease/avian_influenza/H5CompleteTree.pdf for a Genetic Tree of 1,342 H5N1 viruses based on their HA gene, showing their clade designations.