Gene polymorphism

A gene is said to be polymorphic if more than one allele occupies that gene's locus within a population. In addition to there being more than one allele at a specific locus, each allele must generally occur in the population at a frequency of at least 1% for the gene to be considered polymorphic.
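The frequency criterion can be made concrete with a small calculation. The sketch below is illustrative only: the function names, the allele labels, and the counts are assumptions, not taken from any dataset. It treats a locus as polymorphic when at least two alleles each reach the 1% threshold.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Return {allele: frequency} for a list of observed allele copies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_polymorphic(alleles, threshold=0.01):
    """True if at least two alleles each reach the frequency threshold."""
    freqs = allele_frequencies(alleles)
    common = [a for a, f in freqs.items() if f >= threshold]
    return len(common) >= 2

# Hypothetical sample of 1,000 allele copies (500 diploid individuals).
sample = ["A"] * 985 + ["a"] * 15
print(allele_frequencies(sample))
print(is_polymorphic(sample))
```

With these made-up counts the rarer allele has a frequency of 1.5%, so the locus meets the criterion; at 0.5% it would not.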

Gene polymorphisms can occur in any region of the genome. The majority of polymorphisms are silent, meaning they do not alter the function or expression of a gene. Some polymorphisms are visible. For example, in dogs the E locus can have any of five different alleles, known as E, Em, Eg, Eh, and e. Varying combinations of these alleles contribute to the pigmentation and patterns seen in dog coats.

A polymorphic variant of a gene can lead to abnormal expression of the protein or to production of an abnormal form of it; this abnormality may cause or be associated with disease. For example, a polymorphic variant of the gene encoding the enzyme CYP4A11, in which cytosine replaces thymidine at the gene's nucleotide 8590 position, encodes a CYP4A11 protein in which serine replaces phenylalanine at amino acid position 434. This variant protein has reduced enzyme activity in metabolizing arachidonic acid to the blood pressure-regulating eicosanoid, 20-hydroxyeicosatetraenoic acid. A study has shown that humans bearing this variant in one or both of their CYP4A11 genes have an increased incidence of hypertension, ischemic stroke, and coronary artery disease.

The genes coding for the major histocompatibility complex (MHC) are the most polymorphic genes known. MHC molecules are involved in the immune system and interact with T-cells. There are more than 32,000 different alleles of human MHC class I and II genes, and it has been estimated that there are 200 variants at the HLA-B and HLA-DRB1 loci alone.

Some polymorphism may be maintained by balancing selection.

Differences between gene polymorphism and mutation
A rule of thumb that is sometimes used is to classify genetic variants that occur below 1% allele frequency as mutations rather than polymorphisms. However, since polymorphisms may occur at low allele frequency, this is not a reliable way to tell new mutations from polymorphisms. A mutation is a change to an inherited genetic sequence.
 * In unicellular organisms, there is no distinction between somatic and germline mutations.
 * In multicellular organisms that reproduce sexually, most mutations are not passed on to subsequent generations. A mutation may or may not be passed on to offspring; for example, if a mutation arises in replicating cells that are not part of the germline, none of the offspring will bear it.
 * For example, a mutation may occur in a skin cell when ultraviolet light produces a thymine dimer that is not properly repaired before the cell undergoes mitosis and divides.
 * This is quite distinct from a mutation that occurs during meiosis, which can subsequently be passed on to future generations. When discussing mutations, it is therefore helpful to state whether a somatic mutation or a germline mutation is meant.

In the case of silent mutations there is no change in fitness, so selective pressures that would otherwise shift allele frequencies away from Hardy-Weinberg equilibrium have no impact on the accumulation of silent polymorphisms over time. Most often, a polymorphism is a variation in a single nucleotide (SNP), but it can also be an insertion or deletion of one or more nucleotides, or a change in the number of times a short or longer sequence is repeated. Insertions, deletions, and repeat-number changes, like SNPs, are common in parts of DNA that do not directly code for a protein, but they can have major effects on gene expression. Polymorphisms that result in a change in fitness are the raw material for evolution by natural selection. All genetic polymorphisms start out as a mutation, but only if they are germline and not lethal can they spread into a population. Polymorphisms are classified based on what happens at the level of the individual mutation in the DNA sequence (or RNA sequence in the case of RNA viruses), and on what effect the mutation has on the phenotype (i.e. silent, or resulting in some change in function or fitness). They are also classified based on whether the change is in the sequence of the resulting protein or in the regulation of the expression of the gene; regulatory sites are typically, but not always, upstream of and adjacent to the gene.
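For reference, Hardy-Weinberg equilibrium for a locus with two alleles at frequencies p and q = 1 - p predicts genotype frequencies of p², 2pq, and q² when selection, mutation, migration, and drift are absent. A minimal sketch of that calculation follows; the function name, the allele labels, and the example frequency are illustrative assumptions.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for a two-allele locus.

    p is the frequency of allele A; q = 1 - p is the frequency of allele a.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

expected = hardy_weinberg(0.7)  # hypothetical allele frequency
print(expected)
```

Comparing observed genotype counts against these expectations is one common way to check whether a locus departs from equilibrium.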

Identification
Polymorphisms can be identified in the laboratory using a variety of methods. Many methods employ PCR to amplify the sequence of a gene. Once amplified, polymorphisms and mutations in the sequence can be detected by DNA sequencing, either directly or after screening for variation with a method such as single strand conformation polymorphism analysis.
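Once a region has been sequenced, detecting single-base differences amounts to comparing the read against a reference. The sketch below assumes the two sequences are already aligned and of equal length, so it cannot detect insertions or deletions; the function name and both sequences are invented for illustration.

```python
def find_substitutions(reference, sample):
    """Return (position, ref_base, sample_base) for each mismatch.

    Positions are 1-based. Assumes pre-aligned sequences of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), sample.upper())
    return [(i, r, s) for i, (r, s) in enumerate(pairs, start=1) if r != s]

reference = "ATGGCCTTCAAG"  # made-up reference fragment
sample = "ATGGCCTCCAAG"     # made-up read with one substitution
print(find_substitutions(reference, sample))
```

Real pipelines perform alignment and quality filtering first; this only illustrates the final comparison step.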

Types
A polymorphism can be any sequence difference. Examples include:


 * Single nucleotide polymorphisms (SNPs) are single-nucleotide changes that occur at a particular location in the genome. The single nucleotide polymorphism is the most common form of genetic variation.
 * Small-scale insertions/deletions (indels) consist of insertions or deletions of bases in DNA.
 * Polymorphic repetitive elements. Active transposable elements can also cause polymorphism by inserting themselves in new locations. For example, repetitive elements of the Alu and LINE1 families cause polymorphisms in the human genome.
 * Microsatellites are repeats of 1-6 base pairs of DNA sequence. Microsatellites are commonly used as molecular markers, especially for identifying the relationship between alleles.
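The first two categories in the list above can be told apart from the alleles alone. The sketch below assumes each variant is described by a reference allele and an alternate allele, as in variant call files; the function name and the category labels are illustrative choices. It does not attempt to recognize repeat-based polymorphisms such as microsatellites or transposable element insertions.

```python
def classify_variant(ref, alt):
    """Classify a variant from its reference and alternate alleles."""
    if not ref or not alt:
        raise ValueError("alleles must be non-empty")
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    # Same length, more than one base: not covered by the list above.
    return "multi-nucleotide substitution"

# Hypothetical variants for illustration.
for ref, alt in [("C", "T"), ("A", "ATG"), ("ATG", "A")]:
    print(ref, alt, classify_variant(ref, alt))
```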

Clinical significance
Many different human diseases result from polymorphisms. Polymorphisms also play a significant role as risk factors for the development of disease. Finally, polymorphisms in proteins involved in drug metabolism (especially the cytochrome P450 isoenzymes), in proteins involved in drug transport (whether into the body, into protected areas of the body such as the brain, or out of the body by secretion), and in specific cell surface receptor proteins alter the effect of various drugs. This is a rapidly evolving area of drug safety research. Resources such as HapMap, dbSNP, Ensembl, DNA Data Bank of Japan, DrugBank, Kyoto Encyclopedia of Genes and Genomes (KEGG), GenBank, and other parts of the International Nucleotide Sequence Database Collaboration have become crucial in personalized medicine, bioinformatics, and pharmacogenomics.

Lung cancer
Polymorphisms have been discovered in multiple XPD exons. XPD refers to "xeroderma pigmentosum group D" and is involved in a DNA repair mechanism used during DNA replication. XPD works by cutting and removing segments of DNA that have been damaged by exposures such as cigarette smoke and other inhaled environmental carcinogens. Asp312Asn and Lys751Gln are the two common polymorphisms of XPD that result in a change in a single amino acid. The Asn and Gln alleles have been associated with reduced DNA repair efficiency. Several studies have been conducted to see whether this diminished capacity to repair DNA is related to an increased risk of lung cancer. These studies examined the XPD gene in lung cancer patients of varying age, gender, race, and pack-years. The studies provided mixed results: some concluded that individuals homozygous for the Asn allele or homozygous for the Gln allele had an increased risk of developing lung cancer, while others found no statistically significant relationship between either allele and susceptibility to lung cancer among smokers. Research continues to be conducted to determine the relationship between XPD polymorphisms and lung cancer risk.

As a cornerstone of personalized medicine for cancers, sequence analysis is becoming increasingly important for understanding the specific mutations involved in an individual's cancer, as needed to select specific molecular targets such as mutated receptors, and for understanding the inherited polymorphisms that play important roles in diagnosis, prognosis, and treatment. One example is the treatment of leukemia with 6-mercaptopurine, where toxicity largely depends on polymorphisms in multiple genes involved in its metabolism.

Asthma
Asthma is an inflammatory disease of the lungs, and more than 100 loci have been identified as contributing to the development and severity of the condition. Some of these asthma-associated genes were identified by traditional linkage analysis, and others by genome-wide association studies (GWAS). A number of studies have looked into various polymorphisms of asthma-associated genes and how those polymorphisms interact with the carrier's environment. One example is the gene CD14, which is known to have a polymorphism that is associated with increased amounts of CD14 protein as well as reduced serum levels of IgE. A study of 624 children examined serum IgE levels in relation to the polymorphism in CD14. The study found that serum IgE levels in children with the C allele at CD14/-260 differed according to the type of allergens they were regularly exposed to. Children who were in regular contact with house pets showed higher serum levels of IgE, while children who were regularly exposed to stable animals showed lower serum levels of IgE. Continued research into gene-environment interactions may lead to more specialized treatment plans based on an individual's surroundings.