Glycoprotein



Glycoproteins are proteins which contain oligosaccharide chains covalently attached to amino acid side-chains. The carbohydrate is attached to the protein in a cotranslational or posttranslational modification. This process is known as glycosylation. Secreted extracellular proteins are often glycosylated.

In proteins that have segments extending extracellularly, the extracellular segments are also often glycosylated. Glycoproteins are also often important integral membrane proteins, where they play a role in cell–cell interactions. It is important to distinguish endoplasmic reticulum-based glycosylation of the secretory system from reversible cytosolic-nuclear glycosylation. Glycoproteins of the cytosol and nucleus can be modified through the reversible addition of a single GlcNAc residue. This modification is considered reciprocal to phosphorylation, and it is likely to act as an additional regulatory mechanism that controls phosphorylation-based signalling. In contrast, classical secretory glycosylation can be structurally essential. For example, inhibition of asparagine-linked (N-linked) glycosylation can prevent proper glycoprotein folding, and full inhibition can be toxic to an individual cell. Perturbation of glycan processing (enzymatic removal or addition of carbohydrate residues to the glycan), which occurs in both the endoplasmic reticulum and the Golgi apparatus, is by comparison dispensable for isolated cells (as evidenced by survival with glycosidase inhibitors) but can lead to human disease (congenital disorders of glycosylation) and can be lethal in animal models. It is therefore likely that the fine processing of glycans is important for endogenous functionality, such as cell trafficking, but that this is likely to have been secondary to its role in host-pathogen interactions. A famous example of this latter effect is the ABO blood group system.

Though there are different types of glycoproteins, the most common are N-linked and O-linked glycoproteins. These two types are distinguished by the structural differences that give them their names. Glycoproteins vary greatly in composition and include many different compounds, such as antibodies and hormones. Because of their wide array of functions within the body, interest in glycoprotein synthesis for medical use has increased. There are now several methods to synthesize glycoproteins, including recombination and glycosylation of proteins.

Glycosylation is also known to occur on nucleocytoplasmic proteins in the form of O-GlcNAc.

Types of glycosylation
There are several types of glycosylation, although the first two are the most common.
 * In N-glycosylation, sugars are attached to nitrogen, typically on the amide side-chain of asparagine.
 * In O-glycosylation, sugars are attached to oxygen, typically on serine or threonine, but also on tyrosine or non-canonical amino acids such as hydroxylysine and hydroxyproline.
 * In P-glycosylation, sugars are attached to phosphorus on a phosphoserine.
 * In C-glycosylation, sugars are attached directly to carbon, such as in the addition of mannose to tryptophan.
 * In S-glycosylation, a beta-GlcNAc is attached to the sulfur atom of a cysteine residue.
 * In glypiation, a GPI glycolipid is attached to the C-terminus of a polypeptide, serving as a membrane anchor.
 * In glycation, also known as non-enzymatic glycosylation, sugars are covalently bonded to a protein or lipid molecule, without the controlling action of an enzyme, but through a Maillard reaction.

Monosaccharides
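Monosaccharides commonly found in eukaryotic glycoproteins include glucose, galactose, mannose, fucose, xylose, N-acetylglucosamine, N-acetylgalactosamine, and N-acetylneuraminic acid (a sialic acid).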

The sugar group(s) can assist in protein folding, improve proteins' stability and are involved in cell signalling.

Structure
The critical structural element of all glycoproteins is an oligosaccharide bonded covalently to a protein. There are 10 common monosaccharides in mammalian glycans: glucose (Glc), fucose (Fuc), xylose (Xyl), mannose (Man), galactose (Gal), N-acetylglucosamine (GlcNAc), glucuronic acid (GlcA), iduronic acid (IdoA), N-acetylgalactosamine (GalNAc), and the sialic acid 5-N-acetylneuraminic acid (Neu5Ac). The glycans are attached to specific sites on the protein's amino acid chain.

The two most common types of glycoprotein are N-linked and O-linked. An N-linked glycoprotein has glycans bonded to the side-chain nitrogen of an asparagine residue within the protein sequence. An O-linked glycoprotein has the sugar bonded to an oxygen atom of a serine or threonine residue in the protein.

Glycoprotein size and composition vary widely, with carbohydrate accounting for 1% to 70% of the total mass of the glycoprotein. In the body, glycoproteins appear in the blood, the extracellular matrix, or on the outer surface of the plasma membrane, and they make up a large portion of the proteins secreted by eukaryotic cells. Their functions are very broad, ranging from antibodies to hormones.
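The carbohydrate share of a glycoprotein's mass can be estimated from its glycan compositions. The following is a minimal sketch in Python, not an established tool: the function names, the 30,000 Da polypeptide, and the glycan composition in the example are hypothetical, and the residue masses are approximate average values for monosaccharides as incorporated into a glycan (the free sugar minus one water).

```python
# Approximate average residue masses in daltons (free sugar minus one water).
# The composition keys are illustrative shorthand, not taken from the article.
RESIDUE_MASS = {
    "Hex": 162.14,     # hexoses such as Glc, Man, Gal
    "HexNAc": 203.19,  # GlcNAc, GalNAc
    "dHex": 146.14,    # deoxyhexose, e.g. Fuc
    "Neu5Ac": 291.26,  # N-acetylneuraminic acid
}


def glycan_mass(composition):
    """Mass added to the protein by one glycan, given residue counts."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def carbohydrate_fraction(polypeptide_mass, glycans):
    """Fraction of the total glycoprotein mass contributed by carbohydrate."""
    carbohydrate = sum(glycan_mass(g) for g in glycans)
    return carbohydrate / (polypeptide_mass + carbohydrate)


if __name__ == "__main__":
    # Hypothetical example: a 30 kDa polypeptide carrying two identical glycans.
    glycan = {"Hex": 5, "HexNAc": 4, "Neu5Ac": 2}
    fraction = carbohydrate_fraction(30000.0, [glycan, glycan])
    print(f"carbohydrate fraction: {fraction:.1%}")
```

Under these assumptions the example protein falls toward the lower end of the 1% to 70% range; heavily glycosylated proteins such as mucins would need many more glycans per chain to approach the upper end.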

Glycomics
Glycomics is the study of the carbohydrate components of cells. Though not exclusive to glycoproteins, it can reveal more information about different glycoproteins and their structure. One of the purposes of this field of study is to determine which proteins are glycosylated and where in the amino acid sequence the glycosylation occurs. Historically, mass spectrometry has been used to identify the structure of glycoproteins and characterize the carbohydrate chains attached.
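As a rough illustration of locating possible glycosylation sites in a sequence, the sketch below scans a one-letter amino acid sequence for the N-glycosylation sequon Asn-X-Ser/Thr, where X is any residue except proline. The sequon rule is not stated in this article and is brought in here as an assumption; the function name and the example sequence are hypothetical. A matching sequon marks only a candidate site, since occupancy must be confirmed experimentally (for example by mass spectrometry), and O-linked sites have no comparably simple motif.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps the match to the Asn itself so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glycosylation_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence: N-G-T and N-K-S qualify, N-P-S does not.
    print(find_n_glycosylation_sequons("ANGTNPSANKS"))
```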

Examples
The interactions of the oligosaccharide chains serve several functions. First, they aid in quality control by identifying misfolded proteins. The oligosaccharide chains also change the solubility and polarity of the proteins that they are bonded to. For example, if the oligosaccharide chains are negatively charged and dense enough around the protein, they can repel proteolytic enzymes from the bonded protein. This diversity in interactions gives rise to different types of glycoproteins with different structures and functions.

One example of glycoproteins found in the body is the mucins, which are secreted in the mucus of the respiratory and digestive tracts. The sugars attached to mucins give them considerable water-holding capacity and also make them resistant to proteolysis by digestive enzymes.

Glycoproteins are important for white blood cell recognition. Examples of glycoproteins in the immune system are:
 * molecules such as antibodies (immunoglobulins), which interact directly with antigens.
 * molecules of the major histocompatibility complex (or MHC), which are expressed on the surface of cells and interact with T cells as part of the adaptive immune response.
 * sialyl Lewis X antigen on the surface of leukocytes.
 * H antigen of the ABO blood compatibility antigens.

Other examples of glycoproteins include:
 * gonadotropins (luteinizing hormone and follicle-stimulating hormone)
 * glycoprotein IIb/IIIa, an integrin found on platelets that is required for normal platelet aggregation and adherence to the endothelium.
 * components of the zona pellucida, which surrounds the oocyte, and is important for sperm-egg interaction.
 * structural glycoproteins, which occur in connective tissue. These help bind together the fibers, cells, and ground substance of connective tissue. They may also help components of the tissue bind to inorganic substances, such as calcium in bone.
 * Glycoprotein-41 (gp41) and glycoprotein-120 (gp120), which are HIV viral coat proteins.
 * Miraculin, a glycoprotein extracted from the berry of Synsepalum dulcificum, which alters human tongue receptors so that sour foods are perceived as sweet.

Soluble glycoproteins often show a high viscosity, for example, in egg white and blood plasma.

Variant surface glycoproteins allow the Trypanosoma parasite that causes sleeping sickness to escape the immune response of the host.

The viral spike of the human immunodeficiency virus is heavily glycosylated. Approximately half the mass of the spike is carbohydrate, and the glycans act to limit antibody recognition because they are assembled by the host cell and so are largely 'self'. Over time, some patients can develop antibodies that recognise the HIV glycans, and almost all so-called 'broadly neutralising antibodies' (bnAbs) recognise some glycans. This is possible mainly because the unusually high density of glycans hinders normal glycan maturation, so the glycans are trapped in the premature, high-mannose state. This provides a window for immune recognition. In addition, as these glycans are much less variable than the underlying protein, they have emerged as promising targets for vaccine design.

P-glycoprotein is important in antitumor research because of its ability to block the effects of antitumor drugs. P-glycoprotein, or multidrug transporter (MDR1), is a type of ABC transporter that transports compounds out of cells. The exported compounds include drugs intended to act inside the cell, which decreases drug effectiveness. For example, P-glycoprotein decreases the accumulation of anti-cancer drugs within tumor cells, limiting the effectiveness of chemotherapies used to treat cancer. Inhibiting this behavior would therefore reduce P-glycoprotein interference in drug delivery, making it an important topic in drug discovery.

Hormones
Hormones that are glycoproteins include:
 * Follicle-stimulating hormone
 * Luteinizing hormone
 * Thyroid-stimulating hormone
 * Human chorionic gonadotropin
 * Alpha-fetoprotein
 * Erythropoietin (EPO)

Distinction between glycoproteins and proteoglycans
Quoting from the IUPAC recommendations:

"A glycoprotein is a compound containing carbohydrate (or glycan) covalently linked to protein. The carbohydrate may be in the form of a monosaccharide, disaccharide(s), oligosaccharide(s), polysaccharide(s), or their derivatives (e.g. sulfo- or phospho-substituted). One, a few, or many carbohydrate units may be present. Proteoglycans are a subclass of glycoproteins in which the carbohydrate units are polysaccharides that contain amino sugars. Such polysaccharides are also known as glycosaminoglycans."

Analysis
A variety of methods are used in the detection, purification, and structural analysis of glycoproteins, among them mass spectrometry.

Synthesis
The glycosylation of proteins has an array of different effects, from influencing cell-to-cell communication to changing the thermal stability and the folding of proteins. Because of these properties, glycoproteins can be used in many therapies. A better understanding of glycoproteins and their synthesis allows them to be produced to treat cancer, Crohn's disease, high cholesterol, and more.

The process of glycosylation (binding a carbohydrate to a protein) is a co-translational or post-translational modification, meaning it happens during or after the production of the protein. Roughly half of all human proteins undergo glycosylation, which heavily influences their properties and functions. Within the cell, glycosylation of the secretory system begins in the endoplasmic reticulum, and glycan processing continues in the Golgi apparatus.

Recombination
There are several techniques for the assembly of glycoproteins. One technique utilizes recombination. The first consideration for this method is the choice of host, as many factors can influence the success of glycoprotein recombination, such as cost, the host environment, and the efficacy of the process. Some examples of host cells include E. coli, yeast, plant cells, insect cells, and mammalian cells. Of these options, mammalian cells are the most common because their use avoids challenges faced with other host cells, such as different glycan structures, shorter half-life, and potential unwanted immune responses in humans. Of mammalian cells, the most common cell line used for recombinant glycoprotein production is the Chinese hamster ovary line. However, as technologies develop, the most promising cell lines for recombinant glycoprotein production are human cell lines.

Glycosylation
The formation of the link between the glycan and the protein is a key element of the synthesis of glycoproteins. The most common method of glycosylation for N-linked glycoproteins is the reaction between a protected glycan and a protected asparagine. Similarly, an O-linked glycoprotein can be formed through the reaction of a glycosyl donor with a protected serine or threonine. These two methods are examples of natural linkage. However, there are also methods that produce unnatural linkages. Some of these include ligation and a reaction between a serine-derived sulfamidate and thiohexoses in water. Once this linkage is complete, the amino acid sequence can be extended using solid-phase peptide synthesis.