Glycan

The terms glycans and polysaccharides are defined by IUPAC as synonyms meaning "compounds consisting of a large number of monosaccharides linked glycosidically". However, in practice the term glycan may also be used to refer to the carbohydrate portion of a glycoconjugate, such as a glycoprotein, glycolipid, or a proteoglycan, even if the carbohydrate is only an oligosaccharide. Glycans usually consist solely of O-glycosidic linkages of monosaccharides. For example, cellulose is a glycan (or, to be more specific, a glucan) composed of β-1,4-linked D-glucose, and chitin is a glycan composed of β-1,4-linked N-acetyl-D-glucosamine. Glycans can be homo- or heteropolymers of monosaccharide residues, and can be linear or branched.

Glycans and proteins
Glycans can be found attached to proteins as in glycoproteins and proteoglycans. In general, they are found on the exterior surface of cells. O- and N-linked glycans are very common in eukaryotes but may also be found, although less commonly, in prokaryotes.

Introduction
N-linked glycans are attached in the endoplasmic reticulum to the nitrogen (N) in the side chain of asparagine (Asn) in the sequon. The sequon is an Asn-X-Ser or Asn-X-Thr sequence, where X is any amino acid except proline. The glycan may be composed of N-acetylgalactosamine, galactose, neuraminic acid, N-acetylglucosamine, fucose, mannose, and other monosaccharides.
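The sequon rule lends itself to a simple pattern search. The following is a minimal sketch in Python, assuming a protein given as a one-letter amino acid string; the function name and the example peptide are hypothetical and chosen only for illustration. A match marks a candidate site only, since not every sequon in a real protein is actually glycosylated.

```python
import re

# Sequon: Asn (N), then any residue except Pro (P), then Ser (S) or Thr (T).
# The zero-width lookahead lets overlapping sequons (e.g. "NNSS") all be found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, made up for illustration only.
example = "MKNGSAANPTLLNVTQ"
print(find_sequons(example))  # -> [3, 13]; the Asn at position 8 is skipped (X = Pro)
```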

Assembly
In eukaryotes, N-linked glycans are derived from a core 14-sugar unit assembled in the cytoplasm and endoplasmic reticulum. First, two N-acetylglucosamine residues are attached to dolichol monophosphate, a lipid, on the external (cytoplasmic) side of the endoplasmic reticulum membrane. Five mannose residues are then added to this structure. At this point, the partially finished core glycan is flipped across the endoplasmic reticulum membrane, so that it is now located within the reticular lumen. Assembly then continues within the endoplasmic reticulum, with the addition of four more mannose residues. Finally, three glucose residues are added to this structure. Following full assembly, the glycan is transferred en bloc by the glycosyltransferase oligosaccharyltransferase to a nascent peptide chain within the reticular lumen. The core structure of N-linked glycans thus consists of 14 residues (3 glucose, 9 mannose, and 2 N-acetylglucosamine).
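The residue count can be checked by tallying the assembly steps in order. The sketch below is illustrative bookkeeping only; the step list and the compartment labels are a paraphrase of the description above, not data from any database.

```python
from collections import Counter

# Assembly steps in order: (where the step happens, sugar added, number of residues).
# The location labels are informal shorthand for this sketch.
STEPS = [
    ("cytoplasmic side of ER membrane", "N-acetylglucosamine", 2),
    ("cytoplasmic side of ER membrane", "mannose", 5),
    ("ER lumen", "mannose", 4),
    ("ER lumen", "glucose", 3),
]


def core_composition(steps):
    """Sum the residues of each sugar over all assembly steps."""
    totals = Counter()
    for _location, sugar, count in steps:
        totals[sugar] += count
    return totals


composition = core_composition(STEPS)
print(dict(composition))          # residues per sugar
print(sum(composition.values()))  # total number of residues in the core
```

The totals come to 2 N-acetylglucosamine, 9 mannose, and 3 glucose residues, which is the 14-residue core described in the text.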

Image: https://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=glyco.figgrp.469

Dark squares are N-acetylglucosamine; light circles are mannose; dark triangles are glucose.

Processing, modification, and diversity
Once transferred to the nascent peptide chain, N-linked glycans generally undergo extensive processing reactions in which the three glucose residues are removed, along with several mannose residues, depending on the N-linked glycan in question. The removal of the glucose residues is dependent on proper protein folding. Glucose trimming takes place in the endoplasmic reticulum, and the later processing reactions occur in the Golgi apparatus. Modification reactions may involve the addition of a phosphate or acetyl group onto the sugars, or the addition of new sugars, such as neuraminic acid. Processing and modification of N-linked glycans within the Golgi do not follow a linear pathway. As a result, many different variations of N-linked glycan structure are possible, depending on enzyme activity in the Golgi.

Functions and importance
N-linked glycans are extremely important for proper protein folding in eukaryotic cells. Chaperone proteins in the endoplasmic reticulum, such as calnexin and calreticulin, bind to the core N-linked glycan once trimming has left a single glucose residue on it. These chaperone proteins then aid the folding of the protein to which the glycan is attached. Following proper folding, the remaining glucose residue is removed, and the glycan moves on to further processing reactions. If the protein fails to fold properly, a glucose residue is reattached, allowing the protein to re-associate with the chaperones. This cycle may repeat several times until the protein reaches its proper conformation. If a protein repeatedly fails to fold properly, it is exported from the endoplasmic reticulum and degraded by cytoplasmic proteases.

N-linked glycans also contribute to protein folding through steric effects. For example, cysteine residues in the peptide may be temporarily blocked from forming disulfide bonds with other cysteine residues because of the size of a nearby glycan. The presence of an N-linked glycan therefore allows the cell to control which cysteine residues will form disulfide bonds.

N-linked glycans also play an important role in cell-cell interactions. For example, tumour cells make abnormal N-linked glycans. These are recognized by the CD337 receptor on natural killer cells as a sign that the cell in question is cancerous.

Within the immune system, the N-linked glycans on an immune cell's surface help dictate the migration pattern of the cell; for example, immune cells that migrate to the skin have specific glycosylations that favor homing to that site. The glycosylation patterns on the various immunoglobulins, including IgE, IgM, IgD, IgA, and IgG, bestow them with unique effector functions by altering their affinities for Fc and other immune receptors. Glycans may also be involved in "self" and "non-self" discrimination, which may be relevant to the pathophysiology of various autoimmune diseases, including rheumatoid arthritis and type 1 diabetes.

The targeting of degradative lysosomal enzymes is also accomplished by N-linked glycans. The modification of an N-linked glycan with a mannose-6-phosphate residue serves as a signal that the protein to which this glycan is attached should be moved to the lysosome. This recognition and trafficking of lysosomal enzymes by the presence of mannose-6-phosphate is accomplished by two proteins: CI-MPR (cation-independent mannose-6-phosphate receptor) and CD-MPR (cation-dependent mannose-6-phosphate receptor).

Introduction
In eukaryotes, O-linked glycans are assembled one sugar at a time on a serine or threonine residue of a peptide chain in the Golgi apparatus. Unlike for N-linked glycans, no consensus sequence is known. However, the placement of a proline residue at either -1 or +3 relative to the serine or threonine is favourable for O-linked glycosylation.
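The proline preference can be expressed as a simple positional check. The sketch below is a heuristic illustration only, not a predictor of O-linked glycosylation; the function name and the example peptide are hypothetical.

```python
def favourable_o_sites(protein: str) -> list[int]:
    """Return 1-based positions of Ser/Thr residues with Pro at -1 or +3."""
    seq = protein.upper()
    hits = []
    for i, residue in enumerate(seq):
        if residue not in ("S", "T"):
            continue
        pro_before = i >= 1 and seq[i - 1] == "P"           # position -1
        pro_after = i + 3 < len(seq) and seq[i + 3] == "P"  # position +3
        if pro_before or pro_after:
            hits.append(i + 1)
    return hits


# Hypothetical peptide, made up for illustration only.
print(favourable_o_sites("APTGAPSAAS"))  # -> [3, 7]
```

The final serine in the example is not reported because neither its -1 nor its +3 position holds a proline.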

Assembly
The first monosaccharide attached in the synthesis of O-linked glycans is N-acetyl-galactosamine. After this, several different pathways are possible. A Core 1 structure is generated by the addition of galactose. A Core 2 structure is generated by the addition of N-acetyl-glucosamine to the N-acetyl-galactosamine of the Core 1 structure. Core 3 structures are generated by the addition of a single N-acetyl-glucosamine to the original N-acetyl-galactosamine. Core 4 structures are generated by the addition of a second N-acetyl-glucosamine to the Core 3 structure. Other core structures are possible, though less common.
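The relationship between the four cores can be written as a small tree in which each structure is its parent plus one added sugar. This is a minimal sketch of the sugar content only; it does not model which hydroxyl group each sugar is linked to, and the names used are chosen for this example.

```python
GALNAC = "N-acetyl-galactosamine"
GAL = "galactose"
GLCNAC = "N-acetyl-glucosamine"

# Each entry: structure -> (parent structure, sugar added to the parent).
# "initial" is the first peptide-linked sugar and has no parent.
CORES = {
    "initial": (None, GALNAC),
    "Core 1": ("initial", GAL),
    "Core 2": ("Core 1", GLCNAC),
    "Core 3": ("initial", GLCNAC),
    "Core 4": ("Core 3", GLCNAC),
}


def sugars_in(core: str) -> list[str]:
    """List the sugars of a core in the order they are added."""
    parent, added = CORES[core]
    inherited = [] if parent is None else sugars_in(parent)
    return inherited + [added]


for name in ("Core 1", "Core 2", "Core 3", "Core 4"):
    print(name, sugars_in(name))
```

Core 2 and Core 4 both contain three sugars, but they differ in content: Core 2 includes a galactose, whereas Core 4 carries two N-acetyl-glucosamine residues on the initial N-acetyl-galactosamine.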

Images:

https://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=glyco.figgrp.561 : Core 1 and Core 2 generation. White square = N-acetyl-galactosamine; black circle = galactose; black square = N-acetyl-glucosamine. Note: This diagram contains a mistake. The bottom square should be white in every image, not black.

https://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=glyco.figgrp.562 : Core 3 and Core 4 generation.

A common structural theme in O-linked glycans is the addition of polylactosamine units to the various core structures. These are formed by the repetitive addition of galactose and N-acetyl-glucosamine units. Polylactosamine chains on O-linked glycans are often capped by the addition of a sialic acid residue (similar to neuraminic acid). If a fucose residue is also added to the residue next to the penultimate one, a Sialyl-Lewis X (SLex) structure is formed.

Functions and importance
Sialyl-Lewis X is important in ABO blood antigen determination.

SLex is also important for a proper immune response. P-selectin release from Weibel-Palade bodies in blood vessel endothelial cells can be induced by a number of factors. One such factor is the response of the endothelial cell to certain bacterial molecules, such as peptidoglycan. P-selectin binds to the SLex structure that is present on neutrophils in the bloodstream and helps to mediate the extravasation of these cells into the surrounding tissue during infection.

O-linked glycans, in particular mucin, have been found to be important in developing normal intestinal microflora. Certain strains of intestinal bacteria bind specifically to mucin, allowing them to colonize the intestine.

Examples of O-linked glycoproteins are:
 * Glycophorin, a protein in erythrocyte cell membranes
 * Mucin, a protein in saliva involved in formation of dental plaque
 * Notch, a transmembrane receptor involved in development and cell fate decisions
 * Thrombospondin
 * Factor VII
 * Factor IX
 * Urinary type plasminogen activator

Glycosaminoglycans
Glycosaminoglycans (GAGs) are another type of cellular glycan. They comprise 2-aminosugars linked in an alternating fashion with uronic acids, and include polymers such as heparin, heparan sulfate, chondroitin, keratan and dermatan. Some glycosaminoglycans, such as heparan sulfate, are found attached to the cell surface, where they are linked through a tetrasaccharide linker via a xylosyl residue to a protein (forming a glycoprotein or proteoglycan).

Glycoscience
A 2012 report from the U.S. National Research Council calls for a new focus on glycoscience, a field that explores the structures and functions of glycans and promises great advances in areas as diverse as medicine, energy generation, and materials science. Until now, glycans have received little attention from the research community due to a lack of tools to probe their often complex structures and properties. The report presents a roadmap for transforming glycoscience from a field dominated by specialists to a widely studied and integrated discipline.

Glycans and lipids
See glycolipids

GPI-Anchors
See glycophosphatidylinositol

Tools used for glycan research
The following are examples of the commonly used techniques in glycan analysis:

High-resolution mass spectrometry (MS) and high-performance liquid chromatography (HPLC)
The most commonly applied methods are MS and HPLC, in which the glycan part is cleaved either enzymatically or chemically from the target and subjected to analysis. Glycolipids can be analyzed directly, without separation of the lipid component.

N-glycans from glycoproteins are analyzed routinely by high-performance liquid chromatography (reversed-phase, normal-phase and ion-exchange HPLC) after tagging the reducing end of the sugars with a fluorescent compound (reductive labeling). A large variety of labels has been introduced in recent years; 2-aminobenzamide (AB), anthranilic acid (AA), 2-aminopyridine (PA), 2-aminoacridone (AMAC) and 3-(acetylamino)-6-aminoacridine (AA-Ac) are just a few of them. Different labels have to be used for different ESI modes and MS systems.

O-glycans are usually analysed without any tags, because the chemical release conditions prevent them from being labeled.

Fractionated glycans from high-performance liquid chromatography (HPLC) instruments can be further analyzed by MALDI-TOF-MS(MS) to obtain more information about structure and purity. Sometimes glycan pools are analyzed directly by mass spectrometry without prefractionation, although discriminating between isobaric glycan structures is then more challenging and not always possible. Nevertheless, direct MALDI-TOF-MS analysis can give a fast and straightforward picture of the glycan pool.
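The difficulty with isobaric structures can be illustrated with a short mass calculation. The sketch below assumes a neutral, underivatized glycan with a free reducing end and uses commonly tabulated monoisotopic residue masses rounded to four decimal places; the function and table names are chosen for this example. Because mannose, galactose and glucose are all hexoses, they contribute the same residue mass, so exchanging one for another leaves the total unchanged.

```python
# Approximate monoisotopic residue masses in Da (sugar minus one water).
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "dHex": 146.0579, "NeuAc": 291.0954}
WATER = 18.0106  # added once for the free reducing end

# Mass spectrometry alone sees only the residue class, not the specific sugar.
RESIDUE_CLASS = {
    "mannose": "Hex",
    "galactose": "Hex",
    "glucose": "Hex",
    "N-acetylglucosamine": "HexNAc",
    "N-acetylgalactosamine": "HexNAc",
    "fucose": "dHex",
    "N-acetylneuraminic acid": "NeuAc",
}


def glycan_mass(sugars: dict[str, int]) -> float:
    """Monoisotopic mass of a neutral glycan from its sugar counts."""
    residues = sum(RESIDUE_MASS[RESIDUE_CLASS[s]] * n for s, n in sugars.items())
    return residues + WATER


# Nine mannose plus two N-acetylglucosamine residues.
print(round(glycan_mass({"mannose": 9, "N-acetylglucosamine": 2}), 4))

# Same mass, different sugars: these two compositions are isobaric.
a = glycan_mass({"mannose": 3, "N-acetylglucosamine": 2})
b = glycan_mass({"mannose": 2, "galactose": 1, "N-acetylglucosamine": 2})
print(a == b)
```

Differences in linkage position or branching are likewise invisible to a mass-only measurement, which is why prefractionation or fragmentation is used to tell such structures apart.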

In recent years, high-performance liquid chromatography coupled online to mass spectrometry has become very popular. When porous graphitic carbon is chosen as the stationary phase for liquid chromatography, even non-derivatized glycans can be analyzed. Detection is done by mass spectrometry, but electrospray ionisation (ESI) is used more frequently than MALDI-MS.

Multiple reaction monitoring (MRM)
MRM has been used extensively in metabolomics and proteomics, and its high sensitivity and linear response over a wide dynamic range make it especially suited for glycan biomarker research and discovery. MRM is performed on a triple quadrupole (QqQ) instrument, which is set to select a predetermined precursor ion in the first quadrupole, fragment it in the collision quadrupole, and detect a predetermined fragment ion in the third quadrupole. It is a non-scanning technique, wherein each transition is detected individually and the detection of multiple transitions occurs concurrently in duty cycles. This technique is being used to characterize the immune glycome.
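The logic of a transition can be sketched as a pair of mass filters that must both be passed. The code below is a conceptual illustration only and does not reflect any instrument's control software; the class and function names, the tolerance, and all m/z values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    precursor_mz: float  # ion selected in the first quadrupole
    fragment_mz: float   # ion selected in the third quadrupole


def passes(t: Transition, precursor_mz: float, fragment_mz: float, tol: float = 0.5) -> bool:
    """True if an observed precursor/fragment pair passes both mass filters."""
    return (abs(t.precursor_mz - precursor_mz) <= tol
            and abs(t.fragment_mz - fragment_mz) <= tol)


def detect(transitions, ion_pairs, tol: float = 0.5) -> list[str]:
    """Names of transitions matched by at least one (precursor, fragment) pair."""
    return [t.name for t in transitions
            if any(passes(t, p, f, tol) for p, f in ion_pairs)]


# Hypothetical transition list and observed ion pairs, for illustration only.
transitions = [
    Transition("glycan A", 933.3, 204.1),
    Transition("glycan B", 1095.4, 366.1),
]
observed = [(933.4, 204.1), (1095.4, 500.0)]
print(detect(transitions, observed))  # -> ['glycan A']
```

The second observed pair has the right precursor but the wrong fragment, so it is rejected. Requiring both filters to match is what gives the technique its selectivity.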

Table 1: Advantages and disadvantages of mass spectrometry in glycan analysis

Arrays
Lectin and antibody arrays provide high-throughput screening of many samples containing glycans. This method uses either naturally occurring lectins or artificial monoclonal antibodies, which are immobilized on a chip and incubated with a fluorescently labeled glycoprotein sample.

Glycan arrays, such as those offered by the Consortium for Functional Glycomics and Z Biotech LLC, contain carbohydrate compounds that can be screened with lectins or antibodies to define carbohydrate specificity and identify ligands.

Metabolic and covalent labeling of glycans
Metabolic labeling of glycans can be used as a way to detect glycan structures. A well-known strategy involves the use of azide-labeled sugars which can be reacted using the Staudinger ligation. This method has been used for in vitro and in vivo imaging of glycans.

Tools for glycoproteins
Complete structural analysis of complex glycans by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy is difficult. However, the structures of the binding sites of numerous lectins, enzymes and other carbohydrate-binding proteins have revealed a wide variety of structural bases for glycome function. The purity of test samples is obtained through chromatography (affinity chromatography, etc.) and analytical electrophoresis (polyacrylamide gel electrophoresis (PAGE), capillary electrophoresis, affinity electrophoresis, etc.).

Resources

 * National Center for Functional Glycomics (NCFG): The focus of the NCFG is development in the glycosciences, with an emphasis on exploring the molecular mechanisms of glycan recognition by proteins important in human biology and disease. It offers a number of resources for glycan analysis, as well as training in glycomics and protocols for glycan analysis.
 * GlyTouCan, Glycan structure repository
 * Glycosciences.DE, German glycan database
 * Carbohydrate Structure Database, Russian glycan database
 * UniCarbKB, Australian glycan database
 * GlycoSuiteDB, glycan database by Swiss Institute of Bioinformatics
 * GlyGen, NIH funded glycoinformatics resource
 * The Consortium for Functional Glycomics (CFG) is a non-profit research initiative comprising eight core facilities and 500+ participating investigators that work together to develop resources and services and make them available to the scientific community free of charge. The data generated by these resources are captured in databases accessible through the Functional Glycomics Gateway, a web resource maintained through a partnership between the CFG and Nature Publishing Group.
 * Transforming Glycoscience: A Roadmap for the Future by the U.S. National Research Council. This site provides information about the U.S. National Research Council's reports and workshops on glycoscience.