O-linked glycosylation

O-linked glycosylation is the attachment of a sugar molecule to the oxygen atom of serine (Ser) or threonine (Thr) residues in a protein. O-glycosylation is a post-translational modification that occurs after the protein has been synthesised. In eukaryotes, it occurs in the endoplasmic reticulum, Golgi apparatus and occasionally in the cytoplasm; in prokaryotes, it occurs in the cytoplasm. Several different sugars can be added to the serine or threonine, and they affect the protein in different ways by changing protein stability and regulating protein activity. O-glycans, which are the sugars added to the serine or threonine, have numerous functions throughout the body, including trafficking of cells in the immune system, allowing recognition of foreign material, controlling cell metabolism and providing cartilage and tendon flexibility. Because of the many functions they have, changes in O-glycosylation are important in many diseases including cancer, diabetes and Alzheimer's. O-glycosylation occurs in all domains of life, including eukaryotes, archaea and a number of pathogenic bacteria including Burkholderia cenocepacia, Neisseria gonorrhoeae and Acinetobacter baumannii.

O-N-acetylgalactosamine (O-GalNAc)


Addition of N-acetylgalactosamine (GalNAc) to a serine or threonine occurs in the Golgi apparatus, after the protein has been folded. The process is performed by enzymes known as GalNAc transferases (GALNTs), of which there are 20 different types. The initial O-GalNAc structure can be modified by the addition of other sugars, or other compounds such as methyl and acetyl groups. These modifications produce the 8 core structures known to date. Different cells express different glycosyltransferases, the enzymes that add further sugars, so structures vary from cell to cell. Common sugars added include galactose, N-acetylglucosamine, fucose and sialic acid. These sugars can also be modified by the addition of sulfates or acetyl groups.



Biosynthesis
GalNAc is added onto a serine or threonine residue from a precursor molecule, through the activity of a GalNAc transferase enzyme. This precursor is necessary so that the sugar can be transported to where it will be added to the protein. The specific residue onto which GalNAc will be attached is not defined, because there are numerous enzymes that can add the sugar and each one will favour different residues. However, there are often proline (Pro) residues near the threonine or serine.
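
As a rough illustration of the proline observation above, the following Python sketch lists serine and threonine positions that have a proline within a few residues. The function name, the window of three residues and the example peptide are all assumptions chosen for illustration; this is a toy heuristic, not a validated predictor of O-GalNAc sites.

```python
def ser_thr_near_proline(sequence, window=3):
    """Return (1-based position, residue) for each Ser/Thr with a Pro nearby.

    Illustrative heuristic only: `window` is an arbitrary assumption,
    not a value taken from the text.
    """
    sequence = sequence.upper()
    hits = []
    for i, residue in enumerate(sequence):
        if residue not in "ST":
            continue
        start = max(0, i - window)
        # Residues on either side of position i, excluding the residue itself
        neighbourhood = sequence[start:i] + sequence[i + 1:i + 1 + window]
        if "P" in neighbourhood:
            hits.append((i + 1, residue))
    return hits


# Hypothetical peptide, not taken from a real protein
example = "GATPAPGSTAPKLLS"
print(ser_thr_near_proline(example))  # [(3, 'T'), (8, 'S'), (9, 'T')]
```

The final serine is not reported because no proline lies within three residues of it. Because each GalNAc transferase favours different residues, a single rule like this can only hint at where modification is plausible.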

Once this initial sugar has been added, other glycosyltransferases can catalyse the addition of additional sugars. Two of the most common structures formed are Core 1 and Core 2. Core 1 is formed by the addition of a galactose sugar onto the initial GalNAc. Core 2 consists of a Core 1 structure with an additional N-acetylglucosamine (GlcNAc) sugar. A poly-N-acetyllactosamine structure can be formed by the alternating addition of GlcNAc and galactose sugars onto the GalNAc sugar.

Terminal sugars on O-glycans are important in recognition by lectins and play a key role in the immune system. Addition of fucose sugars by fucosyltransferases forms Lewis epitopes and the scaffold for blood group determinants. Addition of a fucose alone creates the H-antigen, present in people with blood type O. By adding a galactose onto this structure, the B-antigen of blood group B is created. Alternatively, adding a GalNAc sugar will create the A-antigen for blood group A.
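
The stepwise construction described above (Core 1, Core 2 and the ABO antigens) can be sketched as a simple data model. This is a deliberately simplified illustration: glycans are treated as flat lists of sugar names, linkage positions and branching are ignored, and the names `add_sugar`, `ABO_TERMINAL_SUGAR` and `abo_antigen` are hypothetical.

```python
def add_sugar(glycan, sugar):
    """Return a new glycan with `sugar` appended; the input list is not mutated."""
    return glycan + [sugar]


# Sugars are listed in order of addition, starting at the GalNAc on Ser/Thr
initial = ["GalNAc"]
core1 = add_sugar(initial, "Gal")    # Core 1: galactose added to the initial GalNAc
core2 = add_sugar(core1, "GlcNAc")   # Core 2: Core 1 plus an additional GlcNAc

# Sugar added on top of the H-antigen for each ABO antigen, as described in the text
ABO_TERMINAL_SUGAR = {"H": None, "A": "GalNAc", "B": "Gal"}


def abo_antigen(added_sugar):
    """Map the sugar added to the H-antigen (or None) to the resulting antigen."""
    for antigen, sugar in ABO_TERMINAL_SUGAR.items():
        if sugar == added_sugar:
            return antigen
    raise ValueError(f"no ABO antigen is formed by adding {added_sugar!r}")


print(core1)                  # ['GalNAc', 'Gal']
print(core2)                  # ['GalNAc', 'Gal', 'GlcNAc']
print(abo_antigen(None))      # H
print(abo_antigen("Gal"))     # B
print(abo_antigen("GalNAc"))  # A
```

A flat list cannot represent branched structures, so a tree would be needed to model real glycans faithfully.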



Functions
O-GalNAc sugars are important in a variety of processes, including leukocyte circulation during an immune response, fertilisation, and protection against invading microbes.

O-GalNAc sugars are common on membrane glycoproteins, where they help increase rigidity of the region close to the membrane so that the protein extends away from the surface. For example, the low-density lipoprotein (LDL) receptor is projected from the cell surface by a region rigidified by O-glycans.

For leukocytes of the immune system to move into infected tissue, they have to interact with cells there through receptors. Leukocytes express ligands on their cell surface to allow this interaction to occur. P-selectin glycoprotein ligand-1 (PSGL-1) is such a ligand, and carries many O-glycans that are necessary for its function. O-glycans near the membrane maintain the elongated structure, and a terminal sialyl Lewis X (sLex) epitope is necessary for interactions with the receptor.

Mucins are a group of heavily O-glycosylated proteins that line the gastrointestinal and respiratory tracts to protect these regions from infection. Mucins are negatively charged, which allows them to interact with water and prevent it from evaporating. This is important in their protective function as it lubricates the tracts so bacteria cannot bind and infect the body. Changes in mucins are important in numerous diseases, including cancer and inflammatory bowel disease. Absence of O-glycans on mucin proteins changes their 3D shape dramatically and often prevents correct function.

O-N-acetylglucosamine (O-GlcNAc)
Addition of N-acetylglucosamine (GlcNAc) to serine and threonine residues usually occurs on cytoplasmic and nuclear proteins that remain in the cell, whereas O-GalNAc modifications usually occur on proteins that will be secreted. O-GlcNAc modifications were only recently discovered, but the number of proteins with known O-GlcNAc modifications is increasing rapidly. It is the first example of glycosylation that does not occur on secretory proteins.



O-GlcNAcylation differs from other O-glycosylation processes because there are usually no sugars added onto the core structure and because the sugar can be attached to or removed from a protein several times. This addition and removal occurs in cycles and is performed by two very specific enzymes: O-GlcNAc is added by O-GlcNAc transferase (OGT) and removed by O-GlcNAcase (OGA). Because only two enzymes control this modification, they are very tightly regulated and depend on many other factors.

Because O-GlcNAc can be added and removed, it is known as a dynamic modification and has many similarities to phosphorylation. O-GlcNAcylation and phosphorylation can occur on the same threonine and serine residues, suggesting a complex relationship between these modifications that can affect many functions of the cell. The modification affects processes such as the cell's response to cellular stress, the cell cycle, protein stability and protein turnover. It may be implicated in neurodegenerative diseases such as Parkinson's and late-onset Alzheimer's, and has been found to play a role in diabetes.

Additionally, O-GlcNAcylation can enhance the Warburg effect, which is defined as the change that occurs in the metabolism of cancer cells to favour their growth. Because O-GlcNAcylation and phosphorylation can both affect specific residues, and therefore both have important functions in regulating signalling pathways, both processes provide interesting targets for cancer therapy.

O-Mannose (O-Man)


O-mannosylation involves the transfer of a mannose from a dolichol-P-mannose donor molecule onto the serine or threonine residue of a protein. Most other O-glycosylation processes use a sugar nucleotide as a donor molecule. A further difference from other O-glycosylations is that the process is initiated in the endoplasmic reticulum of the cell, rather than the Golgi apparatus. However, further addition of sugars occurs in the Golgi.

Until recently, it was believed that the process was restricted to fungi; however, it occurs in all domains of life: eukaryotes, (eu)bacteria and archae(bacteri)a. The best characterised O-mannosylated human protein is α-dystroglycan. O-Man sugars separate two domains of the protein, which are required to connect the extracellular and intracellular regions and anchor the cell in position. Ribitol, xylose and glucuronic acid can be added to this structure in a complex modification that forms a long sugar chain. This is required to stabilise the interaction between α-dystroglycan and the extracellular basement membrane. Without these modifications, the glycoprotein cannot anchor the cell, which leads to congenital muscular dystrophy (CMD), characterised by severe brain malformations.

O-Galactose (O-Gal)
O-galactose is commonly found on lysine residues in collagen, which often have a hydroxyl group added to form hydroxylysine. Because of this addition of an oxygen, hydroxylysine can then be modified by O-glycosylation. Addition of a galactose to the hydroxyl group is initiated in the endoplasmic reticulum, but occurs predominantly in the Golgi apparatus and only on hydroxylysine residues in a specific sequence.

While this O-galactosylation is necessary for correct function in all collagens, it is especially common in collagen types IV and V. In some cases, a glucose sugar can be added to the core galactose.

O-Fucose (O-Fuc)
Addition of fucose sugars to serine and threonine residues is an unusual form of O-glycosylation that occurs in the endoplasmic reticulum and is catalysed by two fucosyltransferases. These were discovered in Plasmodium falciparum and Toxoplasma gondii.

Several different enzymes catalyse the elongation of the core fucose, meaning that different sugars can be added to the initial fucose on the protein. Along with O-glucosylation, O-fucosylation is mainly found on epidermal growth factor (EGF) domains found in proteins. O-fucosylation on EGF domains occurs between the second and third conserved cysteine residues in the protein sequence. Once the core O-fucose has been added, it is often elongated by addition of GlcNAc, galactose and sialic acid.

Notch is an important protein in development, with several EGF domains that are O-fucosylated. Changes in the elaboration of the core fucose determine what interactions the protein can form, and therefore which genes will be transcribed during development. O-fucosylation might also play a role in protein breakdown in the liver.

O-Glucose (O-Glc)
Similarly to O-fucosylation, O-glucosylation is an unusual O-linked modification as it occurs in the endoplasmic reticulum, catalysed by O-glucosyltransferases, and also requires a defined sequence in order to be added to the protein. O-glucose is often attached to serine residues between the first and second conserved cysteine residues of EGF domains, for example in clotting factors VII and IX. O-glucosylation also appears to be necessary for the proper folding of EGF domains in the Notch protein.
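
The positional rules given for EGF domains (O-glucose on serine between the first and second conserved cysteines, O-fucose on serine or threonine between the second and third) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the example fragment are hypothetical, every cysteine in the input is assumed to be one of the conserved cysteines, and the additional defined sequence that the text says is required is not modelled.

```python
def egf_candidate_sites(domain):
    """Return 1-based candidate positions for O-Glc and O-Fuc in an EGF domain.

    Assumes every 'C' in `domain` is a conserved cysteine. Only the
    positional criterion from the text is applied, so the results are
    candidates at best, not predictions.
    """
    domain = domain.upper()
    cys = [i for i, aa in enumerate(domain) if aa == "C"]
    if len(cys) < 3:
        raise ValueError("need at least three cysteines to define both regions")
    c1, c2, c3 = cys[:3]
    # O-glucose: serine between the first and second cysteines
    o_glc = [i + 1 for i in range(c1 + 1, c2) if domain[i] == "S"]
    # O-fucose: serine or threonine between the second and third cysteines
    o_fuc = [i + 1 for i in range(c2 + 1, c3) if domain[i] in "ST"]
    return {"O-Glc": o_glc, "O-Fuc": o_fuc}


# Hypothetical truncated fragment, not a real EGF domain sequence
fragment = "DGDQCASSPCQNGGSCKD"
print(egf_candidate_sites(fragment))  # {'O-Glc': [7, 8], 'O-Fuc': [15]}
```

In practice a position-only filter will over-report sites, which is why the sequence context around each residue matters.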

Proteoglycans


Proteoglycans consist of a protein with one or more sugar side chains, known as glycosaminoglycans (GAGs), attached to the oxygen of serine and threonine residues. GAGs consist of long chains of repeating sugar units. Proteoglycans are usually found on the cell surface and in the extracellular matrix (ECM), and are important for the strength and flexibility of cartilage and tendons. Absence of proteoglycans is associated with heart and respiratory failure, defects in skeletal development and increased tumor metastasis.

Different types of proteoglycans exist, depending on the sugar that is linked to the oxygen atom of the residue in the protein. For example, the GAG heparan sulphate is attached to a protein serine residue through a xylose sugar. The structure is extended with two galactose sugars and a glucuronic acid (GlcA), followed by repeating units of GlcA and GlcNAc. This process is unusual and requires specific xylosyltransferases. Keratan sulphate attaches to a serine or threonine residue through GalNAc, and is extended with repeating N-acetyllactosamine sugar units. Type II keratan sulphate is especially common in cartilage.

Lipids


Galactose or glucose sugars can be attached to a hydroxyl group of ceramide lipids in a different form of O-glycosylation, as it does not occur on proteins. This forms glycosphingolipids, which are important for the localisation of receptors in membranes. Incorrect breakdown of these lipids leads to a group of diseases known as sphingolipidoses, which are often characterised by neurodegeneration and developmental disabilities.

Because both galactose and glucose sugars can be added to the ceramide lipid, there are two groups of glycosphingolipids. Galactosphingolipids are generally very simple in structure and the core galactose is not usually modified. Glucosphingolipids, however, are often modified and can become considerably more complex.

Galacto- and glucosphingolipids are synthesised differently. Glucose is added onto ceramide from its precursor in the endoplasmic reticulum, before further modifications occur in the Golgi apparatus. Galactose, on the other hand, is added to ceramide in the Golgi apparatus, where the galactosphingolipid formed is often sulfated.

Glycogenin
One of the few known examples of O-glycosylation on tyrosine, rather than on serine or threonine residues, is the addition of glucose to a tyrosine residue in glycogenin. Glycogenin is a glycosyltransferase, present in muscle and liver cells, that initiates the conversion of glucose to glycogen.

Clinical significance
All forms of O-glycosylation are abundant throughout the body and play important roles in many cellular functions.

Lewis epitopes are important in determining blood groups, and allow an immune response to be generated when the body detects foreign organs. Understanding them is therefore important in organ transplantation.

Hinge regions of immunoglobulins contain highly O-glycosylated regions between individual domains to maintain their structure, allow interactions with foreign antigens and protect the region from proteolytic cleavage.

Alzheimer's may be affected by O-glycosylation. Tau, the protein that accumulates to cause neurodegeneration in Alzheimer's, contains O-GlcNAc modifications which may be implicated in disease progression.

Changes in O-glycosylation are extremely common in cancer. O-glycan structures, and especially the terminal Lewis epitopes, are important in allowing tumor cells to invade new tissues during metastasis. Understanding these changes in O-glycosylation of cancer cells can lead to new diagnostic approaches and therapeutic opportunities.