Siglec

Siglecs (Sialic acid-binding immunoglobulin-type lectins) are cell surface proteins that bind sialic acid. They are found primarily on the surface of immune cells and are a subset of the I-type lectins. There are 14 different mammalian Siglecs, providing an array of different functions based on cell surface receptor-ligand interactions.

History
The first described candidate Siglec was Sialoadhesin (Siglec-1/CD169), a lectin-like adhesion protein on macrophages. Parallel studies by Ajit Varki and colleagues on the previously cloned CD22 (a B cell surface protein involved in adhesion and activation) showed direct evidence for sialic acid recognition. The subsequent cloning of Sialoadhesin by Crocker revealed homology to CD22 (Siglec-2), CD33 (Siglec-3) and myelin-associated glycoprotein (MAG/Siglec-4), leading to the proposal for a family of "Sialoadhesins". Varki then suggested the term Siglec as a better alternative and as a subset of I-type (Ig-type) lectins. This nomenclature was agreed upon and has been adopted by almost all investigators working on these molecules (by convention, Siglecs are always capitalised). Several additional Siglecs (Siglecs 5–12) that are highly similar in structure to CD33 have been identified in humans and are collectively referred to as "CD33-related Siglecs". Further Siglecs, including Siglec-14 and Siglec-15, have since been identified. Siglecs are divided into two groups: a first group that is highly conserved across mammals, composed of Sialoadhesin, CD22, MAG and Siglec-15, and a second group comprising Siglecs closely related to CD33. Some CD33-related Siglecs, such as Siglec-8 and Siglec-9, have homologues in mice and rats (Siglec-F and Siglec-E respectively in both). Humans have a higher number of Siglecs than mice, and so the numbering system was based on the human proteins.

Structure
Siglecs are Type I transmembrane proteins where the NH3+-terminus is in the extracellular space and the COO−-terminus is cytosolic. Each Siglec contains an N-terminal V-type immunoglobulin domain (Ig domain) which acts as the binding receptor for sialic acid. These lectins are placed into the group of I-type lectins because the lectin domain is an immunoglobulin fold. All Siglecs are extended from the cell surface by C2-type Ig domains which have no binding activity. Siglecs differ in the number of these C2-type domains. As these proteins contain Ig domains, they are members of the Immunoglobulin superfamily (IgSF).

Most Siglecs, such as CD22 and the CD33-related family, contain ITIMs (immunoreceptor tyrosine-based inhibitory motifs) in their cytosolic region. These act to down-regulate signalling pathways involving phosphorylation, such as those induced by ITAMs (immunoreceptor tyrosine-based activation motifs). Some, however, like Siglec-14, contain positively charged amino acid residues that help dock ITAM-containing adaptor proteins such as DAP12.
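As an illustration of how such motifs can be located in a protein sequence, the following minimal Python sketch scans a sequence for candidate ITIMs. It assumes the commonly cited ITIM consensus (I/V/L/S)-x-Y-x-x-(L/V), where x is any residue; that consensus is not stated in this article, the function name `find_itims` is hypothetical, and the example sequence is an invented toy string rather than a real Siglec cytoplasmic tail. A sequence match only suggests a candidate motif and does not show that the tyrosine is phosphorylated or functional.

```python
import re

# Assumed ITIM consensus: (I/V/L/S)-x-Y-x-x-(L/V), x = any residue.
# The lookahead wrapper allows overlapping candidates to be reported.
ITIM_PATTERN = re.compile(r"(?=([IVLS].Y..[LV]))")


def find_itims(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based tyrosine position, matched hexapeptide) pairs."""
    hits = []
    for match in ITIM_PATTERN.finditer(sequence.upper()):
        start = match.start(1)  # 0-based index of the first motif residue
        # The tyrosine is the third residue of the motif, so its
        # 1-based position is start + 3.
        hits.append((start + 3, match.group(1)))
    return hits


# Hypothetical toy sequence, NOT a real Siglec cytoplasmic tail.
toy_tail = "KRTAGLHYASLNGPEEEIHYSELIQ"
print(find_itims(toy_tail))
```

For the toy sequence, the sketch reports two candidate motifs, LHYASL and IHYSEL, together with the positions of their tyrosines.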

Ligand binding
Due to the acidic nature of sialic acid, Siglec active sites contain a conserved arginine residue which is positively charged at physiological pH. This amino acid forms salt bridges with the carboxyl group of the sugar residue. This is best seen in Sialoadhesin, where arginine at position 97 forms salt bridges with the COO− group of the sialic acid, producing a stable interaction. Each lectin domain is specific for the linkage that connects sialic acid to the glycan. Sialic acid contains numerous hydroxyl groups which can be involved in the formation of glycosidic linkages, which are observed at carbons number 2, 3, 6, and 8 of the sugar backbone. The binding specificity of each Siglec is due to different chemical interactions between the sugar ligand and the Siglec amino acids. The position in space of the individual groups on the sugar and the protein amino acids affects the sialic acid linkage to which each Siglec binds. For example, Sialoadhesin preferentially binds α2,3 linkages over α2,6 linkages.

Function
The primary function of Siglecs is to bind glycans containing sialic acids. These receptor-glycan interactions can be used in cell adhesion, cell signalling and other processes. The function of each Siglec is limited by its cellular distribution. For example, MAG is found only on oligodendrocytes and Schwann cells, whereas Sialoadhesin is localised to macrophages.

Most Siglecs are short and do not extend far from the cell surface. Because mammalian cells are covered in sialic acid-containing glycans, most Siglecs are "swamped" by glycans on their own cell and cannot reach other cells. The majority of Siglecs therefore bind only ligands on the surface of the same cell, so-called cis-ligands. One exception is Sialoadhesin, which contains 16 C2-Ig domains, producing a long, extended protein that can bind trans-ligands, i.e. ligands found on other cells. Others, such as MAG, have also been shown to bind trans-ligands.

Signalling
The members of the Siglec family are paired receptors with opposing intracellular signalling functions. Due to their ITIM-containing cytoplasmic regions, most Siglecs interfere with cellular signalling, inhibiting immune cell activation. Once bound to their ligands, Siglecs recruit inhibitory proteins such as SHP phosphatases via their ITIMs. The tyrosine contained within the ITIM is phosphorylated after ligand binding and acts as a docking site for SH2 domain-containing proteins like SHP phosphatases. This leads to dephosphorylation of cellular proteins, down-regulating activating signalling pathways.

Examples of negative signalling:
 * CD22 is found on B cells. B cells become active when the B-cell receptor (BCR) binds to its cognate ligand. Once the BCR is bound to its ligand, the receptor auto-phosphorylates its cytoplasmic region (cytoplasmic tail). This leads to phosphorylation of the three ITIMs in CD22's cytoplasmic tail and to the recruitment of SHP-1, which negatively regulates BCR-based cellular activation. This creates a threshold for B cell activation, so that transient activation of B cells is prevented. CD22 inhibition of BCR signalling was originally thought to be independent of sialic acid binding, but evidence suggests α2,6 sialic acid ligands are required for inhibition.
 * Siglec-7 is found at high levels on the surface of natural killer cells (NK cells) and leads to cellular inactivation once bound to its sialic acid-containing cognate ligand. It is used in cell-cell contacts, binding to sialylated glycans on target cells and thereby inhibiting NK cell-dependent killing of the target cell. Mammalian cells carry high levels of sialic acid, so when NK cells bind so-called "self-cells" they are not activated and do not kill host cells.

Siglec-14 contains an arginine residue in its transmembrane region. This binds to the ITAM-containing DAP10 and DAP12 proteins. When bound to its ligand, Siglec-14 leads to activation of cellular signalling pathways via the DAP10 and DAP12 proteins. These proteins up-regulate phosphorylation cascades involving numerous cellular proteins, leading to cellular activation. Siglec-14 appears to co-localise with Siglec-5; because Siglec-5 inhibits cellular signalling pathways, the two receptors may co-ordinate opposing functions within immune cells.

Phagocytosis and adhesion
Siglecs that can bind trans-ligands, such as Sialoadhesin, allow cell-cell interactions to take place. These glycan-Siglec interactions allow cells to bind one another, allowing signalling in some cases, or in the case of Sialoadhesin, pathogen uptake. Sialoadhesin's function was originally thought to be important in binding to red blood cells. Sialoadhesin lacks a cytosolic ITIM or a positive residue to bind ITAM-containing adaptors and so is thought not to influence signalling. Studies show that this protein is involved in phagocytosis of bacteria that contain highly sialylated glycan structures such as the lipopolysaccharide of Neisseria meningitidis. Binding to these structures allows the macrophage to phagocytose these bacteria, clearing the system of pathogens.

Siglec-7 is also used in binding to pathogens such as Campylobacter jejuni. This occurs in a sialic acid-dependent manner and brings NK cells and monocytes, on which Siglec-7 is expressed, into contact with these bacteria. The NK cell is then able to kill these foreign pathogens.

Knock-out studies
Knock-out studies are often used to uncover the function proteins have within a cell. Mice are often used because they express orthologues, or extremely similar homologues, of human proteins.

Some examples of knock-out Siglecs include:


 * CD22: Walker & Smith conducted experiments with CD22 knock-outs and deletion mutants to discern CD22's function. The mutant B cells did not cause any autoimmune disease, but the authors did see an increased production of autoantibodies due to the lack of the BCR signalling inhibition usually provided by CD22. Autoantibodies are specific for self proteins and can harm the host. CD22 is normally up-regulated by lipopolysaccharide binding to Toll-like receptors. The mutant B cells cannot up-regulate the mutant protein and so become hypersensitive in the presence of lipopolysaccharide. This means that the B cells produce antibodies when antibodies would not normally have been produced.


 * MAG (myelin-associated glycoprotein) is expressed on the cells that form myelin sheaths around neurons (Schwann cells and oligodendrocytes). MAG binds to sialylated ligands on the neuron. Knock-out of MAG in the peripheral nervous system leads to decreased myelination of neurons. Knock-out of MAG in the central nervous system of mice does not appear to affect myelination, but the interaction between the myelin and the neuron does deteriorate with age. This leads to neurological defects, as the action potential cannot pass as rapidly down the length of the axon during neural stimulation. Removing the ligand for MAG, by knocking out the GalNAc transferase gene required for ligand formation, has effects similar to those seen in the MAG knock-out mice.

Human/Primate Siglecs
This table briefly summarises the cellular distribution of each human/primate Siglec; the linkage specificity each has for sialic acid binding; the number of C2-Ig domains it contains; and whether it contains an ITIM or a positive residue to bind ITAM-containing adaptor proteins. References in the column headings correspond to all information displayed in that column, unless other references are shown. Siglec-12 information is referenced by only, excluding the linkage specificity.

Mimetics
Many pathologies, such as cancer, HIV-1 infection and group B streptococcal infection, have been linked to spontaneous interactions between sialic acid and the immunosuppressive sialic acid-binding immunoglobulin-like lectin (Siglec) receptors on immune cells. Sialic acids are components of glycans, the sugar chains built from various monosaccharides that cover the membrane of every living cell and display a staggering structural diversity. Sialic acids function in protein folding, neural development and cellular interactions, among many other physiological processes. As sialic acids are abundantly expressed in vertebrates and not in microorganisms, they are considered self-antigens or self-structures that play a major role in inhibiting harmful immune system activity by regulating neutrophils and B cell tolerance.

Within the immune system, Siglecs (especially those related to CD33), sialic acids and Siglec-binding pathogens are subject to runaway Red Queen co-evolution, driven by a selection pressure that maintains the innate immune system's capacity for self-recognition and prevents autoimmune disease. This evolutionary arms race, with its incessant mutations, has made the Siglecs one of the most rapidly evolving gene families, as evidenced by both intra- and inter-species differences. The polymorphism of human-unique Siglec-12, -14 and -16 suggests that the selection pressure is ongoing.

As Siglecs have distinct binding preferences for sialic acid and its modifications, several attempts have been made to chemically modify natural sialic acid ligands. These eventually led to the creation of sialic acid mimetics (SAMs) with enhanced binding capacity and selectivity towards Siglecs.

Synthesis
SAMs can be used to target Siglecs and modulate Siglec-expressing cells. They are made by modifying the sialic acid backbone at various positions, from C-2 to C-9; the carboxylic acid, however, must be left intact. The first attempts aimed to develop high-affinity sialic acid mimetics for Siglec-2, and led to the discovery that increased binding affinity came from hydrogen bonding and lipophilic interactions between SAMs and Siglec-2. Several separate modifications have been made at the C-2, C-5 and C-9 positions, leading Mesch et al. to hypothesize that simultaneous modification at all three positions could optimize binding.

The success in drastically enhancing the binding of SAMs to Siglec-2 suggests that a similar approach can work on other members of the family. Some SAMs carry an additional simultaneous modification at the C-4 position of the sialic acid backbone. The development of copper(I)-catalyzed azide-alkyne cycloaddition (CuAAC) click chemistry has expedited the identification of new SAMs and allowed for the creation of novel SAMs with high binding to Siglec-3, -5, -6, -7 and -10. As of 2017, SAMs for most Siglecs have been reported, except for Siglec-6, -8, -11, -14, -15 and -16.

Clustering of receptors and high-avidity binding, collectively known as multivalent binding, can enhance the effectiveness of SAMs in the human body. Advances in glycoengineering have made use of SAM-decorated nanoparticles, SAM-decorated polymers and on-cell synthesis of SAMs to present SAMs to Siglecs. Liposomes crosslinked with SAMs have also been shown to aid in presenting antigens to antigen-presenting cells via the Siglec-1 or -7 pathways. Moreover, human cells engineered with Ac5NeuNPoc-derived sialic acids incorporated into their sialoglycans, together with 3-bromo-benzyl azide, showed hyperactivity towards Siglec-2.