Ubiquitin

Ubiquitin is a small (8.6 kDa) regulatory protein found in most tissues of eukaryotic organisms, i.e., it is found ubiquitously. It was discovered in 1975 by Gideon Goldstein and further characterized throughout the late 1970s and 1980s. Four genes in the human genome code for ubiquitin: UBB, UBC, UBA52 and RPS27A.

The addition of ubiquitin to a substrate protein is called ubiquitylation (or ubiquitination or ubiquitinylation). Ubiquitylation affects proteins in many ways: it can mark them for degradation via the proteasome, alter their cellular location, affect their activity, and promote or prevent protein interactions. Ubiquitylation involves three main steps: activation, conjugation, and ligation, performed by ubiquitin-activating enzymes (E1s), ubiquitin-conjugating enzymes (E2s), and ubiquitin ligases (E3s), respectively. The result of this sequential cascade is to bind ubiquitin to lysine residues on the protein substrate via an isopeptide bond, cysteine residues through a thioester bond, serine and threonine residues through an ester bond, or the amino group of the protein's N-terminus via a peptide bond.

The protein modifications can be either a single ubiquitin protein (monoubiquitylation) or a chain of ubiquitin (polyubiquitylation). Secondary ubiquitin molecules are always linked to one of the seven lysine residues or the N-terminal methionine of the previous ubiquitin molecule. These 'linking' residues are represented by a "K" or "M" (the one-letter amino acid notation of lysine and methionine, respectively) and a number, referring to its position in the ubiquitin molecule as in K48, K29 or M1. The first ubiquitin molecule is covalently bound through its C-terminal carboxylate group to a particular lysine, cysteine, serine, threonine or N-terminus of the target protein. Polyubiquitylation occurs when the C-terminus of another ubiquitin is linked to one of the seven lysine residues or the first methionine on the previously added ubiquitin molecule, creating a chain. This process repeats several times, leading to the addition of several ubiquitins. Only polyubiquitylation on defined lysines, mostly on K48 and K29, is related to degradation by the proteasome (referred to as the "molecular kiss of death"), while other polyubiquitylations (e.g. on K63, K11, K6 and M1) and monoubiquitylations may regulate processes such as endocytic trafficking, inflammation, translation and DNA repair.
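The linkage notation described above can be made concrete with a short parser. This is only an illustrative sketch: the function name `parse_linkage` and the constant names are hypothetical, and the set of accepted labels is simply the seven lysines and the N-terminal methionine named in this article.

```python
import re

# The eight linkage sites named in the text: seven lysines plus the N-terminal methionine.
LINKAGE_SITES = {"M1", "K6", "K11", "K27", "K29", "K33", "K48", "K63"}
RESIDUE_NAMES = {"K": "lysine", "M": "methionine"}


def parse_linkage(label: str) -> tuple[str, int]:
    """Split a label such as 'K48' into (residue name, position in ubiquitin)."""
    cleaned = label.strip().upper()
    match = re.fullmatch(r"([KM])(\d+)", cleaned)
    if match is None or cleaned not in LINKAGE_SITES:
        raise ValueError(f"not a recognised ubiquitin linkage: {label!r}")
    return RESIDUE_NAMES[match.group(1)], int(match.group(2))


print(parse_linkage("K48"))  # ('lysine', 48)
print(parse_linkage("M1"))   # ('methionine', 1)
```

A label such as "K7" is rejected, because ubiquitin has no lysine at position 7.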

The discovery that ubiquitin chains target proteins to the proteasome, which degrades and recycles proteins, was honored with the Nobel Prize in Chemistry in 2004.

Identification
Ubiquitin (originally, ubiquitous immunopoietic polypeptide) was first identified in 1975 as an 8.6 kDa protein expressed in all eukaryotic cells. The basic functions of ubiquitin and the components of the ubiquitylation pathway were elucidated in the early 1980s at the Technion by Aaron Ciechanover, Avram Hershko, and Irwin Rose, for which the Nobel Prize in Chemistry was awarded in 2004.

The ubiquitylation system was initially characterised as an ATP-dependent proteolytic system present in cellular extracts. A heat-stable polypeptide present in these extracts, ATP-dependent proteolysis factor 1 (APF-1), was found to become covalently attached to the model protein substrate lysozyme in an ATP- and Mg2+-dependent process. Multiple APF-1 molecules were linked to a single substrate molecule by an isopeptide linkage, and conjugates were found to be rapidly degraded with the release of free APF-1. Soon after APF-1-protein conjugation was characterised, APF-1 was identified as ubiquitin. The carboxyl group of the C-terminal glycine residue of ubiquitin (Gly76) was identified as the moiety conjugated to substrate lysine residues.

The protein
Ubiquitin is a small protein that exists in all eukaryotic cells. It performs its myriad functions through conjugation to a large range of target proteins. A variety of different modifications can occur. The ubiquitin protein itself consists of 76 amino acids and has a molecular mass of about 8.6 kDa. Key features include its C-terminal tail and the 7 lysine residues. It is highly conserved throughout eukaryote evolution; human and yeast ubiquitin share 96% sequence identity.
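The figures quoted above (76 residues, seven lysines, about 8.6 kDa, 96% identity between human and yeast) can be checked with a few lines of arithmetic. The sketch below rests on assumptions that are not part of this article: the human ubiquitin sequence as commonly given in sequence databases, standard approximate average residue masses, and the three substitutions usually reported for yeast ubiquitin (positions 19, 24 and 28). All function and constant names are hypothetical.

```python
# Human ubiquitin in blocks of ten residues (assumed reference sequence).
UBIQUITIN = (
    "MQIFVKTLTG" "KTITLEVEPS" "DTIENVKAKI" "QDKEGIPPDQ"
    "QRLIFAGKQL" "EDGRTLSDYN" "IQKESTLHLV" "LRLRGG"
)

# Approximate average residue masses in daltons (assumed standard values).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # one water accounts for the free N- and C-termini

# Substitutions assumed to turn the human sequence into the yeast one.
YEAST_SUBSTITUTIONS = {19: "S", 24: "D", 28: "S"}


def lysine_positions(seq):
    """1-based positions of every lysine in the sequence."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "K"]


def average_mass(seq):
    """Average molecular mass in daltons of the unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


yeast = "".join(
    YEAST_SUBSTITUTIONS.get(i, aa) for i, aa in enumerate(UBIQUITIN, start=1)
)

print(len(UBIQUITIN))                               # 76
print(lysine_positions(UBIQUITIN))                  # [6, 11, 27, 29, 33, 48, 63]
print(UBIQUITIN[-2:])                               # GG
print(round(average_mass(UBIQUITIN) / 1000, 1))     # 8.6
print(round(percent_identity(UBIQUITIN, yeast)))    # 96
```

Three differences in 76 positions give 73/76, or roughly 96%, which matches the figure in the text.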

Genes
Ubiquitin is encoded in mammals by 4 different genes. UBA52 and RPS27A genes code for a single copy of ubiquitin fused to the ribosomal proteins L40 and S27a, respectively. The UBB and UBC genes code for polyubiquitin precursor proteins.

Ubiquitylation


Ubiquitylation (also known as ubiquitination or ubiquitinylation) is an enzymatic post-translational modification in which a ubiquitin protein is attached to a substrate protein. This process most commonly binds the last amino acid of ubiquitin (glycine 76) to a lysine residue on the substrate. An isopeptide bond is formed between the carboxyl group (COO−) of the ubiquitin's glycine and the epsilon-amino group (ε-NH3+) of the substrate's lysine. Trypsin cleavage of a ubiquitin-conjugated substrate leaves a di-glycine "remnant" that is used to identify the site of ubiquitylation. Ubiquitin can also be bound to other electron-rich nucleophilic sites in a protein, which is termed "non-canonical ubiquitylation". This was first observed with the amine group of a protein's N-terminus being used for ubiquitylation, rather than a lysine residue, in the protein MyoD and has been observed since in 22 other proteins in multiple species, including ubiquitin itself. There is also increasing evidence for nonlysine residues as ubiquitylation targets using non-amine groups, such as the sulfhydryl group on cysteine and the hydroxyl group on threonine and serine. The end result of this process is the addition of one ubiquitin molecule (monoubiquitylation) or a chain of ubiquitin molecules (polyubiquitylation) to the substrate protein.
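The origin of the di-glycine remnant can be illustrated with a simplified in-silico digest. The sketch assumes the usual rule of thumb that trypsin cleaves after lysine or arginine unless the next residue is proline, and it uses the last 13 residues of human ubiquitin (an assumed reference sequence). The names `trypsin_digest` and `UB_C_TERMINAL_TAIL` are hypothetical, and the mass value is the standard monoisotopic mass of a glycine residue.

```python
def trypsin_digest(seq: str) -> list[str]:
    """Cleave after K or R, except before P (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq[:-1]):       # never cut after the final residue
        if aa in "KR" and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


# Residues 64-76 of human ubiquitin (assumed reference sequence).
UB_C_TERMINAL_TAIL = "ESTLHLVLRLRGG"

print(trypsin_digest(UB_C_TERMINAL_TAIL))   # ['ESTLHLVLR', 'LR', 'GG']

# Monoisotopic mass of one glycine residue in daltons (assumed standard value).
GLYCINE_RESIDUE_MASS = 57.02146
DIGLY_REMNANT_MASS = 2 * GLYCINE_RESIDUE_MASS
print(round(DIGLY_REMNANT_MASS, 4))         # 114.0429
```

Because trypsin cuts after Arg74, only Gly75-Gly76 stays attached to the substrate lysine through the isopeptide bond. In mass spectrometry this appears as a mass shift of about 114.04 Da on the modified lysine.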

Ubiquitination requires three types of enzyme: ubiquitin-activating enzymes, ubiquitin-conjugating enzymes, and ubiquitin ligases, known as E1s, E2s, and E3s, respectively. The process consists of three main steps:
 * 1) Activation: Ubiquitin is activated in a two-step reaction by an E1 ubiquitin-activating enzyme, which is dependent on ATP. The initial step involves production of a ubiquitin-adenylate intermediate. The E1 binds both ATP and ubiquitin and catalyses the acyl-adenylation of the C-terminus of the ubiquitin molecule. The second step transfers ubiquitin to an active site cysteine residue, with release of AMP. This step results in a thioester linkage between the C-terminal carboxyl group of ubiquitin and the E1 cysteine sulfhydryl group. The human genome contains two genes that produce enzymes capable of activating ubiquitin: UBA1 and UBA6.
 * 2) Conjugation: E2 ubiquitin-conjugating enzymes catalyse the transfer of ubiquitin from E1 to the active site cysteine of the E2 via a trans(thio)esterification reaction. In order to perform this reaction, the E2 binds to both activated ubiquitin and the E1 enzyme. Humans possess 35 different E2 enzymes, whereas other eukaryotic organisms have between 16 and 35. They are characterised by their highly conserved structure, known as the ubiquitin-conjugating catalytic (UBC) fold.
 * 3) Ligation: E3 ubiquitin ligases catalyse the final step of the ubiquitylation cascade. Most commonly, they create an isopeptide bond between a lysine of the target protein and the C-terminal glycine of ubiquitin. In general, this step requires the activity of one of the hundreds of E3s. E3 enzymes function as the substrate recognition modules of the system and are capable of interaction with both E2 and substrate. Some E3 enzymes also activate the E2 enzymes. E3 enzymes possess one of two domains: the homologous to the E6-AP carboxyl terminus (HECT) domain and the really interesting new gene (RING) domain (or the closely related U-box domain). HECT domain E3s transiently bind ubiquitin in this process (an obligate thioester intermediate is formed with the active-site cysteine of the E3), whereas RING domain E3s catalyse the direct transfer from the E2 enzyme to the substrate. The anaphase-promoting complex (APC) and the SCF complex (for Skp1-Cullin-F-box protein complex) are two examples of multi-subunit E3s involved in recognition and ubiquitylation of specific target proteins for degradation by the proteasome.
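The three steps above can be read as an ordered series of carrier states for a single ubiquitin molecule. The following sketch is a toy model, not a kinetic or structural one; the function name `ubiquitylate` and the state labels are hypothetical, and the only branch it encodes is the HECT versus RING difference described in the ligation step.

```python
def ubiquitylate(e3_type: str) -> list[str]:
    """Return the ordered carrier states of one ubiquitin through the cascade."""
    if e3_type not in {"RING", "HECT"}:
        raise ValueError("e3_type must be 'RING' or 'HECT'")
    states = [
        "Ub-AMP: adenylate formed by E1 (ATP-dependent)",
        "E1~Ub: thioester on the E1 cysteine, AMP released",
        "E2~Ub: thioester on the E2 cysteine",
    ]
    if e3_type == "HECT":
        # HECT E3s form an obligate thioester intermediate with ubiquitin.
        states.append("E3~Ub: thioester on the HECT E3 cysteine")
    # RING E3s instead catalyse direct transfer from the E2 to the substrate.
    states.append("substrate-Ub: isopeptide bond to a substrate lysine")
    return states


for state in ubiquitylate("HECT"):
    print(state)
```

Calling `ubiquitylate("RING")` returns four states and `ubiquitylate("HECT")` returns five, the extra one being the E3-bound thioester.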

In the ubiquitylation cascade, E1 can bind with many E2s, which can bind with hundreds of E3s in a hierarchical way. Having levels within the cascade allows tight regulation of the ubiquitylation machinery. Other ubiquitin-like proteins (UBLs) are also modified via the E1–E2–E3 cascade, although variations in these systems do exist.

E4 enzymes, or ubiquitin-chain elongation factors, are capable of adding pre-formed polyubiquitin chains to substrate proteins. For example, multiple monoubiquitylation of the tumor suppressor p53 by Mdm2 can be followed by addition of a polyubiquitin chain using p300 and CBP.

Types
Ubiquitylation affects cellular processes by regulating the degradation of proteins (via the proteasome and lysosome), coordinating the cellular localization of proteins, activating and inactivating proteins, and modulating protein–protein interactions. These effects are mediated by different types of substrate ubiquitylation, for example the addition of a single ubiquitin molecule (monoubiquitylation) or different types of ubiquitin chains (polyubiquitylation).

Monoubiquitylation
Monoubiquitylation is the addition of one ubiquitin molecule to one substrate protein residue. Multi-monoubiquitylation is the addition of one ubiquitin molecule to multiple substrate residues. The monoubiquitylation of a protein can have different effects to the polyubiquitylation of the same protein. The addition of a single ubiquitin molecule is thought to be required prior to the formation of polyubiquitin chains. Monoubiquitylation affects cellular processes such as membrane trafficking, endocytosis and viral budding.

Polyubiquitin chains
Polyubiquitylation is the formation of a ubiquitin chain on a single lysine residue on the substrate protein. Following addition of a single ubiquitin moiety to a protein substrate, further ubiquitin molecules can be added to the first, yielding a polyubiquitin chain. These chains are made by linking the glycine residue of a ubiquitin molecule to a lysine of ubiquitin bound to a substrate. Ubiquitin has seven lysine residues and an N-terminus that serve as points of ubiquitination; they are K6, K11, K27, K29, K33, K48, K63 and M1, respectively. Lysine 48-linked chains were the first identified and are the best-characterised type of ubiquitin chain. K63 chains have also been well-characterised, whereas the function of other lysine chains, mixed chains, branched chains, M1-linked linear chains, and heterologous chains (mixtures of ubiquitin and other ubiquitin-like proteins) remains less clear.
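One way to keep the chain types apart is to model a chain as a small tree in which every ubiquitin records which site on its parent it is joined to. The sketch below is a hypothetical representation chosen for illustration. It assumes that "branched" means one ubiquitin carrying two or more further ubiquitins, and "mixed" means an unbranched chain with more than one linkage type.

```python
from collections import Counter

LINKAGE_SITES = {"M1", "K6", "K11", "K27", "K29", "K33", "K48", "K63"}


def classify_chain(links):
    """Classify a chain given (child, parent, site) tuples.

    Ubiquitin 0 is the one attached to the substrate. Each tuple states that
    the C-terminus of `child` is joined to `site` on `parent`.
    """
    if not links:
        return "monoubiquitin"
    used = Counter((parent, site) for _, parent, site in links)
    for (parent, site), count in used.items():
        if site not in LINKAGE_SITES:
            raise ValueError(f"unknown linkage site: {site}")
        if count > 1:
            raise ValueError(f"{site} on ubiquitin {parent} is used twice")
    children = Counter(parent for _, parent, _ in links)
    if max(children.values()) > 1:
        return "branched"
    sites = {site for _, _, site in links}
    if len(sites) > 1:
        return "mixed"
    return "homotypic " + next(iter(sites))


print(classify_chain([(1, 0, "K48"), (2, 1, "K48"), (3, 2, "K48")]))  # homotypic K48
print(classify_chain([(1, 0, "K63"), (2, 1, "M1")]))                  # mixed
print(classify_chain([(1, 0, "K11"), (2, 0, "K48")]))                 # branched
```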

Lysine 48-linked polyubiquitin chains target proteins for destruction, by a process known as proteolysis. Multi-ubiquitin chains at least four ubiquitin molecules long must be attached to a lysine residue on the condemned protein in order for it to be recognised by the 26S proteasome. The proteasome is a barrel-shaped structure comprising a central proteolytic core made of four ring structures, flanked by two cylinders that selectively allow entry of ubiquitylated proteins. Once inside, the proteins are rapidly degraded into small peptides (usually 3–25 amino acid residues in length). Ubiquitin molecules are cleaved off the protein immediately prior to destruction and are recycled for further use. Although the majority of protein substrates are ubiquitylated, there are examples of non-ubiquitylated proteins targeted to the proteasome. The polyubiquitin chains are recognised by a subunit of the proteasome: S5a/Rpn10. This is achieved by a ubiquitin-interacting motif (UIM) found in a hydrophobic patch in the C-terminal region of the S5a/Rpn10 unit.
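The "at least four ubiquitins" rule is easy to get wrong by one, because n linkages join n + 1 ubiquitin molecules. The sketch below makes that explicit for an unbranched chain. It is a deliberately crude reading of the rule stated above, with hypothetical function names, and it ignores everything else that influences proteasomal recognition.

```python
def longest_k48_run(linkages):
    """Number of ubiquitins in the longest K48-linked stretch of an unbranched chain.

    linkages[i] names the site joining ubiquitin i + 1 to ubiquitin i,
    so a run of n consecutive K48 linkages spans n + 1 ubiquitins.
    """
    best = run = 0
    for site in linkages:
        run = run + 1 if site == "K48" else 0
        best = max(best, run)
    return best + 1 if best else 0


def meets_k48_length_rule(linkages, minimum=4):
    """True if some K48-linked stretch contains at least `minimum` ubiquitins."""
    return longest_k48_run(linkages) >= minimum


print(meets_k48_length_rule(["K48", "K48", "K48"]))   # True: four ubiquitins
print(meets_k48_length_rule(["K48", "K48"]))          # False: only three
print(meets_k48_length_rule(["K63"] * 5))             # False: no K48 linkages
```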

Lysine 63-linked chains are not associated with proteasomal degradation of the substrate protein. Instead, they allow the coordination of other processes such as endocytic trafficking, inflammation, translation, and DNA repair. In cells, lysine 63-linked chains are bound by the ESCRT-0 complex, which prevents their binding to the proteasome. This complex contains two proteins, Hrs and STAM1, each of which contains a UIM that allows the complex to bind to lysine 63-linked chains.

Methionine 1-linked (or linear) polyubiquitin chains are another type of non-degradative ubiquitin chain. In this case, ubiquitin is linked in a head-to-tail manner, meaning that the C-terminus of the last ubiquitin molecule binds directly to the N-terminus of the next one. Although initially believed to target proteins for proteasomal degradation, linear ubiquitin later proved to be indispensable for NF-κB signaling. Currently, there is only one known E3 ubiquitin ligase that generates M1-linked polyubiquitin chains: the linear ubiquitin chain assembly complex (LUBAC).

Less is understood about atypical (non-lysine 48-linked) ubiquitin chains but research is starting to suggest roles for these chains. There is evidence that atypical chains linked by lysine 6, 11, 27, 29 and methionine 1 can induce proteasomal degradation.

Branched ubiquitin chains containing multiple linkage types can be formed. The function of these chains is unknown.

Structure
Differently linked chains have specific effects on the protein to which they are attached, caused by differences in the conformations of the protein chains. K29-, K33-, K63- and M1-linked chains have a fairly linear conformation; they are known as open-conformation chains. K6-, K11-, and K48-linked chains form closed conformations. The ubiquitin molecules in open-conformation chains do not interact with each other, except for the covalent isopeptide bonds linking them together. In contrast, the closed conformation chains have interfaces with interacting residues. Altering the chain conformations exposes and conceals different parts of the ubiquitin protein, and the different linkages are recognized by proteins that are specific for the unique topologies that are intrinsic to the linkage. Proteins can specifically bind to ubiquitin via ubiquitin-binding domains (UBDs). The distances between individual ubiquitin units in chains differ between lysine 63- and 48-linked chains. The UBDs exploit this by having small spacers between ubiquitin-interacting motifs that bind lysine 48-linked chains (compact ubiquitin chains) and larger spacers for lysine 63-linked chains. The machinery involved in recognising polyubiquitin chains can also differentiate between K63-linked chains and M1-linked chains, demonstrated by the fact that the latter can induce proteasomal degradation of the substrate.

Function
The ubiquitylation system functions in a wide variety of cellular processes, including:
 * Antigen processing
 * Apoptosis
 * Biogenesis of organelles
 * Cell cycle and division
 * DNA transcription and repair
 * Differentiation and development
 * Immune response and inflammation
 * Neural and muscular degeneration
 * Maintenance of pluripotency
 * Morphogenesis of neural networks
 * Modulation of cell surface receptors, ion channels and the secretory pathway
 * Response to stress and extracellular modulators
 * Ribosome biogenesis
 * Viral infection

Membrane proteins
Multi-monoubiquitylation can mark transmembrane proteins (for example, receptors) for removal from membranes (internalisation) and fulfil several signalling roles within the cell. When cell-surface transmembrane molecules are tagged with ubiquitin, the subcellular localization of the protein is altered, often targeting the protein for destruction in lysosomes. This serves as a negative feedback mechanism, because often the stimulation of receptors by ligands increases their rate of ubiquitylation and internalisation. Like monoubiquitylation, lysine 63-linked polyubiquitin chains also have a role in the trafficking of some membrane proteins.

Genomic maintenance
Proliferating cell nuclear antigen (PCNA) is a protein involved in DNA synthesis. Under normal physiological conditions PCNA is sumoylated (a post-translational modification similar to ubiquitylation). When DNA is damaged by ultra-violet radiation or chemicals, the SUMO molecule that is attached to a lysine residue is replaced by ubiquitin. Monoubiquitylated PCNA recruits polymerases that can carry out DNA synthesis with damaged DNA, but this is very error-prone, possibly resulting in the synthesis of mutated DNA. Lysine 63-linked polyubiquitylation of PCNA allows it to perform a less error-prone mutation bypass known as the template switching pathway.

Ubiquitylation of histone H2AX is involved in DNA damage recognition of DNA double-strand breaks. Lysine 63-linked polyubiquitin chains are formed on H2AX histone by the E2/E3 ligase pair, Ubc13-Mms2/RNF168. This K63 chain appears to recruit RAP80, which contains a UIM, and RAP80 then helps localize BRCA1. This pathway will eventually recruit the necessary proteins for homologous recombination repair.

Transcriptional regulation
Histones can be ubiquitinated, usually in the form of monoubiquitylation, although polyubiquitylated forms do occur. Histone ubiquitylation alters chromatin structure and allows the access of enzymes involved in transcription. Ubiquitin on histones also acts as a binding site for proteins that either activate or inhibit transcription and also can induce further post-translational modifications of the protein. These effects can all modulate the transcription of genes.

Deubiquitination
Deubiquitinating enzymes (deubiquitinases; DUBs) oppose the role of ubiquitylation by removing ubiquitin from substrate proteins. They are cysteine proteases that cleave the amide bond between the two proteins. They are highly specific, as are the E3 ligases that attach the ubiquitin, with only a few substrates per enzyme. They can cleave both isopeptide (between ubiquitin and lysine) and peptide bonds (between ubiquitin and the N-terminus). In addition to removing ubiquitin from substrate proteins, DUBs have many other roles within the cell. Ubiquitin is either expressed as multiple copies joined in a chain (polyubiquitin) or attached to ribosomal subunits. DUBs cleave these proteins to produce active ubiquitin. They also recycle ubiquitin that has been bound to small nucleophilic molecules during the ubiquitylation process. Monoubiquitin is formed by DUBs that cleave ubiquitin from free polyubiquitin chains that have been previously removed from proteins.

Ubiquitin-binding domains
Ubiquitin-binding domains (UBDs) are modular protein domains that bind non-covalently to ubiquitin; these motifs control various cellular events. Detailed molecular structures are known for a number of UBDs. Their binding specificity determines their mechanism of action and regulation, and how they regulate cellular proteins and processes.

Pathogenesis
The ubiquitin pathway has been implicated in the pathogenesis of a wide range of diseases and disorders, including:


 * Neurodegeneration
 * Infection and immunity
 * Genetic disorders
 * Cancer

Neurodegeneration
Ubiquitin is implicated in neurodegenerative diseases associated with proteostasis dysfunction, including Alzheimer's disease, motor neuron disease, Huntington's disease and Parkinson's disease. Transcript variants encoding different isoforms of ubiquilin-1 are found in lesions associated with Alzheimer's and Parkinson's disease. Higher levels of ubiquilin in the brain have been shown to decrease malformation of amyloid precursor protein (APP), which plays a key role in triggering Alzheimer's disease. Conversely, lower levels of ubiquilin-1 in the brain have been associated with increased malformation of APP. A frameshift mutation in ubiquitin B can result in a truncated peptide missing the C-terminal glycine. This abnormal peptide, known as UBB+1, has been shown to accumulate selectively in Alzheimer's disease and other tauopathies.

Infection and immunity
Ubiquitin and ubiquitin-like molecules extensively regulate immune signal transduction pathways at virtually all stages, including steady-state repression, activation during infection, and attenuation upon clearance. Without this regulation, immune activation against pathogens may be defective, resulting in chronic disease or death. Alternatively, the immune system may become hyperactivated and organs and tissues may be subjected to autoimmune damage.

On the other hand, viruses must block or redirect host cell processes including immunity to effectively replicate, yet many viruses relevant to disease have informationally limited genomes. Because of its very large number of roles in the cell, manipulating the ubiquitin system represents an efficient way for such viruses to block, subvert or redirect critical host cell processes to support their own replication.

The retinoic acid-inducible gene I (RIG-I) protein is a primary immune system sensor for viral and other invasive RNA in human cells. The RIG-I-like receptor (RLR) immune signaling pathway is one of the most extensively studied in terms of the role of ubiquitin in immune regulation.

Genetic disorders

 * Angelman syndrome is caused by a disruption of UBE3A, which encodes a ubiquitin ligase (E3) enzyme termed E6-AP.
 * Von Hippel–Lindau syndrome involves disruption of a ubiquitin E3 ligase termed the VHL tumor suppressor, or VHL gene.
 * Fanconi anemia: Eight of the thirteen identified genes whose disruption can cause this disease encode proteins that form a large ubiquitin ligase (E3) complex.
 * 3-M syndrome is an autosomal-recessive growth retardation disorder associated with mutations of the Cullin7 E3 ubiquitin ligase.

Diagnostic use
Immunohistochemistry using antibodies to ubiquitin can identify abnormal accumulations of this protein inside cells, indicating a disease process. These protein accumulations are referred to as inclusion bodies (which is a general term for any microscopically visible collection of abnormal material in a cell). Examples include:
 * Neurofibrillary tangles in Alzheimer's disease
 * Lewy body in Parkinson's disease
 * Pick bodies in Pick's disease
 * Inclusions in motor neuron disease and Huntington's disease
 * Mallory bodies in alcoholic liver disease
 * Rosenthal fibers in astrocytes

Link to cancer
Post-translational modification of proteins is a generally used mechanism in eukaryotic cell signaling. Ubiquitylation, the conjugation of ubiquitin to proteins, is a crucial process for cell cycle progression and cell proliferation and development. Although ubiquitylation usually serves as a signal for protein degradation through the 26S proteasome, it can also serve other fundamental cellular processes, such as endocytosis, enzymatic activation and DNA repair. Moreover, since ubiquitylation functions to tightly regulate the cellular level of cyclins, its misregulation is expected to have severe impacts. The first evidence of the importance of the ubiquitin/proteasome pathway in oncogenic processes came from the high antitumor activity of proteasome inhibitors. Various studies have shown that defects or alterations in ubiquitylation processes are commonly associated with or present in human carcinoma. Malignancies can develop through loss-of-function mutation directly at the tumor suppressor gene, increased activity of ubiquitylation, and/or indirect attenuation of ubiquitylation due to mutation in related proteins.

Renal cell carcinoma
The VHL (Von Hippel–Lindau) gene encodes a component of an E3 ubiquitin ligase. VHL complex targets a member of the hypoxia-inducible transcription factor family (HIF) for degradation by interacting with the oxygen-dependent destruction domain under normoxic conditions. HIF activates downstream targets such as the vascular endothelial growth factor (VEGF), promoting angiogenesis. Mutations in VHL prevent degradation of HIF and thus lead to the formation of hypervascular lesions and renal tumors.

Breast cancer
The BRCA1 gene is another tumor suppressor gene in humans; it encodes the BRCA1 protein, which is involved in the response to DNA damage. The protein contains a RING motif with E3 ubiquitin ligase activity. BRCA1 can form a dimer with other molecules, such as BARD1 and BAP1, for its ubiquitylation activity. Mutations that affect the ligase function are often found and are associated with various cancers.

Cyclin E
As processes in cell cycle progression are the most fundamental processes for cellular growth and differentiation, and are the most commonly altered in human carcinomas, cell cycle-regulatory proteins are expected to be under tight regulation. The level of cyclins, as the name suggests, is high only at a certain time point during the cell cycle. This is achieved by continuous control of cyclin or CDK levels through ubiquitylation and degradation. When cyclin E is partnered with CDK2 and gets phosphorylated, the SCF-associated F-box protein Fbw7 recognizes the complex and thus targets it for degradation. Mutations in Fbw7 have been found in more than 30% of human tumors, characterizing it as a tumor suppressor protein.

Cervical cancer
Oncogenic types of the human papillomavirus (HPV) are known to hijack the cellular ubiquitin-proteasome pathway for viral infection and replication. The E6 proteins of HPV bind to the N-terminus of the cellular E6-AP E3 ubiquitin ligase, redirecting the complex to bind p53, a well-known tumor suppressor whose inactivation is found in many types of cancer. Thus, p53 undergoes ubiquitylation and proteasome-mediated degradation. Meanwhile, E7, the product of another early-expressed HPV gene, binds to Rb, also a tumor suppressor, mediating its degradation. The loss of p53 and Rb in cells allows limitless cell proliferation to occur.

p53 regulation
Gene amplification often occurs in tumors, including amplification of MDM2, a gene that encodes a RING E3 ubiquitin ligase responsible for downregulation of p53 activity. MDM2 targets p53 for ubiquitylation and proteasomal degradation, thus keeping its level appropriate for normal cell conditions. Overexpression of MDM2 causes loss of p53 activity and therefore allows cells to have limitless replicative potential.

p27
Another gene that is a target of gene amplification is SKP2. SKP2 is an F-box protein with a role in substrate recognition for ubiquitylation and degradation. SKP2 targets p27Kip-1, an inhibitor of cyclin-dependent kinases (CDKs). CDK2 and CDK4 partner with cyclins E and D, respectively, forming a family of cell cycle regulators which control cell cycle progression through the G1 phase. A low level of p27Kip-1 protein is often found in various cancers and is due to overactivation of ubiquitin-mediated proteolysis through overexpression of SKP2.

Efp
Efp, or estrogen-inducible RING-finger protein, is an E3 ubiquitin ligase whose overexpression has been shown to be the major cause of estrogen-independent breast cancer. Efp's substrate is the 14-3-3 protein, which negatively regulates the cell cycle.

Colorectal cancer
The gene associated with colorectal cancer is adenomatous polyposis coli (APC), a classic tumor suppressor gene. The APC gene product targets beta-catenin for degradation via ubiquitylation at the N-terminus, thus regulating its cellular level. Most colorectal cancer cases are found with mutations in the APC gene. However, in cases where the APC gene is not mutated, mutations are found in the N-terminus of beta-catenin that prevent its ubiquitylation and thus increase its activity.

Glioblastoma
Glioblastoma is the most aggressive cancer originating in the brain. Mutations found in patients with glioblastoma are related to the deletion of a part of the extracellular domain of the epidermal growth factor receptor (EGFR). This deletion leaves the CBL E3 ligase unable to bind to the receptor for its recycling and degradation via a ubiquitin-lysosomal pathway. Thus, EGFR is constitutively active in the cell membrane and activates its downstream effectors that are involved in cell proliferation and migration.

Phosphorylation-dependent ubiquitylation
The interplay between ubiquitylation and phosphorylation has been an ongoing research interest since phosphorylation often serves as a marker where ubiquitylation leads to degradation. Moreover, ubiquitylation can also act to turn on/off the kinase activity of a protein. The critical role of phosphorylation is largely underscored in the activation and removal of autoinhibition in the Cbl protein. Cbl is an E3 ubiquitin ligase with a RING finger domain that interacts with its tyrosine kinase binding (TKB) domain, preventing interaction of the RING domain with an E2 ubiquitin-conjugating enzyme. This intramolecular interaction is an autoinhibition regulation that prevents its role as a negative regulator of various growth factors and tyrosine kinase signaling and T-cell activation. Phosphorylation of Y363 relieves the autoinhibition and enhances binding to E2. Mutations that render the Cbl protein dysfunctional due to the loss of its ligase/tumor suppressor function and maintenance of its positive signaling/oncogenic function have been shown to cause the development of cancer.

Screening for ubiquitin ligase substrates
Deregulation of E3-substrate interactions is a key cause of many human disorders, so identifying E3 ligase substrates is crucial. In 2008, 'Global Protein Stability (GPS) Profiling' was developed to discover E3 ubiquitin ligase substrates. This high-throughput system made use of reporter proteins fused independently to thousands of potential substrates. When ligase activity is inhibited (by expressing a dominant-negative Cul1, so that ubiquitination does not occur), increased reporter activity indicates that the identified substrates are accumulating. This approach added a large number of new substrates to the list of E3 ligase substrates.
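The logic of such a screen (a reporter that becomes more abundant when the ligase is inhibited points to a candidate substrate) can be sketched as a simple fold-change filter. This is not the published GPS analysis pipeline. The function name, the two-fold threshold, and the reporter readings below are all made up for illustration.

```python
def candidate_substrates(control, inhibited, min_fold=2.0):
    """Return reporters whose signal rises at least `min_fold` on ligase inhibition.

    `control` and `inhibited` map reporter names to signal intensities.
    The result maps each hit to its fold change, strongest first.
    """
    hits = {}
    for name, baseline in control.items():
        if name not in inhibited or baseline <= 0:
            continue  # skip reporters without a usable measurement pair
        fold = inhibited[name] / baseline
        if fold >= min_fold:
            hits[name] = round(fold, 2)
    return dict(sorted(hits.items(), key=lambda item: item[1], reverse=True))


# Invented example readings (arbitrary units), not experimental data.
control = {"reporter_A": 100.0, "reporter_B": 250.0, "reporter_C": 80.0}
inhibited = {"reporter_A": 420.0, "reporter_B": 260.0, "reporter_C": 200.0}

print(candidate_substrates(control, inhibited))
# {'reporter_A': 4.2, 'reporter_C': 2.5}
```

In this toy data set reporter_B barely changes, so it would not be flagged as a candidate substrate of the inhibited ligase.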

Possible therapeutic applications
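One proposed strategy is to block specific substrate recognition by the E3 ligases. The approved drug bortezomib acts at a different point in the pathway, by inhibiting the proteasome itself.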

Challenge
Finding a specific molecule that selectively inhibits the activity of a certain E3 ligase and/or the protein–protein interactions implicated in the disease remains an important and expanding research area. Moreover, as ubiquitination is a multi-step process with various players and intermediate forms, the complex interactions between components need to be taken into account when designing small-molecule inhibitors.

Similar proteins
Ubiquitin is the best-understood post-translational modifier, but several families of ubiquitin-like proteins (UBLs) can modify cellular targets in a parallel but distinct route. Known UBLs include: small ubiquitin-like modifier (SUMO), ubiquitin cross-reactive protein (UCRP, also known as interferon-stimulated gene-15, ISG15), ubiquitin-related modifier-1 (URM1), neuronal-precursor-cell-expressed developmentally downregulated protein-8 (NEDD8, also called Rub1 in S. cerevisiae), human leukocyte antigen F-associated (FAT10), autophagy-8 (ATG8) and -12 (ATG12), Fau ubiquitin-like protein (FUB1), MUB (membrane-anchored UBL), ubiquitin fold-modifier-1 (UFM1) and ubiquitin-like protein-5 (UBL5, which is better known as homologous to ubiquitin-1 [Hub1] in S. pombe). Although these proteins share only modest primary sequence identity with ubiquitin, they are closely related three-dimensionally. For example, SUMO shares only 18% sequence identity with ubiquitin, but the two contain the same structural fold, called the "ubiquitin fold". FAT10 and UCRP contain two such folds. This compact globular beta-grasp fold is found in ubiquitin, UBLs, and proteins that comprise a ubiquitin-like domain; for example, the S. cerevisiae spindle pole body duplication protein Dsk2 and the NER protein Rad23 both contain N-terminal ubiquitin domains.

These related molecules have novel functions and influence diverse biological processes. There is also cross-regulation between the various conjugation pathways, since some proteins can become modified by more than one UBL, and sometimes even at the same lysine residue. For instance, SUMO modification often acts antagonistically to ubiquitination and serves to stabilize protein substrates. Proteins conjugated to UBLs are typically not targeted for degradation by the proteasome but rather function in diverse regulatory activities. Attachment of UBLs might alter substrate conformation, affect the affinity for ligands or other interacting molecules, alter substrate localization, and influence protein stability.

UBLs are structurally similar to ubiquitin and are processed, activated, conjugated, and released from conjugates by enzymatic steps similar to the corresponding mechanisms for ubiquitin. UBLs are also translated with C-terminal extensions that are processed to expose the invariant C-terminal LRGG. These modifiers have their own specific E1 (activating), E2 (conjugating) and E3 (ligating) enzymes that conjugate the UBLs to intracellular targets. The conjugation can be reversed by UBL-specific isopeptidases whose mechanisms are similar to those of the deubiquitinating enzymes.

In some species, sperm mitochondria are recognized and destroyed after fertilization through a mechanism involving ubiquitin.

Prokaryotic origins
Ubiquitin is believed to have descended from bacterial proteins similar to ThiS or MoaD. These prokaryotic proteins, despite having little sequence identity (ThiS has 14% identity to ubiquitin), share the same protein fold. These proteins also share sulfur chemistry with ubiquitin. MoaD, which is involved in molybdopterin biosynthesis, interacts with MoeB, which acts like an E1 ubiquitin-activating enzyme for MoaD, strengthening the link between these prokaryotic proteins and the ubiquitin system. A similar system exists for ThiS, with its E1-like enzyme ThiF. It is also believed that the Saccharomyces cerevisiae protein Urm1, a ubiquitin-related modifier, is a "molecular fossil" that provides an evolutionary link between the prokaryotic ubiquitin-like molecules and ubiquitin.

Archaea have a functionally closer homolog of the ubiquitin modification system, in which "sampylation" with SAMPs (small archaeal modifier proteins) is performed. The sampylation system uses only E1 to guide proteins to the proteasome. Proteoarchaeota, which are related to the ancestor of eukaryotes, possess all of the E1, E2, and E3 enzymes plus a regulated Rpn11 system. Unlike SAMPs, which are more similar to ThiS or MoaD, Proteoarchaeota ubiquitins are most similar to their eukaryotic homologs.

Prokaryotic ubiquitin-like protein (Pup) and ubiquitin bacterial (UBact)
Prokaryotic ubiquitin-like protein (Pup) is a functional analog of ubiquitin that has been found in the gram-positive bacterial phylum Actinomycetota. It serves the same function (targeting proteins for degradation), although the enzymology of ubiquitylation and pupylation is different, and the two families share no homology. In contrast to the three-step reaction of ubiquitylation, pupylation requires two steps, so only two enzymes are involved.

In 2017, homologs of Pup were reported in five phyla of gram-negative bacteria, in seven candidate bacterial phyla and in one archaeon. The sequences of the Pup homologs are very different from the sequences of Pup in gram-positive bacteria and were termed Ubiquitin bacterial (UBact), although the distinction has not yet been shown to be phylogenetically supported by a separate evolutionary origin and is without experimental evidence.

The finding of the Pup/UBact-proteasome system in both gram-positive and gram-negative bacteria suggests either that the system evolved in bacteria prior to the split into gram-positive and gram-negative clades over 3000 million years ago, or that it was acquired by different bacterial lineages through horizontal gene transfer(s) from a third, as yet unknown, organism. In support of the second possibility, two UBact loci were found in the genome of an uncultured anaerobic methanotrophic archaeon (ANME-1; locus CBH38808.1 and locus CBH39258.1).

Human proteins containing ubiquitin domain
These include ubiquitin-like proteins.

ANUBL1; BAG1; BAT3/BAG6; C1orf131; DDI1; DDI2; FAU; HERPUD1; HERPUD2; HOPS; IKBKB; ISG15; LOC391257; MIDN; NEDD8; OASL; PARK2; RAD23A; RAD23B; RPS27A; SACS; SF3A1; SUMO1; SUMO2; SUMO3; SUMO4; TMUB1; TMUB2; UBA52; UBB; UBC; UBD; UBFD1; UBL4A; UBL4B; UBL7; UBLCP1; UBQLN1; UBQLN2; UBQLN3; UBQLN4; UBQLNL; UBTD1; UBTD2; UHRF1; UHRF2

Related proteins

 * Ubiquitin-associated protein domain

Prediction of ubiquitination
Currently available prediction programs are:
 * UbiPred is an SVM-based prediction server that uses 31 physicochemical properties to predict ubiquitylation sites.
 * UbPred is a random forest-based predictor of potential ubiquitination sites in proteins. It was trained on a combined set of 266 non-redundant experimentally verified ubiquitination sites drawn from its authors' experiments and from two large-scale proteomics studies.
 * CKSAAP_UbSite is an SVM-based predictor that takes as input the composition of k-spaced amino acid pairs surrounding a query site (i.e. any lysine in a query sequence). It uses the same dataset as UbPred.
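
The "composition of k-spaced amino acid pairs" encoding mentioned for CKSAAP_UbSite can be illustrated with a short sketch. It is not the tool's actual implementation: the function names, the flank size, the "-" padding symbol and the sparse dictionary output are all assumptions chosen for illustration. The example sequence is the 76-residue human ubiquitin, whose lysines fall at the positions K6, K11, K27, K29, K33, K48 and K63.

```python
from collections import Counter

# Human ubiquitin, 76 residues (one-letter amino acid codes).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGK"
    "QLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def lysine_positions(seq):
    """Return the 1-based positions of every lysine (candidate site)."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "K"]


def window(seq, pos, flank, pad="-"):
    """Return the fragment of 2*flank + 1 residues centred on `pos`.

    `pos` is 1-based. Positions beyond the sequence ends are filled
    with `pad` (an assumed placeholder symbol).
    """
    padded = pad * flank + seq + pad * flank
    center = pos - 1 + flank
    return padded[center - flank:center + flank + 1]


def cksaap(fragment, k):
    """Composition of residue pairs separated by k intervening residues.

    Returns a sparse dict mapping each observed pair (e.g. "AK") to its
    frequency among all k-spaced pairs in the fragment.
    """
    gap = k + 1
    pairs = [fragment[i] + fragment[i + gap]
             for i in range(len(fragment) - gap)]
    if not pairs:
        return {}
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}


if __name__ == "__main__":
    for pos in lysine_positions(UBIQUITIN):
        frag = window(UBIQUITIN, pos, flank=7)  # flank=7 is illustrative
        # One composition per spacing; a classifier would take these
        # (usually expanded to a fixed-length vector) as input features.
        features = {k: cksaap(frag, k) for k in range(3)}
        print(f"K{pos}", frag, len(features[0]))
```

In a real predictor the sparse dictionaries would be expanded into fixed-length vectors covering every possible residue pair for each k, so that all query sites share the same feature layout before being passed to a classifier such as an SVM.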

Podcast
Investigating the ubiquitin proteasome system was the focus of a Dementia Researcher Podcast. The podcast was published on 16 August 2021, hosted by Professor Selina Wray from University College London.