User:MWGSchneider/sandbox

Development of the formylglycine aldehyde tag
The aldehyde tag is an artificial peptide tag recognized by the formylglycine-generating enzyme (FGE). Formylglycine is a glycine bearing a formyl group (-CHO, an aldehyde) at the α-carbon. The sequence of the tag is based on the sulfatase motif, within which FGE site-specifically converts a cysteine to a formylglycine residue. The peptide tag was engineered after studies of FGE-recognized sequences in sulfatases from different organisms: Carrico et al. found that the sulfatase motif is highly homologous across bacteria, archaea and eukaryotes.

Aldehydes and ketones are used as chemical reporters because of their strong electrophilicity, which allows reaction under mild conditions with a strongly nucleophilic coupling partner. Typically, hydrazides and aminooxy probes are used in bioconjugation. At neutral pH, the equilibrium of Schiff base formation lies far on the side of the reactants. Hydrazides and aminooxy probes shift it towards the product because they form stabilized addition products with carbonyl groups, namely hydrazones and oximes, which are favoured under physiological reaction conditions. Since the pH optimum of 4 to 6 cannot be reached in live cells and a catalyst cannot be added because of its toxicity, the reaction is slow there. A typical second-order rate constant is 10⁻⁴ to 10⁻³ M⁻¹ s⁻¹. A carbonyl group can be introduced into proteins as a chemical reporter by different techniques, including modern methods such as stop codon suppression and the aldehyde tag discussed here. The use of aldehydes and ketones as chemical reporters is limited by their restricted bioorthogonality in certain cellular environments. These limitations are due to
 * competition with endogenous aldehydes or ketones in metabolites and cofactors, which can lower yields and impair specificity.
 * side reactions such as oxidation or unwanted addition of endogenous nucleophiles.
 * a limited set of probes that form sufficiently stable products.
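
To give a feeling for what a rate constant of 10⁻⁴ to 10⁻³ M⁻¹ s⁻¹ means in practice, the following minimal sketch estimates the half-life of a carbonyl group when the probe is present in large excess (pseudo-first-order conditions, t½ = ln 2 / (k₂ · [probe])). The probe concentration of 1 mM and the function name are illustrative assumptions, not values from the literature cited here.

```python
import math


def pseudo_first_order_half_life(k2, probe_conc):
    """Half-life (in seconds) of the carbonyl reporter.

    k2         -- second-order rate constant in M^-1 s^-1
    probe_conc -- probe concentration in M, assumed to be in large
                  excess so that it stays effectively constant
    """
    if k2 <= 0 or probe_conc <= 0:
        raise ValueError("k2 and probe_conc must be positive")
    k_obs = k2 * probe_conc          # observed first-order rate constant, s^-1
    return math.log(2) / k_obs


PROBE_CONC = 1e-3  # assumed probe concentration: 1 mM (illustrative only)

for k2 in (1e-4, 1e-3):
    half_life_h = pseudo_first_order_half_life(k2, PROBE_CONC) / 3600
    print(f"k2 = {k2:.0e} M^-1 s^-1 -> half-life of about {half_life_h:.0f} h")
```

Under these assumptions the half-life is on the order of hundreds to thousands of hours, which illustrates why the uncatalysed reaction at neutral pH is considered slow.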

Aldehydes and ketones are therefore best used in compartments where such unwanted side reactions are reduced. For experiments with live cells, the cell surface and the extracellular space are typical areas of application. Nevertheless, an attractive feature of carbonyl groups is the vast number of organic reactions in which they act as electrophiles. Some of these reactions can readily be adapted as ligations for probing aldehydes. A rather exotic example, recently employed for bioconjugation by Agarwal et al., is the adaptation of the Pictet-Spengler reaction as a ligation. The reaction is known from natural product biosynthetic pathways and has the major advantage that a new carbon-carbon bond is formed, which provides greater long-term stability than carbon-heteroatom bonds at similar reaction kinetics.

The modification of cysteine or, more rarely, serine by FGE is a rather unusual posttranslational modification that was discovered in the late 1990s. A deficiency of FGE leads to an overall deficiency of functional sulfatases, because α-formylglycine formation is vital for sulfatases to perform their function. Since FGE is essential for this modification, high specificity and a high conversion rate are required in the native setting, which makes the reaction interesting for chemical and synthetic biology. Carrico et al. pioneered the insertion of the modified sulfatase motif peptide into proteins of interest in 2007. Since then, aldehydes and ketones have been used as chemical reporters in bioorthogonal applications such as the self-assembly of cell-lysing drugs, the targeting of proteins and glycans, and the preparation of heterobifunctional fusion proteins.

Genetically encoding the aldehyde tag
The formylglycine tag, or aldehyde tag, is a convenient tag of 6 or 13 amino acids that is fused to a protein of interest. The 6-mer tag represents the small core consensus sequence and the 13-mer tag the longer full motif. The experiments on the genetically encoded aldehyde tag by Carrico et al. showed that conversion is highly efficient even when only the core consensus sequence is present. Four proteins were produced recombinantly in E. coli, with a conversion efficiency of 86 % for the full-length motif and >90 % for the 6-mer, as determined by mass spectrometry. The tag is similar in size to the commonly used 6x His-tag and, like it, can be genetically encoded. The sequence is recognized in the ER solely on the basis of its primary sequence and is subsequently modified by FGE. Notably, when proteins are expressed recombinantly in E. coli, coexpression of exogenous FGE aids full conversion [1], although E. coli has endogenous FGE activity. The workflow for introducing an aldehyde tag as proposed by Carrico et al. consists of three steps: (A) expression of the fusion protein carrying the peptide tag derived from the sulfatase motif, (B) enzymatic conversion of Cys to f(Gly) and (C) bioorthogonal probing with hydrazides or alkoxyamines (Fig. 1).



As seen in Fig. 1, the engineered aldehyde tag consists of six amino acids. To derive it, a set of organisms from all domains of life was chosen and the sequence homology of their sulfatase motifs was determined. The sequence used is the best consensus of the sequences found in bacteria, archaea, worms and higher vertebrates.
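
Because recognition depends only on the primary sequence, candidate FGE sites can be located with a simple pattern search. The sketch below assumes the minimal sulfatase core motif C-x-P-x-R (x = any residue) and uses the 6-mer LCTPSR as the tag; neither sequence is spelled out in the text above, so both are stated here as assumptions, and the example protein is hypothetical.

```python
import re

# Assumption: minimal sulfatase core motif C-x-P-x-R, where x is any residue.
# The lookahead allows overlapping hits to be reported as well.
CORE_MOTIF = re.compile(r"(?=(C.P.R))")


def find_aldehyde_tag_sites(protein_seq):
    """Return a list of (position, motif) tuples.

    position is the 1-based index of the cysteine that FGE would
    convert to formylglycine; motif is the matched 5-residue stretch.
    """
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group(1)) for m in CORE_MOTIF.finditer(seq)]


# Hypothetical fusion protein: short leader, assumed 6-mer tag, His-tag.
tagged = "MKT" + "LCTPSR" + "GSHHHHHH"
# Cys-to-Ala control, analogous to the mutant that is not converted.
mutant = "MKT" + "LATPSR" + "GSHHHHHH"

print(find_aldehyde_tag_sites(tagged))   # [(5, 'CTPSR')]
print(find_aldehyde_tag_sites(mutant))   # []
```

A search of this kind only flags sequence matches; whether a given site is actually converted still has to be confirmed experimentally, for example by mass spectrometry.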

FGE mechanism of cysteine-to-formylglycine conversion
The catalytic mechanism of FGE is well studied. A multistep redox reaction with a covalent enzyme-substrate intermediate is proposed. The role of the cysteine residue in the conversion was studied by mutating the cysteine to alanine: no conversion was detected by mass spectrometry when the mutated peptide tag was used. The mechanism shows the important role of the redox-active thiol group of cysteine in the formation of f(Gly), as seen in Fig. 2. The key step of the catalytic cycle is the monooxidation of the cysteine residue of the enzyme, which forms a reactive sulfenic acid intermediate. Subsequently, the hydroxyl group is transferred to the cysteine of the substrate and, after hetero-analogous β-elimination of H₂O, a thioaldehyde is formed. This compound is very reactive and easily hydrolyzed, releasing the aldehyde and a molecule of H₂S.
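
The net outcome of the mechanism, a side chain going from -CH₂-SH to -CHO, translates into a characteristic mass shift that can be looked for in mass spectra. The following minimal sketch computes that shift from the elemental change (loss of one sulfur and two hydrogens, gain of one oxygen) using standard monoisotopic masses; the function name is illustrative.

```python
# Standard monoisotopic atomic masses in Da.
MONOISOTOPIC_MASS = {"H": 1.007825, "O": 15.994915, "S": 31.972071}


def cys_to_fgly_mass_shift():
    """Monoisotopic mass change (Da) for converting Cys to formylglycine.

    Side chain -CH2-SH becomes -CHO: one S and two H atoms are lost,
    one O atom is gained.
    """
    lost = MONOISOTOPIC_MASS["S"] + 2 * MONOISOTOPIC_MASS["H"]
    gained = MONOISOTOPIC_MASS["O"]
    return gained - lost


shift = cys_to_fgly_mass_shift()
print(f"Cys -> f(Gly) mass shift: {shift:.4f} Da")
```

The converted peptide is therefore expected to be lighter than the unconverted one by about 18 Da, which is the kind of difference used to distinguish the two species when conversion is assessed by mass spectrometry.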