Signal peptide

A signal peptide (sometimes referred to as signal sequence, targeting signal, localization signal, localization sequence, transit peptide, leader sequence or leader peptide) is a short peptide (usually 16-30 amino acids long) present at the N-terminus (or occasionally, nonclassically, at the C-terminus or internally) of most newly synthesized proteins that are destined for the secretory pathway. These proteins include those that reside inside certain organelles (the endoplasmic reticulum, Golgi or endosomes), are secreted from the cell, or are inserted into most cellular membranes. Although most type I membrane-bound proteins have signal peptides, most type II and multi-spanning membrane-bound proteins are targeted to the secretory pathway by their first transmembrane domain, which biochemically resembles a signal sequence except that it is not cleaved. Signal peptides are a kind of target peptide.

Function (translocation)
Signal peptides function to prompt a cell to translocate the protein, usually to the cellular membrane. In prokaryotes, signal peptides direct the newly synthesized protein to the SecYEG protein-conducting channel, which is present in the plasma membrane. A homologous system exists in eukaryotes, where the signal peptide directs the newly synthesized protein to the Sec61 channel, which shares structural and sequence homology with SecYEG, but is present in the endoplasmic reticulum. Both the SecYEG and Sec61 channels are commonly referred to as the translocon, and transit through this channel is known as translocation. While secreted proteins are threaded through the channel, transmembrane domains may diffuse across a lateral gate in the translocon to partition into the surrounding membrane.

Structure
The core of the signal peptide contains a long stretch of hydrophobic amino acids (about 5–16 residues long) that tends to form a single alpha-helix and is referred to as the "h-region". In addition, many signal peptides begin with a short positively charged stretch of amino acids, which may help to enforce proper topology of the polypeptide during translocation by what is known as the positive-inside rule. Because it lies close to the N-terminus, this stretch is called the "n-region". At the end of the signal peptide there is typically a stretch of amino acids that is recognized and cleaved by signal peptidase and is therefore called the cleavage site. This cleavage site is absent from transmembrane domains that serve as signal peptides, which are sometimes referred to as signal anchor sequences. Signal peptidase may cleave either during or after completion of translocation to generate a free signal peptide and a mature protein. The free signal peptides are then digested by specific proteases.

Different target locations are specified by different types of signal peptides. For example, a target peptide that directs a protein to the mitochondrial environment differs in length and shows an alternating pattern of small positively charged and hydrophobic stretches. Nucleus-targeting signal peptides can be found at both the N-terminus and the C-terminus of a protein and are in most cases retained in the mature protein.
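The n-region and h-region described in this section can be illustrated with a simple hydropathy scan. The following is a minimal sketch, not a validated predictor: it assumes the Kyte-Doolittle hydropathy scale, a fixed window of 8 residues, and a crude charge count that ignores histidine. The function names and the toy sequence are hypothetical and chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_core(seq, window=8):
    """Return (start, mean_hydropathy) of the most hydrophobic window.

    The window length is an assumption; ties keep the earliest window.
    """
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_score = 0, float("-inf")
    for start in range(len(seq) - window + 1):
        score = sum(KD[aa] for aa in seq[start:start + window]) / window
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


def net_charge(segment):
    """Crude net charge: +1 per K/R, -1 per D/E (histidine ignored)."""
    return sum((aa in "KR") - (aa in "DE") for aa in segment)


# Hypothetical toy sequence, not taken from any real protein.
toy = "MKKLLLLLLLLASAAQDEK"
start, score = hydrophobic_core(toy)
n_region = toy[:start]            # residues preceding the hydrophobic core
print(start, round(score, 2), n_region, net_charge(n_region))
```

In this toy case the residues before the hydrophobic window carry a net positive charge, which mirrors the n-region/h-region arrangement. Real prediction tools use trained statistical models rather than a single sliding window.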

It is possible to determine the amino acid sequence of the N-terminal signal peptide by Edman degradation, a cyclic procedure that cleaves off the amino acids one at a time.
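The cyclic nature of this procedure can be sketched as a loop that removes one N-terminal residue per cycle. This is only a conceptual model: the chemistry of each cycle is not represented, and the function name and toy peptide are hypothetical.

```python
def edman_cycles(peptide, n_cycles):
    """Yield (cycle, released_residue, remaining_peptide) for each cycle.

    Each cycle releases the current N-terminal residue; the loop stops
    early if the peptide is exhausted.
    """
    remaining = peptide
    for cycle in range(1, n_cycles + 1):
        if not remaining:
            break
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


toy_peptide = "MKKLLA"  # hypothetical peptide, for illustration only
reads = [residue for _, residue, _ in edman_cycles(toy_peptide, 4)]
print("".join(reads))
```

Reading the released residues in cycle order reconstructs the N-terminal sequence one position at a time.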

Co-translational versus post-translational translocation
In both prokaryotes and eukaryotes signal sequences may act co-translationally or post-translationally.

The co-translational pathway is initiated when the signal peptide emerges from the ribosome and is recognized by the signal-recognition particle (SRP). SRP then halts further translation (translational arrest occurs only in eukaryotes) and directs the signal sequence-ribosome-mRNA complex to the SRP receptor, which is present on the surface of either the plasma membrane (in prokaryotes) or the ER (in eukaryotes). Once membrane targeting is completed, the signal sequence is inserted into the translocon. Ribosomes are then physically docked onto the cytoplasmic face of the translocon and protein synthesis resumes.

The post-translational pathway is initiated after protein synthesis is completed. In prokaryotes, the signal sequence of post-translational substrates is recognized by the SecB chaperone protein, which transfers the protein to the SecA ATPase, which in turn pumps the protein through the translocon. Although post-translational translocation is known to occur in eukaryotes, it is poorly understood. In yeast, it is known to require the translocon and two additional membrane-bound proteins, Sec62 and Sec63.

Secretion efficiency determination
Signal peptides are extremely heterogeneous, and many prokaryotic and eukaryotic ones are functionally interchangeable within or between species. All of them determine protein secretion efficiency.

Nucleotide level features
In vertebrates, the region of the mRNA that codes for the signal peptide (i.e. the signal sequence coding region, or SSCR) can function as an RNA element with specific activities. SSCRs promote nuclear mRNA export and the proper localization of the mRNA to the surface of the endoplasmic reticulum. In addition, SSCRs have specific sequence features: they have low adenine content, are enriched in certain motifs, and tend to be present in the first exon at a frequency that is higher than expected.
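The low adenine content mentioned here is a simple compositional measure. The sketch below, assuming plain nucleotide strings as input, computes the fraction of adenine in a sequence. The function name and both toy sequences are hypothetical and do not come from any real gene.

```python
def adenine_fraction(nucleotides):
    """Return the fraction of positions that are adenine (A)."""
    seq = nucleotides.upper()
    if not seq:
        raise ValueError("empty sequence")
    return seq.count("A") / len(seq)


# Hypothetical sequences: an adenine-poor stretch standing in for an SSCR
# and an adenine-rich stretch standing in for a downstream coding region.
toy_sscr = "ATGCTGCTGCTGGCCCTGCTG"
toy_rest = "GAAAAGACAGAAAACAAAGAA"

print(adenine_fraction(toy_sscr), adenine_fraction(toy_rest))
```

Comparing the value for a signal sequence coding region against the rest of the transcript is one way to express the depletion numerically. Any threshold for "low" would be an additional assumption.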

Alternate secretion mechanisms
Proteins without signal peptides can also be secreted by unconventional mechanisms; examples include interleukin and galectin. The process by which such secretory proteins gain access to the cell exterior is termed unconventional protein secretion (UPS). In plants, as many as 50% of secreted proteins can be UPS dependent.

Nonclassical sequences
Signal peptides are usually located at the N-terminus of proteins. Some proteins have C-terminal or internal signal peptides (examples: peroxisomal targeting signal and nuclear localisation signal). The structure of these nonclassical signal peptides differs vastly from that of the N-terminal signal peptides.

Nomenclature
Signal peptides are not to be confused with the leader peptides sometimes encoded by leader mRNA, although both are sometimes ambiguously referred to as "leader peptides." These other leader peptides are short polypeptides that do not function in protein localization, but instead may regulate transcription or translation of the main protein, and are not part of the final protein sequence. This type of leader peptide primarily refers to a form of gene regulation found in bacteria, although a similar mechanism, involving uORFs (upstream open reading frames), is used to regulate eukaryotic genes.