Nonribosomal code

The nonribosomal code is the set of key amino acid residues, and their positions within the primary sequence of an adenylation domain of a nonribosomal peptide synthetase, that is used to predict substrate specificity and thus (partially) the final product. It is analogous to predicting peptide composition by reading DNA/RNA codons, which rests on the central dogma of molecular biology and is accomplished with the genetic code simply by following the DNA codon table or RNA codon table. However, predicting natural products/secondary metabolites with the nonribosomal code is not as reliable as codon-to-amino acid translation, and much research is still needed before a broadly applicable code is available. The increasing number of sequenced genomes and the availability of high-throughput prediction software have allowed better elucidation of substrate specificity and thus of natural products/secondary metabolites. Enzyme characterization helps support predictive algorithms, for example ATP-pyrophosphate exchange assays for substrate specificity, in silico modelling of the substrate-binding pocket, and structure-function mutagenesis (in vitro tests or in silico modelling). Much of this research has been done on bacteria and fungi, with bacterial products being easier to predict.

The nonribosomal peptide synthetase (NRPS), a multi-modular enzyme complex, minimally contains repeating sets of three domains: adenylation (A), peptidyl carrier protein (PCP) and condensation (C). The adenylation domain is the focus for substrate specificity since it is the initiating, substrate-recognizing domain. In one example, alignments of the adenylation substrate-binding pocket (defined by 10 residues within it) produced clusters with defined specificity (i.e. the residues of the enzyme pocket can predict the nonribosomal peptide sequence). In silico mutations of substrate-determining residues also led to altered or relaxed specificity. Additionally, the NRPS collinearity principle (or rule) states that, given the order of adenylation domains (and their substrate-specificity codes) along the NRPS, one can predict the amino acid sequence of the small peptide produced. NRPS-like and NRPS-PKS complexes also exist, with domain variations, additions and/or exclusions.
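The collinearity rule can be illustrated with a minimal Python sketch. It assumes that a substrate has already been predicted for each adenylation domain and that the domains are listed in their order along the NRPS. The function name predict_peptide, the placeholder "Xaa" for an unknown specificity, and the example module order are all hypothetical and do not describe any characterized synthetase.

```python
from typing import List, Optional


def predict_peptide(a_domain_substrates: List[Optional[str]]) -> str:
    """Apply the collinearity rule: one residue per A domain, in module order.

    Each entry is the substrate predicted for one adenylation domain,
    or None when no specificity could be assigned (shown as 'Xaa').
    """
    return "-".join(
        substrate if substrate is not None else "Xaa"
        for substrate in a_domain_substrates
    )


# Hypothetical four-module NRPS; the third A domain has no prediction.
modules = ["Asp", "Orn", None, "Val"]
print(predict_peptide(modules))  # Asp-Orn-Xaa-Val
```

The sketch covers only the strictly collinear case. NRPS-like and NRPS-PKS systems with added, missing or varied domains would need extra handling.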

Supporting examples
The A domains have nonribosomal signatures that are 8 amino acids long, for example:

LTKVGHIG → Asp (Aspartic acid)

VGEIGSID → Orn (Ornithine)

AWMFAAVL → Val (Valine)
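
As a rough illustration of how such signatures might be used for prediction, the Python sketch below stores the three signatures listed above in a lookup table and assigns a query signature to its closest known entry. Nearest-match scoring by Hamming distance and the max_mismatches cutoff are simplifying assumptions made for this example, not the method of any particular prediction tool, and the names SIGNATURES, hamming and predict_substrate are hypothetical.

```python
from typing import Optional, Tuple

# The three signatures listed above; a real table would hold many more entries.
SIGNATURES = {
    "LTKVGHIG": "Asp",
    "VGEIGSID": "Orn",
    "AWMFAAVL": "Val",
}


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length signatures differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    return sum(x != y for x, y in zip(a, b))


def predict_substrate(
    signature: str, max_mismatches: int = 2
) -> Optional[Tuple[str, int]]:
    """Return (substrate, mismatches) for the closest known signature.

    Returns None when even the best match differs at more than
    max_mismatches positions (an arbitrary cutoff chosen for this sketch).
    """
    best = min(SIGNATURES, key=lambda known: hamming(signature, known))
    distance = hamming(signature, best)
    if distance > max_mismatches:
        return None
    return SIGNATURES[best], distance


print(predict_substrate("LTKVGHIG"))  # exact match to the Asp signature
print(predict_substrate("LTKVGHIA"))  # one mismatch from the Asp signature
print(predict_substrate("XXXXXXXX"))  # no close match, so None
```

Treating every position as equally informative is a deliberate simplification. In practice, some pocket residues may matter more for specificity than others.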