Computational gene



A computational gene is a molecular automaton consisting of a structural part and a functional part, designed so that it might work in a cellular environment.

The structural part is a naturally occurring gene, which is used as a skeleton to encode the input and the transitions of the automaton (Fig. 1A). The conserved features of a structural gene (e.g., RNA polymerase binding site, start and stop codons, and splice sites) serve as constants of the computational gene, while the coding regions, the number of exons and introns, the positions of the start and stop codons, and the automata-theoretic variables (symbols, states, and transitions) are its design parameters. The constants and the design parameters are linked by several logical and biochemical constraints (e.g., encoded automata-theoretic variables must not be recognized as splice junctions). The inputs of the automaton are molecular markers given by single-stranded DNA (ssDNA) molecules. These markers signal an aberrant (e.g., carcinogenic) molecular phenotype and turn on the self-assembly of the functional gene. If the input is accepted, the output encodes a double-stranded DNA (dsDNA) molecule, a functional gene that should be successfully integrated into the cellular transcription and translation machinery, producing a wild-type protein or an anti-drug (Fig. 1B). Otherwise, a rejected input assembles into a partially dsDNA molecule that cannot be translated.
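The accept/reject behaviour described above can be illustrated with a minimal sketch of a finite automaton. This is only a toy abstraction, not the published design: the class name `MolecularAutomaton`, the state names `q0` and `q1`, the symbol `mutation_marker`, and the output labels are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MolecularAutomaton:
    """Toy finite automaton; all names here are illustrative assumptions."""
    transitions: dict      # maps (state, symbol) -> next state
    start: str
    accepting: frozenset

    def accepts(self, symbols) -> bool:
        state = self.start
        for symbol in symbols:
            nxt = self.transitions.get((state, symbol))
            if nxt is None:          # no transition encoded: reject
                return False
            state = nxt
        return state in self.accepting


def assemble(automaton: MolecularAutomaton, markers) -> str:
    """Map acceptance to the two outcomes described in the text."""
    if automaton.accepts(markers):
        return "dsDNA"            # complete functional gene
    return "partial dsDNA"        # incomplete assembly, not translated


# Hypothetical one-symbol automaton: a single marker switches the gene on.
toy = MolecularAutomaton(
    transitions={("q0", "mutation_marker"): "q1"},
    start="q0",
    accepting=frozenset({"q1"}),
)
```

In this sketch, `assemble(toy, ["mutation_marker"])` yields the complete molecule, while an empty or unrecognised input yields the partial one. The biochemical constraints on the encoding (such as avoiding splice junctions) are deliberately left out.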

A potential application: in situ diagnostics and therapy of cancer
Computational genes might be used in the future to correct aberrant mutations in a gene or group of genes that can trigger disease phenotypes. One of the most prominent examples is the p53 tumor suppressor gene, which is present in every cell and acts as a guard to control growth. Mutations in this gene can abolish its function, allowing uncontrolled growth that can lead to cancer. For instance, a mutation at codon 249 of p53 is characteristic of hepatocellular cancer. This disease could be treated by the CDB3 peptide, which binds to the p53 core domain and stabilises its fold.

A single disease-related mutation can then be diagnosed and treated by a diagnostic rule, referred to as rule (1), of the following form: if the disease-related mutation is present, then produce the drug.

Such a rule might be implemented by a molecular automaton consisting of two partially dsDNA molecules and one ssDNA molecule; the ssDNA molecule corresponds to the disease-related mutation and provides a molecular switch for the linear self-assembly of the functional gene (Fig. 2). The gene structure is completed by a cellular ligase, which is present in both eukaryotic and prokaryotic cells. The transcription and translation machinery of the cell is then in charge of therapy and administers either a wild-type protein or an anti-drug (Fig. 3). Rule (1) may even be generalised to involve mutations in different proteins, allowing a combined diagnosis and therapy.
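The switch-like role of the ssDNA molecule can be sketched as a toy hybridisation check. The model below is an assumption made for illustration only: it treats each partially dsDNA molecule as exposing one single-stranded overhang, and lets a marker bridge two fragments when it is the exact reverse complement of the two overhangs read together. The sequences and the function names `bridges` and `diagnose` are hypothetical, and real hybridisation and ligation are far less idealised.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bridges(left_overhang: str, right_overhang: str, marker: str) -> bool:
    """True if the marker pairs perfectly across both overhangs (toy model)."""
    return marker == reverse_complement(left_overhang + right_overhang)


def diagnose(gaps, markers) -> bool:
    """Generalised rule: every gap must be bridged by some marker.

    `gaps` is a list of (left_overhang, right_overhang) pairs; with one
    pair this reduces to the single-mutation case.
    """
    return all(
        any(bridges(left, right, m) for m in markers)
        for left, right in gaps
    )


# Hypothetical overhangs and the marker that would close the gap.
LEFT, RIGHT = "GATTA", "CCAGT"
MARKER = reverse_complement(LEFT + RIGHT)
```

With these assumed sequences, `diagnose([(LEFT, RIGHT)], [MARKER])` is true and the assembly would be complete; with no matching marker present it is false, corresponding to the partially dsDNA outcome. Passing several gaps models the combined diagnosis, where all the mutations must be present.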

In this way, computational genes might allow a therapy to be implemented in situ as soon as the cell starts developing defective material. Computational genes combine the techniques of gene therapy, which allow an aberrant gene in the genome to be replaced by its healthy counterpart, with the ability to silence gene expression (similar to antisense technology).

Challenges
Although computational genes are mechanistically simple and quite robust at the molecular level, several issues need to be addressed before an in vivo implementation can be considered.

First, the DNA material must be internalised into the cell, specifically into the nucleus. The transfer of DNA or RNA through biological membranes is a key step in drug delivery. Some results show that nuclear localisation signals can be irreversibly linked to one end of the oligonucleotides, forming an oligonucleotide-peptide conjugate that allows effective internalisation of DNA into the nucleus.

In addition, the DNA complexes should have low immunogenicity to guarantee their integrity in the cell and their resistance to cellular nucleases. Current strategies to eliminate nuclease sensitivity include modifications of the oligonucleotide backbone such as methylphosphonate and phosphorothioate (S-ODN) oligodeoxynucleotides, but along with their increased stability, modified oligonucleotides often have altered pharmacologic properties.

Finally, like any other drug, DNA complexes could cause nonspecific and toxic side effects. In vivo applications of antisense oligonucleotides showed that toxicity is largely due to impurities in the oligonucleotide preparation and to a lack of specificity of the particular sequence used.

Progress in antisense biotechnology should therefore also directly benefit the computational gene model.