Structural gene

A structural gene is a gene that codes for any RNA or protein product other than a regulatory factor (i.e. regulatory protein). The term derives from work on the lac operon. Structural genes are typically viewed as those containing sequences of DNA corresponding to the amino acids of a protein that will be produced, as long as that protein does not function to regulate gene expression. Structural gene products include enzymes and structural proteins. Structural genes also encode non-coding RNAs, such as rRNAs and tRNAs (but excluding any regulatory miRNAs and siRNAs).

Placement in the genome
In prokaryotes, structural genes of related function are typically adjacent to one another on a single strand of DNA, forming an operon. This permits simpler regulation of gene expression, as a single regulatory factor can affect transcription of all associated genes. This is best illustrated by the well-studied lac operon, in which three structural genes (lacZ, lacY, and lacA) are all regulated by a single promoter and a single operator. Prokaryotic structural genes are transcribed into a polycistronic mRNA and subsequently translated.
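To illustrate what "polycistronic" means, here is a minimal Python sketch that scans one transcript and reports each open reading frame as a separate cistron. It is a deliberate simplification: it looks only for an AUG start codon and the first in-frame stop codon, and it ignores ribosome binding sites and overlapping reading frames. The function name find_cistrons and the toy sequence are hypothetical and are not real lac operon sequence.

```python
# Naive model of a polycistronic mRNA: several coding regions on one transcript.
# Assumption: each cistron starts at AUG and ends at the first in-frame stop codon.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_cistrons(mrna):
    """Return the coding regions found by a left-to-right scan of the transcript."""
    cistrons = []
    i = 0
    while i <= len(mrna) - 3:
        if mrna[i:i + 3] == "AUG":
            # Read codon by codon, in frame, until a stop codon or the end.
            j = i
            while j <= len(mrna) - 3 and mrna[j:j + 3] not in STOP_CODONS:
                j += 3
            if j <= len(mrna) - 3:
                # A stop codon was found: record the cistron, stop included.
                cistrons.append(mrna[i:j + 3])
                i = j + 3  # resume scanning downstream of this cistron
                continue
        i += 1
    return cistrons


# Hypothetical toy transcript carrying three short coding regions.
toy_mrna = "GG" + "AUGAAAUAA" + "CCGG" + "AUGCCCGGGUAG" + "AA" + "AUGUUUUGA"
cistrons = find_cistrons(toy_mrna)
for number, cistron in enumerate(cistrons, start=1):
    print(number, cistron)
```

In this toy model, a single transcript yields three separate coding regions, mirroring how lacZ, lacY, and lacA are carried on one mRNA.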

In eukaryotes, structural genes of related function are generally not placed adjacent to one another. Each gene is instead composed of coding exons and interspersed non-coding introns. Regulatory sequences are typically found in non-coding regions upstream and downstream from the gene. Structural gene transcripts (pre-mRNAs) must be spliced prior to translation to remove intronic sequences. This in turn lends itself to the eukaryotic phenomenon of alternative splicing, in which a single pre-mRNA from a single structural gene can produce several different proteins depending on which exons are included. Despite the complexity of this process, it is estimated that up to 94% of human genes are alternatively spliced. Furthermore, different splicing patterns occur in different tissue types.
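The combinatorial effect of alternative splicing can be sketched in a few lines of Python. The example assumes the simplest case, in which some internal exons are "cassette" exons that are either included or skipped as a whole. Real splicing also involves alternative splice sites, retained introns, and tissue-specific regulation, none of which are modelled here. The function name splice_isoforms, the exon names, and the sequences are hypothetical.

```python
from itertools import product


def splice_isoforms(exons, optional):
    """Enumerate mature mRNAs from ordered exons.

    exons    -- ordered list of (name, sequence) pairs
    optional -- set of exon names that may be skipped (cassette exons)
    """
    optional_names = [name for name, _ in exons if name in optional]
    isoforms = []
    # Each optional exon is independently included (True) or skipped (False).
    for choices in product([True, False], repeat=len(optional_names)):
        included = dict(zip(optional_names, choices))
        # Exons not listed as optional are always kept.
        kept = [(name, seq) for name, seq in exons if included.get(name, True)]
        label = "-".join(name for name, _ in kept)
        mrna = "".join(seq for _, seq in kept)
        isoforms.append((label, mrna))
    return isoforms


# Hypothetical gene with four exons; E2 and E3 are assumed to be cassette exons.
toy_exons = [
    ("E1", "AUGGCU"),
    ("E2", "GAAGAA"),
    ("E3", "CCUCCU"),
    ("E4", "UGGUAA"),
]
isoforms = splice_isoforms(toy_exons, optional={"E2", "E3"})
for label, mrna in isoforms:
    print(label, mrna)
```

With k independently skippable exons this model gives 2^k isoforms, which suggests how a modest number of genes can encode a much larger set of proteins.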

An exception to this layout in eukaryotes is the genes for histone proteins, which lack introns entirely. Also distinct are the rDNA clusters of structural genes, in which the 28S, 5.8S, and 18S sequences are adjacent, separated by short internal transcribed spacers. Likewise, the 45S rDNA occurs at five distinct places in the genome, but is clustered into adjacent repeats. In eubacteria, rRNA genes are organized into operons. However, in archaebacteria these genes are non-adjacent and exhibit no linkage.

Role in human disease
Identifying the genetic basis of a disease's causative agent can be an important component of understanding its effects and spread. The location and content of structural genes can elucidate the evolution of virulence, as well as provide information necessary for treatment. Likewise, understanding the specific changes in structural gene sequences underlying a gain or loss of virulence aids in understanding the mechanism by which diseases affect their hosts.

For example, Yersinia pestis (the causative agent of bubonic plague) was found to carry several virulence and inflammation-related structural genes on plasmids. Likewise, the structural gene for the toxin responsible for tetanus was determined to be carried on a plasmid. Diphtheria is caused by a bacterium, but only after that bacterium has been infected by a bacteriophage carrying the structural genes for the toxin.

In Herpes simplex virus, the structural gene sequence responsible for virulence was found in two locations in the genome despite only one location actually producing the viral gene product. This was hypothesized to serve as a potential mechanism for strains to regain virulence if lost through mutation.

Understanding the specific changes in structural genes underlying a gain or loss of virulence is a necessary step in the development of specific treatments, as well as in the study of possible medicinal uses of toxins.

Phylogenetics
As far back as 1974, DNA sequence similarity was recognized as a valuable tool for determining relationships among taxa. Structural genes in general are more highly conserved due to functional constraint, and so can prove useful in examinations of more disparate taxa. Original analyses enriched samples for structural genes via hybridization to mRNA.
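As a rough illustration of how sequence similarity can be turned into a measure of relatedness, the Python sketch below computes percent identity between pre-aligned sequences of a conserved gene. It assumes the sequences are already aligned to equal length, with "-" marking gaps. Real phylogenetic analyses use proper alignment and substitution models rather than raw identity. The function name percent_identity, the taxon labels, and the sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of identical positions between two aligned sequences.

    Positions where either sequence has a gap ("-") are not compared.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned fragments of one structural gene from three taxa.
toy_alignment = {
    "taxon_a": "ATGGCCAAGT",
    "taxon_b": "ATGGCCAAGA",
    "taxon_c": "ATGACTAAGA",
}

names = sorted(toy_alignment)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        identity = percent_identity(toy_alignment[first], toy_alignment[second])
        print(first, second, identity)
```

In this toy data set, taxon_a and taxon_b share the most positions, so a similarity-based grouping would place them closest together.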

More recent phylogenetic approaches have focused on structural genes of known function, conserved to varying degrees. rRNA sequences are frequent targets, as they are conserved in all species. Microbiology has specifically targeted the 16S rRNA gene to determine species-level differences. In higher-order taxa, COI is now considered the "barcode of life" and is applied for most biological identification.

Debate
Despite the widespread classification of genes as either structural or regulatory, these categories are not an absolute division. Recent genetic discoveries call into question the distinction between regulatory and structural genes.

The distinction between regulatory and structural genes can be traced to the original 1959 work on lac operon protein expression. In that work, a single regulatory protein was detected that affected the transcription of the other proteins now known to compose the lac operon. From that point forward, the two types of coding sequences were treated as separate.

However, a growing number of discoveries about gene regulation suggest greater complexity. Structural gene expression is regulated by numerous factors, including epigenetics (e.g. methylation), RNAi, and more. Regulatory and structural genes can be epigenetically regulated in the same way, so not all regulation is coded for by "regulatory genes".

There are also examples of proteins that do not clearly fit either category, such as chaperone proteins. These proteins aid in the folding of other proteins, a seemingly regulatory role. Yet these same proteins also aid in the movement of their chaperoned proteins across membranes, and have now been implicated in immune responses (see Hsp60) and in the apoptotic pathway (see Hsp70).

More recently, microRNAs were found to be produced from the internal transcribed spacers of rRNA genes. Thus an internal component of a structural gene is, in fact, regulatory. Binding sites for microRNAs were also detected within the coding sequences of genes. Typically, interfering RNAs target the 3′ UTR, but the inclusion of binding sites within the coding sequence itself allows the transcripts of these proteins to effectively regulate the microRNAs within the cell. This interaction was demonstrated to have an effect on expression, and thus again a structural gene contains a regulatory component.