User:Llyon225/sandbox

Need comprehensive description of the genome sequencing methodology, assembly and annotation techniques

'''Looks good now on genome statistics (size, GC content, # of genes, etc.), and a catalog of metabolic potential based on KEGG. *Also got feedback from Thrash that "Genome sequencing methods incorrect as the reported methods only focused on 16S sequencing"'''

Text from Genome section of Deinococcus frigens page to be added to:

Taxonomy
Deinococcus frigens is an extremophilic, Gram-positive coccus. The Deinococcus genus is generally known for its resistance to very high doses of radiation, and D. frigens is no exception. The species name “frigens” refers to the harsh, cold climate of Antarctica, where this microbe is found.

DNA sequences from six isolates found in the McMurdo Valley were determined by extracting genomic DNA, amplifying the 16S rDNA by PCR, and sequencing the PCR products. High-molecular-weight DNA was obtained and purified using Marmur's technique: lysing the cells, removing cell debris by centrifugation, denaturing proteins, removing RNA with RNase, and precipitating the DNA with isopropanol. The amplified 16S rDNA sequences were then aligned with sequences from previously identified bacterial lines of descent. Comparison against sequence databases showed that all six isolates belong to the Deinococcus lineage; however, they form three coherent clusters, separate from other Deinococcus species. DNA-DNA similarity data, obtained by DNA hybridization, show that these three clusters represent three new species of Deinococcus, which were named D. frigens, Deinococcus saxicola, and Deinococcus marmoris. Based on 16S rRNA sequence comparison, D. frigens has 97.3% similarity to D. saxicola and 96.6% similarity to D. marmoris. The closest relative of these three more recently discovered species is Deinococcus radiopugnans, which shows 96.1% similarity.

The full scientific classification of this species is Kingdom Bacteria, Phylum Deinococcus-Thermus, Class Deinococci, Order Deinococcales, Family Deinococcaceae, Genus Deinococcus, Species D. frigens.
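As an illustration of how a 16S rRNA similarity percentage such as those quoted above can be derived from two already-aligned sequences, here is a minimal sketch. It assumes a simple definition of similarity (identical bases divided by aligned columns in which neither sequence has a gap); the original study may have used a different definition or software. The function name `percent_identity` and the toy sequences are hypothetical and are not real Deinococcus data.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap are ignored,
    and similarity = matches / compared columns * 100.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
print(percent_identity("ACGTACGTAC", "ACGTACGTTC"))  # 9 of 10 columns match
```

In practice the two 16S sequences would first be aligned with a dedicated alignment tool; this sketch covers only the final counting step.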

The full genome of D. frigens was sequenced by the DOE Joint Genome Institute (JGI) using Illumina HiSeq 2000 technology. The genome was then annotated following the standard DOE-JGI Microbial Genome Annotation Pipeline, which consists of quality-control pre-processing, structural annotation, and functional annotation. The assembly method was vpAllpaths v.r46652, and gene calling was performed with Prodigal 2.5. This information was collected and entered into the Joint Genome Institute's database by Dr. Nikos Kyrpides and Dr. Tanja Woyke.

The genome of D. frigens consists of 2,015,889 base pairs of DNA with a GC content of 65.5%. Of the 4,057 genes identified in D. frigens, 3,987 are protein-coding. The JGI IMG database lists the D. frigens genes that are associated with metabolic pathways in the KEGG database. For carbohydrate metabolism, the genome contains the genes necessary for the TCA cycle, glycolysis, gluconeogenesis, the pentose phosphate pathway, and pyruvate metabolism. KEGG also indicates that D. frigens encodes proteins for converting fructose to glucose and galactose to glucose, as well as for the complete glycolysis pathway. Additionally, the genome includes genes necessary for extracellular nitrate and nitrite transport, assimilatory reduction of nitrate to nitrite, assimilatory reduction of nitrite to ammonia, and sulfite reduction. The electron transport chain of D. frigens is made up of five complexes: NADH dehydrogenase, succinate dehydrogenase, cytochrome bc1 complex, cytochrome c oxidase, and ATP synthase. Unlike its close relative Deinococcus radiodurans, D. frigens has no flagellar assembly for movement.
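The genome statistics above (GC content and the share of protein-coding genes) are simple ratios. The following sketch shows how they could be computed, assuming GC content is defined as G plus C bases divided by all unambiguous A, C, G, and T bases. The function names and the short test sequence are hypothetical; only the gene counts are taken from the text.

```python
def gc_content(sequence: str) -> float:
    """GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def coding_fraction(protein_coding: int, total_genes: int) -> float:
    """Percentage of genes that are protein-coding."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return 100.0 * protein_coding / total_genes


# Gene counts reported in the text: 3,987 protein-coding of 4,057 genes.
print(round(coding_fraction(3987, 4057), 1))  # 98.3

# Toy sequence (hypothetical); a real run would read the assembled genome.
print(gc_content("GGCCAT"))
```

Applied to the full assembled sequence, `gc_content` would be expected to reproduce the reported 65.5% only if the database uses the same definition, which is an assumption here.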