SoyBase Database

SoyBase is a database of soybean genetic information created by the United States Department of Agriculture. It includes genetic maps, information about Mendelian genetics, and molecular data on genes and sequences. Started in 1990, it is freely available to individuals and organizations worldwide.

History
SoyBase was instituted by the Corn Insects and Crop Genetics Research Unit (CICGRU) in Ames, Iowa, as a central repository for the soybean genetics community's published information. Originally, the database concentrated on genetic information such as genetic linkage maps and other Mendelian information. SoyBase genetic maps are a manually curated composite of all published mapping and QTL studies, and thus provide a species-level view of markers and QTL.

In 2010 the soybean genome sequence was released along with gene models and many other types of genome annotations, all of which were integrated into SoyBase. SoyBase genetic linkage maps were integral to the assembly of the soybean genome.

In 2018 the database received approximately 63,000 page requests per month, from 2,600 users in 130 countries. About 40 organizations in the United States and 82 foreign educational institutions access SoyBase yearly. SoyBase supplies data to U.S. and foreign government organizations and corporate entities.

Data submission and release policy
Data are accepted from the original source generators only. Users who independently identify data for inclusion in the database can contact SoyBase directly. A number of Excel-based spreadsheet templates are available to facilitate the inclusion of data in SoyBase.

All data in SoyBase are available without restrictions. A number of data sub-setting and download tools are provided, and, when needed, ad hoc subsets of the data can be requested from the SoyBase Curator.

Search tool
The SoyBase Database Search Tool uses a text entry box for queries. Results are returned both as text and as graphical displays, which present soybean genetic and genomic data using Generic Model Organism Database (GMOD) open-source software. In addition to SoyBase objects identified by exact lexical matches to the query term, the tool also uses a soybean-specific ontology to identify biologically related SoyBase objects.
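The two-stage behaviour described above (lexical matching followed by ontology-based expansion) can be illustrated with a minimal Python sketch. This is not SoyBase's actual implementation: the records, the ontology terms, and the function names are all hypothetical, and "lexical match" is simplified here to a case-insensitive substring test on the object name.

```python
from collections import defaultdict

# Hypothetical toy records: (object name, ontology term it is annotated with).
RECORDS = [
    ("Seed oil 1-1", "seed oil content"),
    ("Seed protein 2-3", "seed protein content"),
    ("Pod number 4-1", "pod number"),
]

# Hypothetical ontology, stored as child term -> parent term.
PARENT = {
    "seed oil content": "seed composition trait",
    "seed protein content": "seed composition trait",
    "pod number": "yield trait",
}


def descendants(term):
    """Return the term itself plus every term below it in the toy ontology."""
    children = defaultdict(list)
    for child, parent in PARENT.items():
        children[parent].append(child)
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children[current])
    return found


def search(query):
    """Return (lexical matches, ontology-related matches) for a query string."""
    q = query.lower()
    # Stage 1: objects whose name contains the query text.
    lexical = [name for name, _ in RECORDS if q in name.lower()]
    # Stage 2: objects annotated with the query term or any term beneath it.
    terms = descendants(q)
    related = [name for name, term in RECORDS
               if term in terms and name not in lexical]
    return lexical, related
```

In this sketch, a query for a broad ontology term such as "seed composition trait" finds no object by name but still returns the objects annotated with its child terms, which is the kind of biologically related result that a purely lexical search would miss.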

Some SoyBase sequence data and annotations are available through an InterMine instance (SoyMine), which is a collaboration with the Legume Information System Project.

Graphical displays
Genetic maps contain information on markers (SSR, RFLP, SNP, etc.), genes, and quantitative trait loci (QTL) from both biparental and genome-wide association studies (GWAS). Soybean genetic maps are displayed using the CMap comparative genetic map viewer. Soybean genomic sequence and gene model data are displayed using the GBrowse sequence viewer. Other genome annotations in this viewer include epigenetic data, such as DNA methylation, and gene expression data from different soybean tissues and cultivars and from strains subjected to different treatments. Metabolic data and biochemical pathway information are displayed using Pathway Tools. Soybean metabolic pathway information (SoyCyc) was inferred by the Plant Metabolic Network project and was used to populate the Pathway Tools displays.