User:Still life with noodles/Taxonomic database

Goals
Taxonomic databases digitize scientific biodiversity data and provide access to taxonomic data for research. Taxonomic databases vary in breadth of the groups of taxa and geographical space they seek to include, for example: beetles in a defined region, mammals globally, or all described taxa in the tree of life. A taxonomic database may incorporate organism identifiers (scientific name, author, and – for zoological taxa – year of original publication), synonyms, taxonomic opinions, literature sources or citations, illustrations or photographs, and biological attributes for each taxon (such as geographic distribution, ecology, descriptive information, threatened or vulnerable status, etc.). Some databases, such as the Global Biodiversity Information Facility (GBIF) database and the Barcode of Life Data System, store the DNA barcode of a taxon if one exists (also called the Barcode Index Number (BIN) which may be assigned, for example, by the International Barcode of Life project (iBOL) or UNITE, a database for fungal DNA barcoding).

A taxonomic database aims to accurately model the characteristics of interest that are relevant to the organisms which are in scope for the intended coverage and usage of the system. For example, databases of fungi, algae, bryophytes and vascular plants ("higher plants") encode conventions from the International Code of Botanical Nomenclature while their counterparts for animals and most protists encode equivalent rules from the International Code of Zoological Nomenclature. Modelling the relevant taxonomic hierarchy for any taxon is a natural fit with the relational model employed in almost all database systems. Scientific consensus is not reached for all taxon groups, and new species continue to be described; therefore, another goal of taxonomic databases is to aid in resolving conflicts of scientific opinion and unify taxonomy.

Issues
The representation of taxonomic information in machine-encodable form raises a number of issues not encountered in other domains, such as variant ways to cite the same species or other taxon name, the same name used for multiple taxa (homonyms), multiple non-current names for the same taxon (synonyms), changes in name and taxon concept definition through time, and more. Non-standardized categories and metadata in taxonomic databases hampers the ability for researchers to analyze the data. One forum that has promoted discussion and possible solutions to these and related problems since 1985 is the Biodiversity Information Standards (TDWG), originally called the Taxonomic Database Working Group.

While online databases have great benefits, such as increased access to taxonomic information, they also have issues such as data integrity risks in on- and off-line versions due to continuous updates, technical access issues due to server or internet outage, and differing capacities for complex queries to extract taxonomic data into lists. As the quantity of information in online taxonomic databases rapidly expands, data aggregation, and the integration and alignment of non-standardized data across databases, is a big challenge in taxonomy and biodiversity informatics.