Structured digital abstract

A Structured Digital Abstract (SDA) is a method of describing relationships between biological entities in a structured, but human-readable, format. It is added below the abstract of scientific articles published in FEBS Letters and FEBS Journal. Current SDAs describe protein-protein interactions.

History
Many scientific manuscripts describe relationships between entities such as genes and proteins. However, this information cannot be used efficiently because it is difficult to retrieve automatically from unstructured text. In a six-month pilot project that started in January 2008, FEBS Letters began publishing manuscripts with “structured digital abstracts” (SDAs). The SDAs were added to the end of abstracts in a structured, but human-readable, format and digitally linked to interaction databases. In the pilot project, the journal concentrated on protein-protein interactions. After six months, the pilot was evaluated and judged a success, and all appropriate FEBS Letters manuscripts are now given an SDA. In 2009, FEBS Journal also started publishing manuscripts with SDAs. The SDA initiative continues to be funded by FEBS, a not-for-profit organisation. Recent BioCreative challenges have focused on extracting protein-protein interactions by automatic text mining, using FEBS Letters and FEBS Journal articles.

Format
An SDA comprises a series of sentences, each of which states a relationship between two biological entities and names the method used to study that relationship. To give a simplified example: protein A interacts with protein B, by method X. Each sentence in an SDA is followed by one or more identifiers pointing to the corresponding database entries, which contain all the details of the structured information. Although most of the sentences currently point to the MINT Molecular INTeraction Database, the proposed structure can easily be extended to contain identifiers from other databases that store protein interactions or different types of relationships between biological entities. Each entity is also linked to the appropriate explanatory database, e.g. UniProtKB for proteins and the European Bioinformatics Institute Ontology Lookup Service for other entities.
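
The sentence structure described above can be modelled as a small data structure. The following Python sketch is only an illustration and is not the format used by the journals or by MINT: the class names `Entity` and `SdaSentence`, their fields, the `render` method, and every identifier string are hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Entity:
    """A biological entity linked to an explanatory database (hypothetical model)."""
    name: str        # human-readable name, e.g. "protein A"
    database: str    # explanatory database, e.g. "UniProtKB"
    accession: str   # identifier in that database (placeholder values below)


@dataclass(frozen=True)
class SdaSentence:
    """One SDA sentence: two entities, a relationship, a method, and record identifiers."""
    subject: Entity
    relation: str                     # e.g. "interacts with"
    target: Entity
    method: str                       # method used to study the relationship
    record_ids: Tuple[str, ...] = ()  # identifiers of interaction database entries

    def render(self) -> str:
        """Return the human-readable sentence, followed by its identifiers if any."""
        sentence = (
            f"{self.subject.name} {self.relation} "
            f"{self.target.name} by {self.method}"
        )
        if self.record_ids:
            return f"{sentence} ({', '.join(self.record_ids)})"
        return sentence


# Placeholder accessions and record identifier: these are NOT real database entries.
protein_a = Entity("protein A", "UniProtKB", "PLACEHOLDER-A")
protein_b = Entity("protein B", "UniProtKB", "PLACEHOLDER-B")
example = SdaSentence(
    subject=protein_a,
    relation="interacts with",
    target=protein_b,
    method="method X",
    record_ids=("PLACEHOLDER-RECORD-1",),
)

print(example.render())
```

Keeping the entities, the method, and the record identifiers as separate fields mirrors the idea that an SDA is readable by people while remaining easy for software to process. In this sketch, pointing a sentence at another interaction database would only require adding further entries to `record_ids`.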