User:Daniel Mietchen/Talks/JATS-Con 2015


About
This page belongs to a paper presented on April 22, 2015 (from 9.00 to 9.45am EDT) as part of JATS-Con 2015 in the Lister Hill Auditorium at the National Library of Medicine in Bethesda, Maryland.

Title
Adapting JATS to support data citation

Authors
Daniel Mietchen, Johanna McEntyre, Jeff Beck, Chris Maloney; Force11 Data Citation Implementation Group

Abstract
Data referred to in articles is usually not cited in a consistent or structured fashion. To address this, Force11 has developed the Joint Declaration of Data Citation Principles. JATS 1.1d1 has provisions for citing articles and other sources, but does not offer straightforward ways of expressing some of the concepts needed for data citation. To facilitate the citation of data in JATS-tagged documents in a way that complies with the Joint Declaration of Data Citation Principles, the Force11 Data Citation Implementation Group held a meeting in June 2014, at which several new elements, attributes and attribute values were proposed for addition to JATS. These have since been submitted to the JATS Standing Committee, which largely accepted them, so they are now included in the draft standard JATS 1.1d2. This talk will provide background on the decision criteria behind the elements that were proposed, and on how they were selected for JATS 1.1d2. It will also provide suggested examples of how to use the new tags.

The full paper is available via http://www.ncbi.nlm.nih.gov/books/NBK280240/.

Formats

 * wiki
 * HTML: desktop · mobile
 * PDF
 * XML
 * Wikiwand

FAIR data Guiding Principles

 * Data Objects (Identifiable Data Item with Data elements + Metadata + an Identifier) should be
   * Findable
   * Accessible
   * Interoperable
   * Reusable

Data Citation Principles

 * Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014 [https://www.force11.org/datacitation].
 * The principles include
   * Evidence: In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.
   * Unique Identification: A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.
   * Access: Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
   * Interoperability and Flexibility: Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.

NIH Public Access Policy
NIH will explore ways to advance data as a legitimate form of scholarship through data citation and other means.

Getting new elements added to JATS itself

 * NISO Access and License Indicators (ALI), available in JATS 1.1d3

A superset extension of JATS

 * TaxPub
 * Catapano T. TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings 2010 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2010. Available from: http://www.ncbi.nlm.nih.gov/books/NBK47081/
 * Penev L, Catapano T, Agosti D, et al. Implementation of TaxPub, an NLM DTD extension for domain-specific markup in taxonomy, from the experience of a biodiversity publisher. In: Journal Article Tag Suite Conference (JATS-Con) Proceedings 2012 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2012. Available from: http://www.ncbi.nlm.nih.gov/books/NBK100351/

Process

 * Survey of
   * existing citation infrastructure in JATS 1.0
   * data citation practices
 * Remote discussions via the Force11 Data Citation Implementation Working Group
 * One-day workshop in London in June 2014
 * Decision to extend JATS itself rather than create a superset extension
 * Agreement reached on set of suggestions for new elements, attributes and attribute values
 * Submission of suggestions to JATS Standing Committee
 * Response from JATS Standing Committee
 * Incorporation into JATS 1.1d2
 * Recommendation by JATS Standing Committee to NISO: adopt JATS 1.1d3 as JATS 1.1


<version>

 * Similar to the existing JATS <edition> element, and to the @version attribute of the <tex-math> element.

<data-title>

 * Analogous to the <article-title> in a normal citation.
 * <source> could also be given, which would identify the data repository

The following example (which was added to the tag library) shows how 1:140020 doi:10.1038/sdata.2014.20 (2014).

@assigning-authority

 * For elements <ext-link> and <pub-id>
 * @pub-id-type was previously used to specify the authority; now it should only be used to specify the type of identifier
 * For example, a DOI might be described with

Linking attributes for <pub-id>

 * Many identifiers are associated with URLs, so they can be rendered as hyperlinks
 * Indeed, in the linked data world, many identifiers are HTTP URIs.
 * Therefore, the "might-link attributes" were added.
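
As a rough illustration of why link attributes on <pub-id> are useful, the following Python sketch pairs each identifier in a citation with its link target. The sample fragment, the function name `pub_id_links` and the choice of the standard-library ElementTree parser are assumptions made for illustration; they are not part of JATS or of the paper.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical citation fragment. The xlink namespace is declared on the
# root element here so that the fragment parses on its own.
SAMPLE = (
    '<mixed-citation xmlns:xlink="http://www.w3.org/1999/xlink" '
    'publication-type="data">'
    '<pub-id pub-id-type="doi" '
    'xlink:href="http://dx.doi.org/10.2210/pdb102l/pdb">'
    '10.2210/pdb102l/pdb</pub-id>'
    '</mixed-citation>'
)


def pub_id_links(citation_xml):
    """Return (identifier text, link target or None) for each <pub-id>."""
    root = ET.fromstring(citation_xml)
    links = []
    for pub_id in root.iter("pub-id"):
        # Namespaced attributes are looked up as "{namespace-uri}local-name".
        href = pub_id.get("{%s}href" % XLINK_NS)
        links.append(((pub_id.text or "").strip(), href))
    return links


print(pub_id_links(SAMPLE))
```

A renderer could use the second item of each pair as the hyperlink target and fall back to plain text when it is None.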

@publication-type

 * New value, "data", was added.
 * For "dataset, database, spreadsheet, etc."
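
One practical consequence of the new value is that data citations become easy to pick out of a reference list. The sketch below is a minimal illustration under stated assumptions: the reference list, its ids and the function name `data_citation_ids` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference list; ids and citation texts are placeholders.
REF_LIST = """
<ref-list>
  <ref id="r1"><mixed-citation publication-type="journal">A journal article.</mixed-citation></ref>
  <ref id="r2"><mixed-citation publication-type="data">A dataset.</mixed-citation></ref>
  <ref id="r3"><element-citation publication-type="data"><data-title>Another dataset</data-title></element-citation></ref>
</ref-list>
"""

CITATION_TAGS = ("mixed-citation", "element-citation")


def data_citation_ids(ref_list_xml):
    """Return the ids of all <ref> elements that hold a data citation."""
    root = ET.fromstring(ref_list_xml)
    ids = []
    for ref in root.iter("ref"):
        for citation in ref:
            if (citation.tag in CITATION_TAGS
                    and citation.get("publication-type") == "data"):
                ids.append(ref.get("id"))
                break  # one data citation is enough to count the <ref>
    return ids


print(data_citation_ids(REF_LIST))
```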

@person-group-type

 * New value, "curator", was added.
 * The Standing Committee has indicated that it will revisit this issue in light of CRediT, the Contributor Role Taxonomy, which has just been published

Example of the use of the "curator" value:

  Frankis Michael, curator. ", available from http://eol.org/pages/1177542. Accessed 30 Mar 2015.

@pub-id-type

 * This attribute is used on <pub-id>
 * Added three new values:
   * accession - a unique identifier in many bioinformatics databases, for example, protein or DNA sequences
   * ark - Archival Resource Key
   * handle - a Handle identifier

The following example shows how the "accession" value might be used. Note that it is accompanied by an @assigning-authority, to make clear the provenance of the identifier.

<mixed-citation publication-type='data'><name><surname>Heinz</surname> <given-names>D.W.</given-names></name>, <name><surname>Baase</surname> <given-names>W.A.</given-names></name>, <etal>et al.</etal>, accession <pub-id pub-id-type='accession' assigning-authority='pdb' xlink:href='http://www.rcsb.org/pdb/explore/explore.do?structureId=102l'>102l</pub-id>. <pub-id pub-id-type='doi' xlink:href='http://dx.doi.org/10.2210/pdb102l/pdb'>10.2210/pdb102l/pdb</pub-id></mixed-citation>
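
An accession number is usually only unique within the database that issued it, which is why the assigning authority matters. The following Python sketch reads the identifiers from a trimmed copy of the citation above and flags accessions that lack an @assigning-authority. The function names and the check itself are illustrative assumptions, not a JATS requirement.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the citation above, with the xlink namespace declared
# so that the fragment parses standalone.
CITATION = (
    '<mixed-citation xmlns:xlink="http://www.w3.org/1999/xlink" '
    'publication-type="data">accession '
    '<pub-id pub-id-type="accession" assigning-authority="pdb" '
    'xlink:href="http://www.rcsb.org/pdb/explore/explore.do?structureId=102l">'
    '102l</pub-id>. '
    '<pub-id pub-id-type="doi" '
    'xlink:href="http://dx.doi.org/10.2210/pdb102l/pdb">'
    '10.2210/pdb102l/pdb</pub-id>'
    '</mixed-citation>'
)


def describe_pub_ids(citation_xml):
    """List the type, assigning authority and value of every <pub-id>."""
    root = ET.fromstring(citation_xml)
    return [
        {
            "type": pub_id.get("pub-id-type"),
            "authority": pub_id.get("assigning-authority"),
            "value": (pub_id.text or "").strip(),
        }
        for pub_id in root.iter("pub-id")
    ]


def accessions_without_authority(citation_xml):
    """Return accession values that carry no @assigning-authority."""
    return [
        entry["value"]
        for entry in describe_pub_ids(citation_xml)
        if entry["type"] == "accession" and not entry["authority"]
    ]


print(describe_pub_ids(CITATION))
print(accessions_without_authority(CITATION))
```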

Examples
For further examples, see our full paper.

Re-Quiz


Untagged citation:

"Müller, C et al. (2005): Audio record of a 'singing iceberg' from the Weddell Sea, Antarctica. doi:10.1594/PANGAEA.339110, Supplement to: Müller, Christian; Schlindwein, Vera; Eckstaller, Alfons; Miller, Heinz (2005): Singing Icebergs. Science, 310, 12, doi:10.1126/science.1117145"

Possible tagging solution:

<mixed-citation publication-type="data"><name><surname>Müller</surname> <given-names>C</given-names></name>, <etal>et al.</etal> (<year iso-8601-date="2005">2005</year>): <data-title>Audio record of a 'singing iceberg' from the Weddell Sea, Antarctica.</data-title> <pub-id pub-id-type='doi' xlink:href='http://dx.doi.org/10.1594/PANGAEA.339110'>doi:10.1594/PANGAEA.339110</pub-id></mixed-citation>
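
For those who generate citations from structured metadata rather than tagging them by hand, a tagging solution along these lines could be assembled programmatically. This is only a sketch: the function `build_data_citation`, its parameters and the decision to always append "et al." are assumptions for illustration, and the DOI resolver prefix simply follows the examples in this talk.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
# Serialise the link attribute with the conventional "xlink" prefix.
ET.register_namespace("xlink", XLINK_NS)


def build_data_citation(surname, given_names, year, title, doi):
    """Build a <mixed-citation publication-type="data"> for one dataset."""
    citation = ET.Element("mixed-citation", {"publication-type": "data"})

    name = ET.SubElement(citation, "name")
    surname_el = ET.SubElement(name, "surname")
    surname_el.text = surname
    surname_el.tail = " "
    ET.SubElement(name, "given-names").text = given_names
    name.tail = ", "

    # Simplification: this sketch always adds "et al." after the first author.
    etal = ET.SubElement(citation, "etal")
    etal.text = "et al."
    etal.tail = " ("

    year_el = ET.SubElement(citation, "year", {"iso-8601-date": year})
    year_el.text = year
    year_el.tail = "): "

    title_el = ET.SubElement(citation, "data-title")
    title_el.text = title
    title_el.tail = " "

    pub_id = ET.SubElement(
        citation,
        "pub-id",
        {"pub-id-type": "doi",
         "{%s}href" % XLINK_NS: "http://dx.doi.org/" + doi},
    )
    pub_id.text = doi
    return citation


citation = build_data_citation(
    "Müller", "C", "2005",
    "Audio record of a 'singing iceberg' from the Weddell Sea, Antarctica.",
    "10.1594/PANGAEA.339110",
)
xml_text = ET.tostring(citation, encoding="unicode")
print(xml_text)
```

Building the element tree and serialising it, instead of concatenating strings, avoids the unclosed tags and unbalanced quotes that are easy to introduce when tagging by hand.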

Outlook

 * JATS4R recommendations on data citation
 * Outreach into the community
 * Hopefully wide uptake
 * Possible adjustments in response to feedback
 * Adding license information to references, be they classical citations or data citations

Contact

 * @EvoMRI
 * Wikipedia talk page
 * Wikipedia email