Wikipedia:Meetup/Cost MOBILISE Wikidata Workshop/Parallel sessions

= Introduction =

This page can be used for brainstorming and planning the workshop.

= The Big Picture =

Deb, Anton, Eva, Pawel, Thierry, Elspeth

Potential topics:
 * Data types: What kind of biodiversity-related data is suitable for Wikidata?
 * Scope: For which of these data would Wikidata serve only as an additional tool for data publication and linking, and for which data types could local data management be abandoned (for example person data, descriptions of localities, names, etc.)?
 * Examples: specimen data, people data.
 * Stakeholders: Do we need any coordination of such activities? Standards? TDWG?
 * Architecture: Do we need to create a separate Wikibase instance to be able to create the properties we need? Should we avoid creating too many instances, since that would bring back distributed data management?
 * Capacity: Who would contribute this data? Can our collection / data mobilizers who may have this data get it out of their databases? How often is it lost in the black hole of a free-text field?
 * Stakeholders: How would we engage others (Wikidatans, ...) to contribute?
 * Authority: How would we engage the collections community in this standard-of-practice change?
 * Risk: Depending on the public is a risk. Do we risk problems with how the quality of the data is perceived? What else? What do we know about how we might mitigate these risks?
 * No action plan: What if we don’t move forward with Wikidata engagement? What known or perceived benefits would we miss out on? Which activities can we not do without structured data in Wikidata?
 * Bias in Wikidata (gender, geographic, taxonomic): How much of a problem is this? What can be done to address it?
 * Automated workflows from scholarly taxonomic publications to continuously update Wikipedia with new research results: How can Wikidata leverage the power of text and data mining?

= Taxonomy =

Andra, Dominik, Jerry, Quentin, Simone, Markus

A few ideas for discussion (added by Jerry, Feb 12th); please add extra thoughts.

- What is the current status of taxonomies in WD, in summary?

- Is it worth extracting some more detailed stats?

- What could WD add for taxonomies beyond say CoL?

- Who governs how existing taxonomies like CoL are represented in WD?
 * Who maintains e.g. CoL identifiers?

- What would improve taxonomy data in WD?
 * Synonyms? Taxon nomenclatural status? Taxon author?
 * Is there a minimum viable product for a taxon in WD?

- How could such improvements be implemented?
 * What could be automated?
 * Are there elements that could only be done by manual curation?
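
As a starting point for the question above about extracting more detailed stats, here is a minimal sketch of one possible statistic: the number of taxon items per taxon rank. It assumes the public Wikidata Query Service endpoint and the Wikidata identifiers P31 (instance of), Q16521 (taxon) and P105 (taxon rank). The function names are hypothetical, and a count over all taxon items may well time out on the public endpoint, so treat this as an illustration rather than a tested pipeline.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint (assumed target for the query).
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def rank_count_query(limit=20):
    """Build a SPARQL query counting taxon items per taxon rank (P105)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?rank (COUNT(?taxon) AS ?taxa) WHERE {\n"
        "  ?taxon wdt:P31 wd:Q16521 ;\n"   # instance of: taxon
        "         wdt:P105 ?rank .\n"      # taxon rank
        "}\n"
        "GROUP BY ?rank\n"
        "ORDER BY DESC(?taxa)\n"
        f"LIMIT {limit}"
    )


def request_url(query):
    """Return a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def counts_by_rank(result):
    """Turn a SPARQL JSON result (already parsed) into {rank QID: count}."""
    counts = {}
    for row in result["results"]["bindings"]:
        qid = row["rank"]["value"].rsplit("/", 1)[-1]  # strip entity URI prefix
        counts[qid] = int(row["taxa"]["value"])
    return counts
```

The sketch only builds the request and parses a response; actually sending it (with a descriptive User-Agent) is left to whichever HTTP client the group prefers.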

= People and Literature =

Anna, Mathias, Judith, Laurence, Nicky, Paul, Steve

https://docs.google.com/document/d/1xcX1LVm5txTmltRsdkiwtd_HeyL-dcrLpeyn4Ppnrng/edit

= Pipelines, standards, architecture, infrastructure =

David, Donat, Maarten, Steve, Guido


 * How are we going to integrate Wikidata into our workflows?
 * Are we just going to be reading Wikidata or updating too?
 * Do we need to be careful about using Wikidata, or about becoming too reliant on it?
 * Should we be considering multiple instances of Wikibase for some sorts of data?
 * Is Wikidata going to be performant enough to support the applications we have in mind?
 * Are there data types we would want in Wikidata that are not there yet (or are perhaps too messy)?
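
To make the "reading only" option above more concrete, here is a minimal read-only sketch. It assumes the entity JSON served by Wikidata under `Special:EntityData` and the standard Wikibase JSON layout (`claims`, `mainsnak`, `datavalue`). The helper names and the sample entity in the usage note are hypothetical; P225 is the Wikidata property for taxon name.

```python
import re


def entity_data_url(qid):
    """Return the URL of the JSON export for one Wikidata item."""
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not an item ID: {qid!r}")
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def string_values(entity, pid):
    """Collect plain-string statement values for property `pid`.

    `entity` is one entry of the "entities" mapping in the exported JSON.
    Statements with no value (snaktype "novalue" or "somevalue") and
    non-string values (items, dates, quantities) are skipped.
    """
    values = []
    for statement in entity.get("claims", {}).get(pid, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue
        value = snak["datavalue"]["value"]
        if isinstance(value, str):
            values.append(value)
    return values
```

For example, given an already downloaded and parsed entity whose P225 statement holds a string, `string_values(entity, "P225")` would return a list with that taxon name. Writing back to Wikidata would need authentication and edit tooling, which this sketch deliberately leaves out.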