User:Bluerasberry/wikibase draft

From 26-28 September 2018 a group of ?? Wikimedia and database enthusiasts convened at the New York City Wikibase Summit to advance Wikibase as the wiki software platform for presenting data. Wikibase is to Wikidata as MediaWiki is to Wikipedia: it is the software behind a Wikimedia project, and it is also free and open software which anyone can use to publish their own content on their own server. Knowledge centers including universities, STEM institutes, and GLAM organizations can use Wikibase to present their data in the style of Wikidata but on their own server.

The mood at the gathering ranged from curious interest in a new Wikimedia tool to revolutionary excitement over an initiative which could become the newest Wikimedia sister project, the first since Wikivoyage in 2013. While Wikidata has used the Wikibase software since its inception, the new development is that external institutions are now installing Wikibase instances to sort their own datasets.

Background
d:Wikidata:Glossary


 * https://www.jlis.it/article/view/12458
 * https://wikimediafoundation.org/2018/09/06/rhizome-wikibase/

There are two parts to the database - the client and the repository. The client is an extension which connects with the repository.

The native Wikibase format is not RDF; it is JSON.

A classical RDF statement matches a property with a value. There is no direct way to attach qualifiers, such as a time range, to such a statement.

Standard RDF statements do not carry references or qualifiers.

Wikibase is unusual in converting JSON blobs to RDF. Most other RDF graphs are not updated at the speed of Wikidata.
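As a rough illustration of why qualifiers and references do not fit into a single property-value statement, the sketch below flattens one JSON-style statement into triples by giving the statement its own node. The field names, the `to_triples` helper, and the predicate names are assumptions made up for this example; they are not the actual Wikibase JSON schema or its real RDF mapping.

```python
# Simplified stand-in for a statement stored as JSON.
# Field names are illustrative only, not the real Wikibase schema.
statement = {
    "id": "stmt-1",
    "subject": "ExampleCity",
    "property": "population",
    "value": 8000000,
    "qualifiers": {"point_in_time": "2018"},
    "references": [{"stated_in": "ExampleCensus"}],
}


def to_triples(stmt):
    """Flatten one statement into (subject, predicate, object) triples."""
    triples = []
    # The plain triple: a property matched with a value, no context.
    triples.append((stmt["subject"], stmt["property"], stmt["value"]))
    # Give the statement its own node so that qualifiers and
    # references have something to attach to.
    node = stmt["id"]
    triples.append((stmt["subject"], "statement", node))
    triples.append((node, "value", stmt["value"]))
    for key, val in stmt["qualifiers"].items():
        triples.append((node, "qualifier:" + key, val))
    for i, ref in enumerate(stmt["references"]):
        ref_node = f"{node}-ref{i}"
        triples.append((node, "reference", ref_node))
        for key, val in ref.items():
            triples.append((ref_node, key, val))
    return triples


triples = to_triples(statement)
for triple in triples:
    print(triple)
```

The first triple is all that a plain property-value statement can say; the remaining triples exist only to carry the time qualifier and the reference.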

We let everyone vandalize our database. When vandalism happens in Wikipedia, humans might read it, but the text itself only sometimes gets republished. With Wikidata the likelihood of reuse is higher, so the impact of vandalism is higher.

...

From one perspective, it seems ambitious and challenging for the Wikimedia community to create and establish software in the global marketplace. The wiki community has little money, produces free and open software upon which any commercial company could build, and has to compete against the best minds that the largest tech companies can purchase. From another perspective, attendees at this summit claimed that the wiki community has advantages in this space which could make it successful. Some attendees claimed that while relational database software is everywhere, Wikidata operates in a graph database model where software is lacking and users seek a free solution. Also, Wikidata has been in operation since 2012, building years of use history, regular users, popular datasets, and the unique style of Wikimedia community feedback. It could happen, and summit attendees wished, that this early wiki-based development would establish Wikimedia community values in the practices of software development, dataset curation, and community participation in this newly developing software and technology sector.

Event activities
...

Data modeling
see notepad https://notepad.rhizome.org/LLaKyGDPTZKbB3r_wbnoAg?both

For any concept, name it, then list the properties which you would expect to be necessary to describe it.
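A minimal sketch of that exercise, using a subway station as the concept. The property list and the `missing_properties` helper are hypothetical and chosen only for illustration; a real model would come out of discussion among the people who hold the data.

```python
# Step 1: name the concept. Step 2: list the properties you would
# expect to need in order to describe it. Both are illustrative.
model = {
    "subway station": ["name", "operator", "line", "coordinates", "opening date"],
}


def missing_properties(concept, record, model):
    """Return the expected properties that a record does not yet have."""
    return [prop for prop in model[concept] if prop not in record]


record = {"name": "Example Station", "line": "Example Line"}
missing = missing_properties("subway station", record, model)
print(missing)
```

Two institutions could reasonably write different property lists for the same concept, and the check would still work for each of them.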

Andra said, "There are people who model subways in San Francisco differently than other people model subways in Paris, and that is okay." Andra added, "Making a successful property proposal is like winning a lottery. Sometimes it is up to chance. Making more proposals is like buying more lottery tickets."

Richard explained, "The Wikidata admins execute the creation of new Wikidata properties. The number of people who participate in the discussion for the property proposal matters, but right now, if the proposal is reasonable, and if a few people support it, then the admins tend to create the property."

Lozana Rossenova said, "People from various GLAM organizations need to meet each other to map the properties of their institution's own databases to Wikidata properties."

Dene of that university in Vancouver commented, "There is currently no ethics statement in Wikidata about knowledge sharing. The most important thing for most contributors is making data open, but we also face the challenge of having corporations ingest the data we make open and use it in dastardly ways."

Infrastructure and tools
Tom pointed to the day 2 notes https://notepad.rhizome.org/MPTVwHVZRSSIGAvZ6cKopg?both

The Wikibase Registry is the system by which anyone can note that a Wikibase instance exists. Currently any human can register any Wikibase instance. The system is not opt-in, and anyone who demonstrates the existence of a Wikibase instance can register one. The nature of the registry itself favors listing public Wikibase instances, but the participants in the registry are developing norms about when to register a closed Wikibase instance.

Yurik shared the OpenStreetMap Wikibase and added it to the Wikibase Registry.

Tom talked about constraints in Wikidata. For example, if an item for a person indicates that they are a child of a parent, then the parent item should indicate its property of being parent of a child.
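The parent and child example can be sketched as a check over a small in-memory set of items. The item names, the plain-text property labels, and the `inverse_violations` helper are assumptions for illustration; this is not how Wikidata's own constraint system is implemented.

```python
# Toy items: each item maps a property label to a list of target items.
items = {
    "Alice": {"child": ["Bob"]},
    "Bob": {"parent": ["Alice"]},
    "Carol": {},
    "Dave": {"parent": ["Carol"]},
}


def inverse_violations(items, prop, inverse):
    """Find statements whose target lacks the matching inverse statement."""
    violations = []
    for item_id, claims in items.items():
        for target in claims.get(prop, []):
            back_links = items.get(target, {}).get(inverse, [])
            if item_id not in back_links:
                violations.append((item_id, prop, target))
    return violations


violations = inverse_violations(items, "parent", "child")
print(violations)
```

Here Dave names Carol as a parent, but Carol's item does not list Dave as a child, so that statement is reported.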

This summit track encouraged participants to create Phabricator tickets. 

Tom explained that the current easy way to get information into Wikidata is through QuickStatements. The QuickStatements extension is not currently a native part of Wikibase, and anyone who wants that functionality has to install it. Some people questioned how well QuickStatements fits Wikibase, saying that while QuickStatements works, it would be nice to have a total redesign of the infrastructure of the tool and process. Also it would be nice to have unlimited labor and resources to redesign and reimplement everything in Wikidata, Wikibase, and Wiki.
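For readers who have not used it, QuickStatements accepts batches of simple text commands, where in its basic form each line gives an item, a property, and a value separated by tabs, with string values wrapped in double quotes. The sketch below builds such a batch from Python tuples. The IDs `Q1`, `Q2`, `P1`, and `P2` are placeholders for a hypothetical local Wikibase instance, and the helper names are invented for this example.

```python
def quote_string(text):
    """Wrap a plain string value in double quotes (no escaping attempted)."""
    return f'"{text}"'


def to_quickstatements(rows):
    """Join (item, property, value) rows into tab-separated command lines."""
    return "\n".join("\t".join(row) for row in rows)


# Placeholder IDs for a hypothetical Wikibase instance.
rows = [
    ("Q1", "P1", "Q2"),
    ("Q1", "P2", quote_string("example string")),
]

batch = to_quickstatements(rows)
print(batch)
```

The resulting text would then be pasted into the QuickStatements interface; anything beyond these basic statements, such as qualifiers or item creation, is left out of this sketch.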

User experience
Some use cases:
 * A researcher wants to access records through the context of existing records, e.g. specimens discovered by women scientists

Example users:
 * Internal and external GLAM collection users
 * Internal: curators and archivists
 * External: digital humanities researchers, academics, possibly wider public as well

One proposed change is to do more clustering of Wikidata properties. Currently the only clusters are identifiers and everything else.

Lozana said that the consensus of discussions was that Wikibase users would want easier ways to migrate collections of information from one Wikibase instance to another. She demonstrated a prototype tool, "Roundtripping", which would share information across Wikibase instances and assist with mapping a property in one Wikibase instance to properties in another. This could include exporting Wikidata information to institutions for them to remix with their own information, institutions importing their information into Wikidata, and various institutions exchanging data among themselves. Lozana described an "item compare" feature, in which a user could compare an item in one Wikibase instance with an item in another. The same item might have different properties or values in different Wikibase instances, or one institution might get inspiration for modeling its own items based on what another institution is doing.
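The idea of an item comparison across instances can be sketched as follows. This is not the prototype that was demonstrated; the property IDs (`L1`, `R1`, and so on), the property mapping, and the `compare_items` helper are all hypothetical, and real items would hold richer values than plain strings.

```python
def compare_items(local_item, remote_item, property_map):
    """Compare one item across two instances using a property mapping."""
    report = {"same": [], "different": [], "only_local": [], "unmapped": []}
    for prop, value in local_item.items():
        remote_prop = property_map.get(prop)
        if remote_prop is None:
            # No known equivalent property in the other instance.
            report["unmapped"].append(prop)
        elif remote_prop not in remote_item:
            # Mapped, but the other instance has no value for it.
            report["only_local"].append(prop)
        elif remote_item[remote_prop] == value:
            report["same"].append(prop)
        else:
            report["different"].append(prop)
    return report


# Hypothetical item in an institution's own Wikibase instance.
local_item = {"L1": "Example Artwork", "L2": "1999", "L3": "net art", "L4": "internal-42"}
# Hypothetical matching item in another instance.
remote_item = {"R1": "Example Artwork", "R2": "2000"}
# Mapping from local property IDs to the other instance's property IDs.
property_map = {"L1": "R1", "L2": "R2", "L3": "R3"}

report = compare_items(local_item, remote_item, property_map)
print(report)
```

The "unmapped" list is where the property mapping work described above would begin, and the "only_local" list shows what an institution could contribute to the other instance.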

Community
Wikibase is developing its own community.

https://meta.wikimedia.org/wiki/Wikibase_Community_User_Group

 * No posts to the mailing list since July
 * No developed presence on Meta
 * The user group has low participation
 * Should the Wikibase group meet at WikidataCon, or should Wikibase have its own event?

Jens asked, "Is Wikibase one of the Wikimedia sister projects? We are not currently listed on the wheel of logos."

Credits
Thanks to speakers, and thanks also to the other participants listed below.


 * Wikimedia New York City
 * Megan Wacha, City University of New York
 * Richard Knipel, The Metropolitan Museum of Art
 * METRO
 * Karen Hwang
 * Wiki Project Med
 * Lane Rasberry (user:bluerasberry), University of Virginia
 * Rhizome
 * Dragan Espenschied
 * Zachary Kaplan
 * Wikimedia Foundation
 * Sandra Fauconnier
 * Amanda Bittaker
 * Ben Vershbow
 * Alex Stinson
 * Wikimedia Deutschland
 * Andra Waagmeester
 * Thomas Arrow
 * Sandra Müllrik - ask about interviews
 * Lozana Rossenova
 * Jens Ohlig