User:Spinster/A Wikimedians' guide to good collection websites

Wikimedia volunteers - or Wikimedians - are expert users of collection websites of GLAMs (Galleries, Libraries, Archives, Museums). We intensively use collection websites to find and verify information about art, history and culture. We often link to collection websites, in all relevant Wikimedia projects, most importantly in Wikipedia, but also in Wikimedia Commons, Wikidata, Wikisource and more.

Wikimedians tend to use information from collection websites on a larger scale, too, for instance when mass uploading media to Wikimedia Commons or when adding whole datasets to Wikidata.

Over the years, we have found that we really like it if collection websites have the following characteristics. These will usually help make a collection website more findable, discoverable and re-usable by humans and machines - and, as a pleasant side effect, probably more in line with your country's (upcoming) open data legislation, too.

Publish as much as you can.
Even better: publish everything. As soon as possible.

Collection databases are messy and many of your objects have not yet been approved by your curators. This is the case at every GLAM - it happens to the best, including MoMA. Research has clearly proven that end users - especially researchers - do want to see everything. Give us everything. Do you have records that have not been checked yet? Show them as soon as you can, and simply tell us that they still need to be looked at. Who knows, a Wikimedian might pass by, do some of that checking for you, and inform you about it!

Keep it simple.
Oh yes, Wikimedians also love and admire great visualisations, interactive features of websites, and beautiful design. We don't want culture to appear in a boring environment!

But we appreciate (and use!) it most when all information on a collection website
 * can be found by us via search engines (so it's not hidden in the deep web)
 * is available immediately - we may for instance fail to find your permalinks if they're two clicks behind the image we see first...
 * is written in open web formats and not obfuscated by JavaScript or unnecessary frivolities that make it harder for humans or bots to process your information. Yes, we are looking at you accusingly, infinite scrolling!

Metadata is good. Metadata is not dirty.
Images of artworks deserve to be admired, preferably full screen. But the information behind them is extremely valuable too. Don't hide your metadata in a second screen. We want to see your credit lines, attribution information, your inventory numbers and acquisition history. Immediately. It helps us write better articles and describe your collections better, from our side.

Give us indexes.
When visiting a collection website, we often don't know very well what you hold. What can we look for? Do you have a great photo collection of your town in the 1940s? Funky surrealist sculptures? Interesting Roman coins? Highlights may help us a bit, but maybe - as we are curious Wikipedians - we are looking for that one interesting, little-known item in your collection. There's nothing wrong with good old indexes - those boring, alphabetical lists of things, people, places and times that are relevant to your collection. They give us the overview we want! And a good designer can make them look nice.

Let us search and filter in crazy and boring ways.
Wikimedians are often looking for very specific things. Advanced search options, faceted search and the option to go berserk with filters make us happy. We often like to filter content on quite strange criteria (copyright of images? year of acquisition? Hell yeah!) and we are very happy if this feature is not dumbed down for us!

Lots of great text? Make it available under a free license.
All content on Wikimedia projects is available under free (Creative Commons) licenses. We do this as a community because we want everyone to be able to re-use the information created by us as freely as possible - yes, also for commercial purposes.

We notice that GLAMs often write and publish excellent texts about their collections and their area of expertise. And we often hear that GLAMs wouldn't mind us re-using (part of) those texts on Wikipedia. If your texts are copyrighted, though, we cannot re-use them: doing that would be a copyright violation, because we would transfer your copyrighted text to an environment that is entirely licensed under free Creative Commons licenses.

In order to make it possible for us to re-use your texts, we recommend that you release them under Creative Commons licenses yourself - more specifically licenses that are compatible with Wikimedia projects. That's CC0, public domain, CC-BY and CC-BY-SA.

We advise against the use of the non-commercial clause in Creative Commons licenses. A good explanation of the reasons can be found in this brochure.

PDF or HTML?
This one should be obvious. If you are able to publish something as HTML rather than as PDF, please, please do.

Give us permalinks.
All websites eventually become obsolete. And sometimes it's inevitable that - even though it's not good practice - your URLs will change.

In order to be able to deal with that, every collection website nowadays should have permalinks.

Please make them visible immediately, without us having to make an extra click to find or show them. They are not dirty. They are necessary and show that you care.

Give us unique identifiers.
Especially when we add external cultural data to Wikidata, it is very handy and useful if each piece of that data is identified with a unique number or code. In that way, we can easily point the information on Wikidata to the exact correct piece of data in the source website, for instance with a specific property. We are very happy if your website contains those unique identifiers and if they are visible in your web pages. It is not mandatory but extra handy if your unique identifiers are even part of your permalinks!
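To illustrate why identifiers inside permalinks are so handy, here is a minimal Python sketch. It assumes a hypothetical permalink pattern on `collection.example.org` in which `$1` marks the place of the identifier (the same placeholder convention that Wikidata formatter URLs use). The function names and the sample identifier are invented for this example and do not refer to any real collection.

```python
from typing import Optional
from urllib.parse import quote, unquote

# Hypothetical permalink pattern; "$1" is where the unique identifier goes.
FORMATTER_URL = "https://collection.example.org/object/$1"


def permalink_for(identifier: str, formatter: str = FORMATTER_URL) -> str:
    """Build a permalink by substituting the (URL-encoded) identifier."""
    return formatter.replace("$1", quote(identifier, safe=""))


def identifier_from(permalink: str, formatter: str = FORMATTER_URL) -> Optional[str]:
    """Recover the identifier from a permalink, or None if it does not match."""
    prefix, _, suffix = formatter.partition("$1")
    if not permalink.startswith(prefix) or not permalink.endswith(suffix):
        return None
    end = len(permalink) - len(suffix)
    return unquote(permalink[len(prefix):end])


link = permalink_for("INV-1940-0042")
print(link)
print(identifier_from(link))
```

With a stable pattern like this, a single identifier stored on Wikidata is enough to link to the exact record, and a link found in an article is enough to find the identifier again.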

Provide correct copyright information for all your images, on a per-image basis.
It is fine to include general copyright disclaimers on your collection website, but copyright status usually differs per image. Some images are copyrighted (for instance: images of recent artworks), some may be released under Creative Commons licenses (example: photographs of events or older three-dimensional objects made by your staff), some are public domain (faithful reproductions of two-dimensional artworks that are in the public domain themselves).

For end users, and for 'linkers' like us Wikimedians, it is most helpful, precise and correct if every image has separate copyright information.
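As a rough illustration of what per-image copyright information makes possible, the Python sketch below filters a list of image records down to those we could re-use on Wikimedia projects. The record layout, field names and sample data are assumptions made up for this example. The set of compatible licenses is the one named earlier in this guide, and the exact string matching is a simplification (real rights statements often carry version numbers).

```python
from dataclasses import dataclass

# Licenses this guide names as compatible with Wikimedia projects.
WIKIMEDIA_COMPATIBLE = {"CC0", "public domain", "CC-BY", "CC-BY-SA"}


@dataclass
class ImageRecord:
    """Hypothetical record: one image with its own rights statement."""
    object_id: str
    image_url: str
    license: str


def reusable_on_wikimedia(images):
    """Keep only the images whose own license is Wikimedia-compatible."""
    return [img for img in images if img.license in WIKIMEDIA_COMPATIBLE]


# Invented sample data for illustration only.
images = [
    ImageRecord("INV-0001", "https://collection.example.org/img/1.jpg", "public domain"),
    ImageRecord("INV-0002", "https://collection.example.org/img/2.jpg", "All rights reserved"),
    ImageRecord("INV-0003", "https://collection.example.org/img/3.jpg", "CC-BY-SA"),
    ImageRecord("INV-0004", "https://collection.example.org/img/4.jpg", "CC-BY-NC"),
]

for img in reusable_on_wikimedia(images):
    print(img.object_id, img.license)
```

A general site-wide disclaimer cannot support this kind of selection; a rights statement attached to each image can.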

Free licenses and public domain
Do you have many images (like photographs) made by your own staff? Make them available under Wikimedia-compatible free licenses.

https://commons.wikimedia.org/wiki/Commons:Choosing_a_license

Do you have many images of public domain two-dimensional artworks? Make these available as public domain as well.

Highest resolution
(To be written: arguments for publishing images at the highest resolution. Example: Rijksmuseum.)

Comment: With very high resolution you can go very close to a painting and see a lot of details.

Tell us who made the image itself and (if relevant) who made the thing that is depicted in the image.
Sometimes you don't know. That's fine too - just say so on your website. No shame in that. It is much better than giving no information about the authorship of an image at all.

To API or not to API?
For developers and programmers, it is very handy and useful if your website provides an API. (Any tips on how a good API should be designed?)

But for regular end users, Wikipedians and volunteers on Wikidata, such an API is not a must. Actually, we find it very helpful if the data from your website is made available in a very simple way: as comma- or tab-separated text files (CSV, TSV). MoMA, for instance, does this on GitHub - easy to download, makes us very happy. The result is that we now have a considerable portion of the MoMA collection on Wikidata, too.
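To show how little is needed to work with such a file, here is a minimal Python sketch using only the standard library. The column names and the two sample rows are invented for this example and are not taken from any real export. A downloaded file would be opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.

```python
import csv
import io

# Invented tab-separated sample; a real export would be read from a file.
SAMPLE_TSV = (
    "object_id\ttitle\tcreator\tyear\n"
    "INV-0001\tHarbour at dawn\tUnknown\t1921\n"
    "INV-0002\tStreet market\tA. Example\t1947\n"
)


def read_records(text, delimiter="\t"):
    """Parse delimited text into a list of dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


records = read_records(SAMPLE_TSV)

# Index the rows by their unique identifier for easy lookup.
by_id = {row["object_id"]: row for row in records}

print(len(records), "records")
print(by_id["INV-0002"]["title"])
```

No API key, no documentation to study: a volunteer with a spreadsheet program or a few lines of script can start matching records to Wikidata right away.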

CC0
CC0 for data