User talk:Graeme Arnott/sandbox/conference/2

I've set out the various elements of the project that we need to consider. I've put these together from both our conference call on 27-02, and the meeting with the Scottish Government today, 28th February. It may be necessary to create sections on this page as the conversation develops.


 * Where the links are hosted (the landing page). From our conference call discussion, Wikilabs seemed the more likely place.
 * How the links get onto the landing page. I've placed Ewan's approaches in this portion.

We can envisage two architectures for associating data resources with a Wikipedia data portal page.

The first approach is already implemented in the current sandbox page. URLs embedded in the portal page simply point to external data resources. These resources can be either raw data representations (e.g., data formatted as CSV, JSON, XML etc.) or data processed into a more human-readable form, such as a table-based or map-based representation.

The second approach is more complex to envisage, but assumes that the external data portal returns only raw data representations, and the presentation of these is the responsibility of the Wikipedia portal. For example, a handler would take data in some restricted set of formats, and do its best to represent these in an appropriate human-readable form. Ewan klein (talk) 23:00, 27 February 2014 (UTC)
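As a rough illustration of the second approach, here is a minimal sketch of such a handler. It assumes the restricted set of formats is just CSV and JSON (an array of objects), and that the default presentation is a plain MediaWiki table. The function names `parse_rows`, `to_wikitable` and `render` are made up for this example and are not part of any existing tool.

```python
import csv
import io
import json


def parse_rows(raw: str, fmt: str) -> list[dict]:
    """Parse raw CSV or JSON text into a list of row dictionaries."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    if fmt == "json":
        data = json.loads(raw)
        if isinstance(data, list) and all(isinstance(r, dict) for r in data):
            return data
        raise ValueError("expected a JSON array of objects")
    raise ValueError(f"unsupported format: {fmt}")


def to_wikitable(rows: list[dict]) -> str:
    """Render rows as a MediaWiki table, using the first row's keys as headers.

    Assumption: cell values contain no pipe characters; a real handler
    would need to escape them.
    """
    if not rows:
        return "''No data available.''"
    headers = list(rows[0].keys())
    lines = ['{| class="wikitable"', "! " + " !! ".join(headers)]
    for row in rows:
        lines.append("|-")
        lines.append("| " + " || ".join(str(row.get(h, "")) for h in headers))
    lines.append("|}")
    return "\n".join(lines)


def render(raw: str, fmt: str) -> str:
    """Do a best-effort default rendering of raw data as wikitext."""
    return to_wikitable(parse_rows(raw, fmt))
```

Anything outside the supported formats raises an error rather than guessing, which seems the safer default while the set of formats is still under discussion. A map-based display would need a separate renderer and is not attempted here.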


 * How the links are organised on the (Wikilabs?) page
 * Whether the (Wikilabs?) page just hosts the links or whether that page, or the associated Wikipedia article, hosts visualizations of the data. It might be that visualizations on the Wikipedia page would help popularize open data. Graeme 21:04, 28 February 2014 (UTC)

Other Areas of Consideration
These aren't priorities, but I can imagine them being the sort of questions that arise in discussion with data providers. I thought that it might be worthwhile considering them in anticipation of the event at Victoria Quay in March.

The first is one of scale. Normally scalability refers to the expansion of a concept or way of working, but here I think we need to consider how the OpenData Portal would work with a data-providing administrative region that doesn't contain a city.

There is a tidy correlation between Glasgow City Council as a data provider, Glasgow as a geo-political space, and Glasgow as the subject of a Wikipedia article. Data released by the local authority (Glasgow City Council) correlates with the idea of Glasgow as a geo-political space worthy of a Wikipedia article, and therefore with the proposed OpenData page with its links to data about Glasgow. This correlation becomes less clear when either the data provider (a local authority such as Clackmannanshire) doesn't contain a city but contains a number of towns, or the city is contained within a larger regional administrative construct. It might also be worth noting at this point that in the Seven Cities Alliance, only four of the councils (Glasgow, Edinburgh, Dundee and Aberdeen) correlate with the geo-political space of the same name. In other words, it's easy to imagine Glasgow as a Future City because the city and the region are synonymous; there is no gap between the map and the territory.

The first issue for consideration is where the link to the OpenData portal would reside. This is straightforward in the case of Glasgow but not so when one considers Clackmannanshire. One solution might be to simply replicate the OpenData portal's link on each of the Wikipedia articles for Clackmannanshire's towns and villages as well as on the Clackmannanshire page.

Our discussions so far have considered the relevance or usefulness of the links on the Wp OpenData Portal: would the page contain links to data specific to Glasgow, or to something larger from which Glasgow-specific data had to be extracted? However, is it realistic to expect Clackmannanshire Council to present their data in such a way that it is relevant to each town and village within its boundary? I suspect not. Is it realistic to expect RCAHMS to arrange their released data in URLs specifically related to the towns and villages of Clackmannanshire? Again, I suspect not. The reason I suspect not is that Glasgow claims the name of city whilst issuing data as an administrative region. Graeme 10:18, 1 March 2014 (UTC) — Preceding unsigned comment added by Graeme Arnott (talk • contribs)

Feedback from Peter Murray-Rust
Notes based on conversation with Peter Murray-Rust, 28 March 2014.

Peter argued that the approach probably can't be generalised outside cases like cities, for reasons similar to those discussed by Graeme above. That is, we should only try to do it when it's fairly easy to demarcate some key datasets. For cities, the key datasets might be similar to, and possibly based on, the OKF City Open Data Census.

It would be a nice illustration of the approach to decide on a standard set of data for as many as possible of the Scottish Seven Cities and for each such city build illustrative pages along the lines of the existing Glasgow page.

--Ewan klein (talk) 20:21, 28 March 2014 (UTC)

Details
The general case is as follows: for a given dataset D, a link from this page should take us to a target page P which displays the results of sending an appropriate query to D. Ideally, there should be some way for the data holder to automatically generate P, given an appropriate page template and the topic (e.g., Glasgow) of the originating Wikipedia page. This can be regarded as an arbitrary view of the underlying data, limited only by the expressiveness of the query language. The way that this differs from the examples above is the following. RSPB has already provided a way of searching for specific birds, but we cannot expect every data holder to have pre-canned queries corresponding to the topic of every Wikipedia open data page. (For example, RSPB may have geocoded the distribution of birds, but probably doesn't have a way of displaying the results of a query such as: all birds whose summer distribution is within 10 miles of the centre of Glasgow.) So a scalable solution would need a more automated way of pulling data from the dataset and giving it a default display. It would be helpful if we could mock up one or more such pages as an illustration for the case of a dataset which is currently just distributed as raw data. — Preceding unsigned comment added by Centralamerica (talk • contribs) 12:04, 23 March 2014 (UTC)

Outdated Duplicate links

--Centralamerica (talk) 12:31, 23 March 2014 (UTC)
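
Returning to the example query in the Details section (all birds whose summer distribution is within 10 miles of the centre of Glasgow), the following is a minimal sketch of how such a filter could be run over raw geocoded records. It assumes each record is a dictionary with `lat`, `lon` and `season` keys; that layout is hypothetical and does not describe any actual RSPB data format. The coordinates for the centre of Glasgow are approximate, and distance is the great-circle distance computed with the haversine formula.

```python
import math

EARTH_RADIUS_MILES = 3958.8            # mean Earth radius
GLASGOW_CENTRE = (55.8642, -4.2518)    # approximate latitude, longitude


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def records_near(records, centre, radius_miles, season=None):
    """Return records within radius_miles of centre, optionally for one season.

    Assumption: each record is a dict with "lat" and "lon" keys (degrees)
    and, if season filtering is used, a "season" key.
    """
    lat0, lon0 = centre
    return [
        r for r in records
        if (season is None or r.get("season") == season)
        and haversine_miles(lat0, lon0, r["lat"], r["lon"]) <= radius_miles
    ]
```

A call such as `records_near(records, GLASGOW_CENTRE, 10, season="summer")` would give the rows for a default display on the target page P. Swapping in the coordinates of another topic is all that would be needed to reuse the same query for a different open data page.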