Wikipedia talk:WikiProject Tabular Data

Wikidata?
Has something like a "Wikidata" site under the Wikimedia umbrella been considered? It could be a central, cross-language repository of data like that which this WikiProject intends to deal with. Then all that is needed is some way of transcluding the data; from there, writing templates in native names to format the data should be relatively simple. --Cyber cobra (talk) 21:35, 28 September 2009 (UTC)
 * Yes, mw:wikidata has been considered, but it never became much more than a collection of ideas. --Septembermorgen (talk) 20:45, 29 September 2009 (UTC)
 * Definitely a more long-term idea, but this is a great short-term idea. Now, we just need to get it off the ground. I suppose we should start with something simple, like populations. I'm still a bit confused about how this works. Let's take populations by country. I am thinking something like   which would return 1,333,240,000. That's a simple switch template. But how do we handle sourcing, and perhaps the date it was last updated? I wish I understood German so I could see it in action. --NMajdan • talk  21:23, 29 September 2009 (UTC)
 * Why not have the citation always be included along with the value? That just leaves the date, and we have one parameter to the template that switches between the data and its last-updated date. --Cyber cobra (talk) 23:53, 29 September 2009 (UTC)


 * See http://meta.wikimedia.org/wiki/Wikidata. --Wavelength (talk) 19:09, 5 April 2012 (UTC)
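
The switch idea discussed above (a value that always carries its citation, plus one parameter that switches to the last-updated date) can be sketched roughly as follows. This is a Python analogy rather than template syntax, and only a sketch: the `Record` layout, the `population` function, the `field` parameter, and the key `example-country` are all assumptions made for illustration. Only the figure 1,333,240,000 comes from the thread; the citation and date are placeholders because the thread does not give them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    value: int      # raw number, stored without thousands separators
    citation: str   # source that always travels with the value
    updated: str    # when the value was last updated


# Hypothetical data table; only the figure itself appears in the discussion.
POPULATION = {
    "example-country": Record(
        1333240000, "(citation placeholder)", "(date placeholder)"
    ),
}


def population(key: str, field: str = "value") -> str:
    """Return the value with its citation, or only the last-updated date."""
    record = POPULATION[key]
    if field == "date":
        return record.updated
    if field == "value":
        # The citation is always emitted together with the value.
        return f"{record.value:,} [{record.citation}]"
    raise ValueError(f"unknown field: {field}")


print(population("example-country"))
print(population("example-country", "date"))
```

Keeping the citation attached to the value means an editor cannot use the number without its source, which is the point made in the reply above.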

Naming scheme
To avoid running several discussions in one thread, allow me to break this out. Should we decide on a naming scheme? What categories of "metadata" can we think of? We probably need to determine this before we start mass-creating metadata templates. --NMajdan • talk 00:01, 30 September 2009 (UTC)
 * Regarding categories of metadata, most things in Infobox Country and most data used in WP:Persondata. --Cyber cobra (talk) 00:40, 30 September 2009 (UTC)
 * There is more than that. Infobox U.S. state, for one. --NMajdan • talk 00:47, 30 September 2009 (UTC)
 * Well, yes, surely much more, but this would probably have to be done in phases anyway. --Cyber cobra (talk) 18:43, 1 October 2009 (UTC)
 * What is a reasonable number of switches in a template? That is, should we create one template for all of a country's cities, like  or should it be broken down by state/province/etc., like  ? --NMajdan • talk  18:59, 1 October 2009 (UTC)
 * We could do something like <tt> </tt>, which behind the scenes switches on the first parameter and passes the rest through to <tt>  </tt>, which in turn delegates to <tt>  </tt>. This would provide a unified public-facing interface while keeping each individual template fairly manageable. Either way, at least province-level templates would probably be necessary for manageability (except perhaps for exceptionally small countries). Regarding naming, ISO has standards for country name abbreviations which would probably be good to follow (*goes to look them up*) -- ISO 3166-1 (probably ISO 3166-1 alpha-3 rather than alpha-2 for less ambiguity?); and ISO 3166-2 for provinces. --Cyber cobra  (talk) 22:45, 1 October 2009 (UTC)
 * Can this be done with multiple subpages on one template similar to WPBannerMeta? I threw together an Oklahoma template in my userspace as a quick test. I got the Census data from here.
 * Using <tt> 0 </tt> returns 0 and <tt> 0 </tt> returns 0 --NMajdan • talk 01:13, 2 October 2009 (UTC)
 * Now I need to get it to work by passing parameters through to Meta-Population. --NMajdan • talk 12:07, 2 October 2009 (UTC)
 * OK. Giving <tt> 0 </tt> now returns 0. Definite progress. Thoughts? --NMajdan • talk 12:25, 2 October 2009 (UTC)
 * That looks quite awesome, though is there any way to avoid the <tt>formatnum</tt> for end-user-editors? --Cyber cobra (talk) 22:26, 2 October 2009 (UTC)
 * Including the comma in the data would, but it would probably be more difficult to go through and add those. I think leaving as-is is the simplest method. --NMajdan • talk 23:32, 2 October 2009 (UTC)
 * Hm, pity; oh well. Now we just need to figure out referencing. Perhaps passing "<tt>ref</tt>" as the initial "country" param sends it down another parallel hierarchy of templates containing just refs? We could do it in the same set of templates, but I don't know how hairy that'd get. Or we could make another top-level template with a <tt>ref</tt> suffix. --Cyber cobra (talk) 00:17, 3 October 2009 (UTC)
 * Thousands separators should not be included in the data, because that would make it impossible to calculate derived values such as population densities and harder to internationalize these templates. That is why dewiki uses formatting templates, which return the desired notation. Using de:Vorlage:EWZ for example  returns the population of Reykjavík in the notation 117.721 (Here's an example:  where this template is used). It isn't difficult to write a template which instead returns the notation 117,721. --Septembermorgen (talk) 13:38, 3 October 2009 (UTC)
 * Problems might occur when place names are used as data keys, as in the Oklahoma template, because the template won't work properly when two places have the same name. Wouldn't the Federal Information Processing Standard code be an appropriate key for populated places in the USA? It could look like this template: for the Belgian political subdivisions. It also includes parameters for the source (QUELLE) and date (STAND) of the dataset. --Septembermorgen (talk) 21:36, 3 October 2009 (UTC)
 * I just created Template:PopNum.  returns 0 . What do you think of renaming the project "WikiProject Data-templates" or "WikiProject Data"? --Septembermorgen (talk) 21:02, 5 October 2009 (UTC)
 * That template calls another template, Data Population BEL, which appears to only have data for Belgium (it's still in a different language, so I can't verify). I think we were hoping to combine all of these into one template for ease of maintenance. Also, what I like about having the names in the template is that I can pretty much take the Excel files straight from the Census, do a tiny bit of parsing, and create the data for the template. This greatly increases both the speed at which I can update the templates and their accuracy. But I can foresee issues when we step back from cities/towns and do counties. That may have to be a different sub-template, since there are many cases of cities and counties having the same name. These are all issues we need to work through, and it would be nice if we could come up with a consensus before we start mass-producing these templates. --NMajdan • talk 21:14, 5 October 2009 (UTC)
 * It might be best to go numerical for US cities (to account for collisions, and convenience of template creation); we could use the Geographic Names Information System feature IDs. For US counties, it seems we're going to have to stick with names as the FIPS county code spec has been withdrawn. I definitely think we're going to have to separate the levels of administrative divisions for manageability. Perhaps 4 end-user templates/divisions: <tt> </tt>, <tt>  </tt> (as the term differs from country to country, e.g. "state" vs. "province"), <tt>  </tt> (again, the term and number of intermediary levels widely varies between countries here), and <tt>  </tt> (i.e. Winterset, Iowa's GNIS feature ID#; other countries will use other IDs or methods). Thoughts? --Cyber cobra  (talk) 22:21, 5 October 2009 (UTC)
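
The delegation scheme proposed in this thread (one public entry point that switches on the country, hands off to a country-level table, which in turn hands off to a province-level table keyed by numeric place ID) might look roughly like the sketch below. It is written in Python as an analogy to nested switch templates, not as the project's actual templates. `USA`, `US-OK` and `US-IA` are real ISO 3166-1 alpha-3 and ISO 3166-2 codes, but every place ID and every population figure here is a made-up placeholder, and all table and function names are assumptions.

```python
# Province-level tables: the only layer that actually holds data.
# All place IDs and figures below are made-up placeholders.
US_OK = {"0000001": 1000, "0000002": 2500}
US_IA = {"0000003": 4000}

# Country-level table: delegates to province tables by ISO 3166-2 code.
USA = {"US-OK": US_OK, "US-IA": US_IA}

# Top-level table: switches on the ISO 3166-1 alpha-3 country code.
COUNTRIES = {"USA": USA}


def population(country: str, subdivision: str, place_id: str) -> int:
    """Single public entry point; each level only knows the level below it."""
    try:
        return COUNTRIES[country][subdivision][place_id]
    except KeyError as missing:
        raise LookupError(f"no population data for key {missing}") from None


print(population("USA", "US-OK", "0000001"))
```

Numeric place IDs sidestep the name-collision problem raised above, and each province table stays small enough to regenerate from a single source file. The function returns a raw integer, so formatting is left to the caller.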

Project Name
I think "metadata" is not the best way to describe the data the wikiproject is trying to organize; it seems weird to call the population of a country metadata rather than just data. I think "WikiProject Tabular Data" or "WikiProject Statistical Data" might be a better name (or even WikiProject Data?). Thoughts? --Cyber cobra (talk) 22:23, 2 October 2009 (UTC)
 * I actually agree. Metadata is data about data. Population data, which as it stands now is all this project encompasses, is not metadata. I'm not too picky about what we call it, but I do agree calling it Metadata is inaccurate. Would that also require us to change the name of the one template we have created so far? Probably need to decide that before we begin mass population of the data. --NMajdan • talk 23:34, 2 October 2009 (UTC)

Advantages from internationalization
Data don't differ between languages, so it should be easy to transfer them from one wiki to another; updating the data would then be more manageable and less time-consuming than updating the templates in each wiki separately. However, the data would have to be assigned to the same key in each wiki, and the templates would need the same layout. --Septembermorgen (talk) 14:11, 3 October 2009 (UTC)
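
As a rough illustration of the point above, the sketch below keeps one shared table of raw numbers under a common key and lets each wiki apply its own formatter. It is a Python analogy, not template code. The figure 117721 and the notations 117.721 and 117,721 are the ones quoted for Reykjavík in the naming discussion; the key `reykjavik`, the function names, and the area passed to `density` in any call are assumptions for illustration only.

```python
# Shared across wikis: same key, raw number only (no separators).
SHARED_POPULATION = {"reykjavik": 117721}  # key name is an assumption


def format_en(n: int) -> str:
    """English notation: comma as thousands separator."""
    return f"{n:,}"


def format_de(n: int) -> str:
    """German notation: period as thousands separator."""
    return f"{n:,}".replace(",", ".")


# Each wiki supplies only its own formatter; the data stay identical.
FORMATTERS = {"en": format_en, "de": format_de}


def pop_num(key: str, wiki: str) -> str:
    """Format the shared raw value in the notation of the given wiki."""
    return FORMATTERS[wiki](SHARED_POPULATION[key])


def density(key: str, area_km2: float) -> float:
    """Derived values work because the stored number is unformatted."""
    return SHARED_POPULATION[key] / area_km2


print(pop_num("reykjavik", "en"))
print(pop_num("reykjavik", "de"))
```

Because only the formatter differs per language, an update to the shared table would reach every wiki that uses the same key.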

Comment on the WikiProject X proposal
Hello there! As you may already know, most WikiProjects here on Wikipedia struggle to stay active after they've been founded. I believe there is a lot of potential for WikiProjects to facilitate collaboration across subject areas, so I have submitted a grant proposal with the Wikimedia Foundation for the "WikiProject X" project. WikiProject X will study what makes WikiProjects succeed in retaining editors and then design a prototype WikiProject system that will recruit contributors to WikiProjects and help them run effectively. Please review the proposal here and leave feedback. If you have any questions, you can ask on the proposal page or leave a message on my talk page. Thank you for your time! (Also, sorry about the posting mistake earlier. If someone already moved my message to the talk page, feel free to remove this posting.) Harej (talk) 22:48, 1 October 2014 (UTC)

WikiProject X is live!


Hello everyone!

You may have received a message from me earlier asking you to comment on my WikiProject X proposal. The good news is that WikiProject X is now live! In our first phase, we are focusing on research. At this time, we are looking for people to share their experiences with WikiProjects: good, bad, or neutral. We are also looking for WikiProjects that may be interested in trying out new tools and layouts that will make participating easier and projects easier to maintain. If you or your WikiProject are interested, check us out! Note that this is an opt-in program; no WikiProject will be required to change anything against its wishes. Please let me know if you have any questions. Thank you!

Note: To receive additional notifications about WikiProject X on this talk page, please add this page to WikiProject X/Newsletter. Otherwise, this will be the last notification sent about WikiProject X.

Harej (talk) 16:57, 14 January 2015 (UTC)