Wikipedia talk:Reliability of open government data

COI disclaimer
Just a warning that I have a COI in relation to one of the papers prominently discussed in this essay. However, the results of the paper are fully reproducible from the source data + code, independent of me, and the peer reviews of the paper are public (and the official peer-reviewed version of the paper is CC-BY). Moreover, the essay is not a Wikipedia article. Feel free to edit. Boud (talk) 23:32, 16 September 2021 (UTC)

Government sources as COIs?

 * See also fr:Discussion_aide:Identifier_des_sources_fiables

I got into a similar discussion on the French Wikipedia following Turkey's numbers being proven phony by BBC's fact checkers ('700 border alerts' were presented as '700 border attacks'). The source of the statement was authoritative, but deeply involved and with a COI in the matter. The French fr:WP:SPR states:
 * "Wikipédia privilégie les sources indépendantes du sujet, c'est-à-dire qui sont issues d'auteurs dont la ligne éditoriale n'est pas directement contrôlée ou en interférence avec le sujet, et qui n'ont pas de conflit d'intérêts avec celui-ci".
 * "Wikipedia favors sources that are independent of the subject, that is to say, sources from authors whose editorial line is not directly controlled by, or in interference with, the subject, and who have no conflict of interest with it."

I don't know if a similar statement is in Citing_sources. Yug (talk)  🐲 15:35, 22 January 2022 (UTC)


 * (I took the liberty of adding a section title, please fix it if needed.) I think the issue of a government being in COI with information its agencies publish about elections is generally obvious, and the more authoritarian the government is (and the less independent the electoral commission is), the more obvious the COI is. With COVID pandemic statistics or other health ministry/agency statistics, the COI is less obvious. Depending on the particular country and its administrative and political history, state agencies can be anywhere from highly independent from, to highly controlled by, their governments. Democratic politics is all about checks and balances and independent controls. To get to practical proposals, my feeling is that COI assessments per government agency would tend to anti-correlate with reliability. The WP:RSP discussions are mostly qualitative (it seems to me), but would effectively be what is needed, though numerical open data would tend to allow quantitative probability assessments of credibility per agency, as stated currently in the essay. The difficulty in terms of Wikipedia editing is that people are strongly motivated to edit by media coverage - and government agencies are close to being the unique source for major statistics. This is one of the propaganda model filters. So COI gets forgotten/overridden because editors prefer data, even COI-ed data, to an absence of data. As for Wikipedia policy pages, WP:RS is more relevant here than WP:Citing sources (which is mostly about the technical aspects). Elsewhere, there are some pages that recommend first trying to develop the practice of consensus, and not trying to write new guidelines until the practice itself has developed. This sounds reasonable to me. Boud (talk) 04:14, 4 February 2022 (UTC)
 * [Thanks for adding a title, it's clearer this way]
 * The structural COI is obvious, yet we still have [nationalist] editors who simply deny any COI and happily fall back on "this is a governmental source, therefore authoritative and reliable". When another editor points out a possible COI, it can easily spiral into an edit war, with each side branding the other as biased and no tools to assess the validity of each position. The structural COI must be made explicit in this guideline as a well-known issue to account for, so that both sides can agree that the data are not absolute and should be cited with caution. (That's the purpose of your whole guideline.) I find the French statement cited above complete and elegant, so I submitted it for your consideration.
 * I fully agree with your general assessment: governmental data can be used to fill the void, in a similar way to how politicians have cited selected polls from partisan sources to argue that they work in the best interest of the whole population. And we did see statistically surprising numbers displayed as victories by various countries. Yug (talk)  🐲 10:57, 4 February 2022 (UTC)
 * The fr.Wikipedia statement is fine, but en.Wikipedia already has plenty of strong statements that, if applied literally, could be used to remove all COVID-19 data from en.Wikipedia (I would oppose that, and I think an overwhelming majority would oppose that). For example, the primary vs secondary vs tertiary source distinction. Government agencies publishing COVID-19 infection counts or death counts seem to me to be primary sources, but WP:PRIMARY says 1. ... that have been reputably published may be used in Wikipedia, but only with care, because it is easy to misuse them. Have all the national health agencies (in some cases, sub-national) been checked to be reliable publishers? The words reputably published link back to WP:RS. The closest thing to the topic is WP:RS/MC, but a governmental health agency is not a peer-reviewed systematic review article. 3. A primary source may be used on Wikipedia only to make straightforward, descriptive statements of facts that can be verified by any educated person with access to the primary source but without further, specialized knowledge. This could hypothetically be valid if government health agencies published their full PCR test individual reports, full of private details that are certainly illegal to publish in the EU, with full lists of the medical labs which published all their reports with successive numbers so that someone checking would know if any were missing; but even in this unrealistic hypothetical case, an educated person would have to do a huge amount of work to add up the numbers. So 1 and 3 could quite likely, if applied bureaucratically, be used to remove all the COVID-19 data from en.Wikipedia on the grounds that the primary source is not acceptable for this purpose. 
The JHU CSSE collection of the data might count as WP:SECONDARY, since it is an extra step removed from the data collection, but it includes the dubious official data along with the more credible official data, so in practice it is no more reliable than the Wikipedia data collected directly from the government agencies. So my point here is that guidelines or a policy for how to handle open govt data will have to evolve from discussion and consensus in particular cases. But I agree that a useful starting point is something like official is not necessarily reliable. Your 'in a nutshell' summary is useful. :) Boud (talk) 03:39, 7 February 2022 (UTC)

Integrating/responding to Intralexical's comments
See Reliability of open government data/Intralexical's Response. Thanks for your comments :). Here are point-by-point responses.


 * ... where controversy or shortcomings in accuracy may exist ... coverage about its reliability (such as in a dedicated, prominent section) should also be included where it is most relevant.
 * I agree. I think this is already essentially standard practice, and is probably already mentioned somewhere in a guideline. But the point of the essay is that that is unlikely to be enough.


 * That means that a noticeboard for "official sources" would probably serve to ascertain reliability, which may be redundant with existing mechanisms for assessing the reliability of sources.
 * In a formal sense I agree that these would be redundant, but (1) government agencies are quite different sources from news sources, which tend to be those most commonly discussed at WP:RS/N; and (2) as per the title of the essay, "data", especially numerical data, are a different type of information from qualitative (word-based) descriptions of events. (1) The nature of governments (scale, hierarchy, politics, civil service culture and inertia), where we are including governments from across the world, is typically different from that of news organisations; (2) numerical data lead to methods of checking that can in some ways be more powerful and/or more neutral than those for qualitative data. There's also the issue of not overburdening WP:RS/N, and of making it clear that the new noticeboard would have a qualitatively quite different task.


 * Depending on a likely insular group of people with rather specialized skills risks undermining that.
 * The sources currently cited in the essay for judging the credibility of the data all aim to be fully transparent and reproducible: Any results should be documented by making all data and code available in such a way that the computations can be executed again, yielding identical results, by any independent researcher with basic scientific computing skills. So, in principle, these sorts of analyses are fully open and checkable. In practice, the number of people who actually do the checking of the supposedly reproducible papers is, at least initially, going to be small. I agree that the pool of people either publishing different research papers or checking reproducibility is not going to grow fast: this is a real risk. I added open access and reproducibility as proposed requirements in the essay, so that the credibility assessments should at least be easy to do for people with the basic scientific computational abilities + the time + the motivation.


 * Are they to be synthesized using the data from multiple peer-reviewed sources, or do they just report assessments made by the peer-reviewed sources?
 * Excellent point. I've made an edit describing the problem and a possible solution. In a broad sense, this is similar to what we do in Wikipedia and at WP:RS/N and in life in general - we have informal trust metrics, except that this is trust in the reliability of analysis/information, not trust for actions someone might take. This really seems to me like a Bayesian issue. There's no absolute certainty or truth, but there are probability estimates that are built up over time. (E.g. new editors and IPs have a lower de facto Wikipedia trust level for articles that have a high rate of vandalism or sensitivity, such as for BLP.) Would this be a WP:OR/WP:SYNTH risk? In some sense, we already have that for numbers in infoboxes - for armed conflict articles we generally give ranges with sources for the numbers killed or numbers of soldiers, or sometimes lower or upper limits to be conservative, depending on the consensus of the editors of the article. If we quantify our trust in researchers and/or research papers, then the OR/SYNTH risk would increase, and we'd have to decide if this is an acceptable exception - given that it's at the meta-level, not at the level that directly affects Wikipedia article content.
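To make the "probability estimates built up over time" idea concrete, here is a minimal illustrative sketch of my own (not something proposed in the essay itself): a Beta-Bernoulli update, one standard way of revising trust in a source each time an independent check confirms or contradicts it. The function name and the counts are hypothetical.

```python
# Illustrative sketch only: a Beta-Bernoulli update as one simple model of
# "probability estimates built up over time" for trust in a source.
# Start with a weak Beta(1, 1) prior (no opinion) and update it with each
# independent check that confirms or contradicts the source.
def update_trust(successes: int, failures: int,
                 prior_a: float = 1.0, prior_b: float = 1.0) -> float:
    """Posterior mean of a Beta(prior_a, prior_b) prior after observing
    `successes` confirmations and `failures` contradictions."""
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)

# Hypothetical source: confirmed 8 times, contradicted twice by checks.
trust = update_trust(8, 2)  # (1 + 8) / (2 + 8 + 2) = 0.75
```

With no observations the estimate stays at the prior mean of 0.5; each new check nudges it up or down, which matches the informal way trust in editors or agencies accumulates.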


 * Or it may just cover up a flawed process with the appearance of objectivity.
 * This is definitely a risk, I agree. But this also gets back to the focus on numerical data. Numbers tend to be perceived as more objective than qualitative information (obviously, in some ways they are, but only if they're obtained accurately). This both allows a different sort of reliability assessment, and tends to suggest that a different sort of assessment is needed.


 * The Autocratic Republic has confirmed that 1,234 people
 * I would propose "stated" rather than "confirmed". But I think we need something stronger, more explicit. I added a templates subsection in the Terminology: reliable vs official section. It would have to be neutral enough so that it can be used equally for "reliable" govt sources and "unreliable" govt sources.

Overall, I tend to agree that there could be a problem handling this issue due to (at least initially) having too few people with sufficient basic computational skills to double-check the peer-reviewed, open-access, reproducible research (and the approaches to reproducibility vary), with the risk that decision-making power on "knowledge" ends up de facto concentrated into too few hands. On the other hand, the amount of open government data is going to keep increasing, and the pressure to include it in Wikipedia will keep increasing. And things such as the domination of the infobox in the mobile view of Wikipedia articles, the automatic feeding through of Wikipedia infobox data to search engines, and the fact that we have many nice graphs (for the COVID-19 pandemic) mean that the prose discussion of reliability has no effect at all in giving any nuance to these forms of information distribution. Splitting off official data reliability as a different type of reliability assessment from the current WP:RS/N still seems preferable to me (provided that people are actually willing to do it).

Boud (talk) 16:51, 8 April 2022 (UTC)

I have the feeling that there are some things such as an official data template that could be started immediately by someone who likes writing templates, and tentatively experimented with to see how the community responds. See Reliability of open government data.

Also, I guess that someone could also take the government health agencies named in the peer-reviewed, open-access, reproducible (to varying levels) papers below to WP:RS/N and see at what level people wish to rate these, but the descriptions of the ratings would still not quite make sense: would we have to remove the Algerian and Belarusian COVID-19 data and graphs if the decision for both were "deprecated"? We would really need something like "generally-reliable-but-OK-as-official-data" and "deprecated-but-OK-as-official-data". The RS ratings symbols and definitions (e.g. listed at WP:RSP) already have a lot of nuances, and the cognitive burden of editors working through these is already significant. It's not as simple as yes/no. This is again why I don't think that official sources of open government data would quite fit into WP:RS/N; they would require making the ratings system even more nuanced/complicated. Boud (talk) 17:08, 8 April 2022 (UTC)

Wikidata methods, possible example
Copied from User talk:Jsamwrites for convenience: "This is an interesting problem. The Wikidata community is trying to tackle this problem to a certain extent for scholarly articles. Take, for example, Wikidata has properties that help to track certain disputed statements or even the retraction of scientific articles: is retracted by or statement disputed by. Some external applications use is retracted by to highlight that a scholarly article (already on Wikidata) has an associated retraction. Though, I am not sure whether these have been used for open data(sets), which are tracked by external data available at or open data portal. This requires some more study and concrete examples." Jsamwrites (talk) 15:30, 22 May 2022 (UTC)

I took the liberty of pasting your comments here, because other people might potentially be interested (e.g. once the Board election is over...). To continue from your comments, here's a sketch of what might be doable in the case of the SARS-CoV-2 infection count data from WP C19CCTF: I'm unlikely to get into Wikidata to try something like this, especially as I have a COI, but this could open the way for similar analyses of open medical data, of open electoral data, or of other open government data. This could help avoid the Manichean view of open government data, either "it's all nonsense" or "it's all true because it's official", that otherwise risks developing. Boud (talk) 23:25, 2 September 2022 (UTC)
 * https://zenodo.org/record/5262698 - frozen Zenodo record of input data, software, results, full reproducibility package
 * https://zenodo.org/record/5262698/files/WP_C19CCTF_SARSCoV2.dat - frozen version of the WP C19CCTF data, with individual article and revision IDs that uniquely define individual open data sets per country, e.g.
 * DZ (Algeria): wgArticleId=63438177 wgCurRevisionId=1020397525
 * the next question is what could be used as a Bayesian probability based on the abstract of the peer-reviewed version?
 * suggestion: create something like wikidata:Property:Pnnnn|suspicion of falsification with allowed values negligible, moderate, high corresponding to phi_i^28 column 6 of https://zenodo.org/record/5262698/files/phi_N_28days.dat and the abstract:
 * negligible: phi_i^28 >= 0.5
 * moderate: 0.1 <= phi_i^28 < 0.5
 * high: phi_i^28 < 0.1
 * anyone using the data would have to choose how to convert these to prior numeric probabilities (in principle, an independent research project + article would be needed to do it properly), but that would be added to Wikidata only later, if and when the research were done and published
 * the wikidata element of the research article would be used to justify associating these properties with the data sets
 * this would, it seems to me, require creating Wikidata elements for each of the 78 countries' data sets that were validly analysable in the 28-day analysis (grep -v ^# phi_N_28days.dat | awk ...)
 * these elements would have to be associated with the countries and with SARS-CoV-2 or the COVID-19 pandemic, in order for them to be potentially used for broader purposes of building up data reliability estimates
 * associating the countries' ministries of health (or equivalent) with these would make sense
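The threshold mapping proposed above can be sketched in code. This is only an illustration of the suggested rule, assuming the phi_i^28 values from column 6 of phi_N_28days.dat; the property name "suspicion of falsification" is the hypothetical one proposed above, and the phi values in the example are made up, not taken from the data file.

```python
# Sketch of the proposed mapping from a phi_i^28 statistic (column 6 of
# phi_N_28days.dat) to values of the hypothetical Wikidata property
# "suspicion of falsification", using the thresholds listed above.
def suspicion_of_falsification(phi_28: float) -> str:
    """Map a phi_i^28 credibility statistic to a qualitative label."""
    if phi_28 >= 0.5:
        return "negligible"
    elif phi_28 >= 0.1:
        return "moderate"
    else:
        return "high"

# Illustrative (made-up) phi values for three hypothetical data sets:
for phi in (0.72, 0.25, 0.03):
    print(phi, suspicion_of_falsification(phi))
```

Converting these labels back to prior numeric probabilities would, as noted above, be a separate research step; the mapping here only encodes the proposed three-way classification.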