User:Bob K31416/N

The generation of numerical summaries of data using routine techniques such as summation or the calculation of averages, standard deviations and other processes that are standard spreadsheet functions is not "original research".

There are two things that must be kept in mind:
 * 1) Numerical summaries can only be made if they make sense in the context. In practice this means that summaries can only be made:
    * a) if units of measurement are identical (adding 5 miles to 6 kilometers and arriving at 11 makes no sense)
    * b) if (social) constructs are operationalized in the same way (e.g. the city of London is much larger than the city of Athens, in part because Athens is divided into many independent municipalities while London is one city – the UK and Greece operationalize cities differently, hence summaries of the numbers of inhabitants of cities across the UK and Greece make no sense)
    * c) if the type of data is appropriate to the chosen operation (e.g. it is possible to calculate the average and standard deviation of gender in a population, but the numbers "average gender = 0.51 female, SD = 0.50" make no sense – the average person is either male or female, not 51% female)
 * 2) Interpretation of the data using statistical methods is "original research". For example, stating that the average height of a group of 200 people was 180 cm and the standard deviation was 8 cm is not original research, but making the statement "therefore we can expect 136 people (68%) to have a height of between 172 cm and 188 cm (180 ± 8 cm)" is original research (unless it is being used as an example in an article on how to manipulate statistical data).
In case of doubt (summaries vs. statistical reinterpretation), discuss first.
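The arithmetic behind the examples above can be checked with a short sketch. It is only an illustration: the variable names and the 100-person coded sample are hypothetical, and the 68% figure holds only under the added assumption that heights are roughly normally distributed, which is the interpretive step the proposal treats as original research.

```python
import statistics

# Units: miles and kilometers must be converted before they are summed.
KM_PER_MILE = 1.609344            # exact definition of the international mile
total_km = 5 * KM_PER_MILE + 6    # 5 miles + 6 km, expressed in km (not 11)

# Data type: hypothetical sample of 100 people coded 1 = female, 0 = male.
coded_gender = [1] * 51 + [0] * 49
mean_gender = statistics.mean(coded_gender)   # computable, but not meaningful
sd_gender = statistics.pstdev(coded_gender)   # population SD = sqrt(0.51 * 0.49)

# Height example: the summary figures quoted in the proposal.
group_size = 200
mean_height_cm = 180
sd_height_cm = 8

# Interpretive step: ASSUMES a normal distribution, under which about 68%
# of values lie within one standard deviation of the mean.
low_cm = mean_height_cm - sd_height_cm
high_cm = mean_height_cm + sd_height_cm
expected_within_one_sd = round(0.68 * group_size)

print(f"5 miles + 6 km = {total_km:.2f} km")
print(f"coded gender: mean = {mean_gender:.2f}, SD = {sd_gender:.2f}")
print(f"about {expected_within_one_sd} people between {low_cm} and {high_cm} cm")
```

The mean and standard deviation lines only restate the data, whereas the last three assignments rely on a distributional assumption that the raw figures do not supply.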


 * Oppose — For the following reasons:
 * 1) Re "The generation of numerical summaries of data using routine techniques such as summation or the calculation of averages, standard deviations and other processes that are standard spreadsheet functions is not "original research".": I think that unless the result is reasonably obvious to most readers, it should be considered OR. For example, the average of a few numbers would be reasonably obvious to most readers, but the average of many numbers would not. I don't think that a standard deviation is reasonably obvious to most readers in any case; in other words, most readers seeing a calculated standard deviation in an article would have no inkling of whether it is right or wrong. That is why reliable sources are useful: the reader can see that someone credible has made the calculation, rather than an anonymous contributor to Wikipedia whose credibility is unknown.


 * 2) Re the part: "1. There are two things that must be kept in mind:" — This is an inappropriate digression for this policy page since it instructs (in a questionable way) how to analyze data, rather than how to avoid OR.


 * 3) Re "For example, stating that the average height of a group of 200 people was 180 cm and the standard deviation was 8 cm is not original research" — I'd say that should be considered OR. From the Routine calculations section,
 * "Basic arithmetic, such as adding numbers, converting units, or calculating a person's age, is allowed provided there is consensus among editors that the calculation is an obvious, correct, and meaningful reflection of the sources."
 * I think what is meant here is that the result of the calculation is obvious. In the case of 200 people, the average and standard deviation aren't obvious. In the case of a few people, the average would be somewhat obvious but the standard deviation would not.

--Bob K31416 (talk) 17:43, 6 July 2013 (UTC)