Talk:Histogram/Archives/2013

Travel times
The table of travel times does not clearly show the intervals. Nijdam (talk) 11:57, 9 January 2013 (UTC)

Error: "Doane's formula"
This little part was introduced in this edit

... and it's been wrong ever since. Doane's formula doesn't refer to kurtosis at all (if you're going to link a reference, read it, for goodness' sake!).

Indeed, clearly the formula has been taken from some other source, because it's been changed in several ways from Doane's paper.

Doane discusses skewness. Since I don't know what reference the given formula actually came from, I am going to change it to Doane's actual formula, and that way formula and reference will match.

Glenbarnett (talk) 05:23, 5 April 2013 (UTC)
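
For anyone checking the corrected text against the reference, here is a minimal sketch of the skewness-based formula as it is usually stated: k = 1 + log2(n) + log2(1 + |g1| / sigma_g1), with sigma_g1 = sqrt(6(n - 2) / ((n + 1)(n + 3))). The function name doane_k, the use of the population moment estimator for the skewness g1, and the handling of constant data are assumptions made for illustration; they are not taken from the article or from Doane's paper.

```python
import math

def doane_k(data):
    """Unrounded bin count suggested by Doane's skewness-based formula.

    Hypothetical helper for illustration; rounding k to an integer is
    left to the caller.
    """
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(data) / n
    # Second and third central moments (population form, an assumption here).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    # Sample skewness g1; treated as 0 for constant data (an assumption).
    g1 = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    # Standard error of the skewness under normality.
    sigma_g1 = math.sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))
    return 1 + math.log2(n) + math.log2(1 + abs(g1) / sigma_g1)
```

Note that only skewness enters the formula; kurtosis does not appear. For symmetric data the last term vanishes and the result reduces to Sturges' 1 + log2(n).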

Interval explanation not complete
The article does not state the interval conventions for bins. For example, the article includes an example with bins "10.5–20.5" and "20.5–33.5", but it is not clear whether 20.5 would be counted in the 10.5–20.5 bin or the 20.5–33.5 bin. Essentially, I think the article needs clarification about whether the intervals are by convention [X1, X2) or (X1, X2] when interpreting the notation "X1–X2". Thelema418 (talk) 01:09, 15 July 2013 (UTC)
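
To make the ambiguity concrete, here is a small sketch of the two conventions applied to the bin edges from the example. The function bin_index and its closed parameter are hypothetical names chosen for illustration; which convention the article intends is exactly the open question.

```python
from bisect import bisect_left, bisect_right

def bin_index(x, edges, closed="left"):
    """Index of the bin containing x, or None if x lies outside all bins.

    closed="left"  uses half-open bins [X1, X2)
    closed="right" uses half-open bins (X1, X2]
    `edges` must be sorted in increasing order.
    """
    if closed == "left":
        i = bisect_right(edges, x) - 1   # a value on an edge goes to the bin above
    else:
        i = bisect_left(edges, x) - 1    # a value on an edge goes to the bin below
    return i if 0 <= i < len(edges) - 1 else None

edges = [10.5, 20.5, 33.5]
print(bin_index(20.5, edges, closed="left"))   # bin 1, i.e. 20.5–33.5
print(bin_index(20.5, edges, closed="right"))  # bin 0, i.e. 10.5–20.5
```

Under [X1, X2) the outermost upper edge (33.5) falls outside every bin, and under (X1, X2] the outermost lower edge (10.5) does, so whichever convention is adopted also has to say how the end points are treated.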

ERROR in description of census histogram
In this Wikipedia article we read: Table 2 below shows the absolute number of people who responded with travel times "at least 15 but less than 20 minutes" is higher than the numbers for the categories above and below it. This is likely due to people rounding their reported journey time. Very likely that sentence should be replaced by: Table 2 below shows the absolute number of people who responded with travel times "at least 25 but less than 30 minutes" is lower than the numbers for the categories above and below it. This is likely due to people rounding their reported journey time, because if the exact value is 28 minutes, they have in mind something like "about half an hour" and put it in the class beginning at 30. -- Preceding unsigned comment added by 151.49.204.251 (talk) 09:15, 3 March 2012 (UTC)

The whole "This is likely due to people rounding their reported journey time. The problem of reporting values as somewhat arbitrarily rounded numbers is a common phenomenon when collecting data from people." blather makes it sound like this is a defect in respondents, when actually it's a flaw in the data-collection methodology. The census form presumably gave people a set of bins to choose from or asked them to report their average (which the census bureau then binned). People whose journey time varies from day to day so as to fall sometimes in one bin, sometimes in another, are then obliged to pick one of the applicable bins, as if their journey time didn't vary that much. If asked for an average, most simply haven't gathered the data from which to compute it. Either way, they're going to give an answer which is quite properly rounded according to the rules for giving only as many significant digits of a value as are not rendered invalid by the error bar, here larger than the bin-size. (In binary, a time varying between a quarter hour and thrice that, for example, is properly rounded to a half; respondents are more or less certainly doing exactly this, without thinking consciously in binary, when they round to half an hour.) A better reporting methodology would be to ask respondents (without using technical jargon) for their quickest, slowest and typical journey-times; you'll still get silly reporting issues but at least the respondent won't feel obliged to give an answer they consider wrong. Those analysing the data then have to actually use their brains somewhat to decide how to combine responses with wildly different error bars, but at least they have a more meaningful data-set. -- Eddy 84.215.6.238 (talk) 12:16, 9 October 2013 (UTC)
