Wikipedia:Reference desk/Archives/Science/2021 May 20

= May 20 =

Significant [to a human] figures
A question involving numbers, but hardly involving mathematics. Let's go to Google Ngram Viewer for some "statistics". It tells us that in 2006, "Jimmy Carter" accounted for 0.0000860588% of X, and "Ronald Reagan" accounted for 0.0001918622% of the same X. I can't be bothered to look up what X is, but I can get my phone to calculate a quotient: tokens of (or hits for) "Jimmy Carter" in 2006 were 0.448544841036% of those for "Ronald Reagan". However, such a degree of precision is misleading; let's call it 0.44854%. (If I understand correctly, this has about the right number of "significant digits", in a mathematical sense. Though perhaps I misunderstand.)
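The quotient quoted above can be reproduced in a few lines of Python (a quick sketch, taking the two Ngram percentages as given; `round_sig` is an ad-hoc helper written for this illustration, not a library function):

```python
from math import floor, log10

def round_sig(x, sig):
    """Round x to `sig` significant figures (ad-hoc helper)."""
    if x == 0:
        return 0.0
    return round(x, -int(floor(log10(abs(x)))) + (sig - 1))

carter = 0.0000860588   # the "Jimmy Carter" figure for 2006
reagan = 0.0001918622   # the "Ronald Reagan" figure for 2006

ratio = carter / reagan
print(ratio)                # ≈ 0.448544841
print(round_sig(ratio, 5))  # 0.44854
```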

For many purposes, it would be enough to say that there were "just less than half" as many tokens of "Jimmy Carter" as there were for "Ronald Reagan". But for many purposes one should be a little more precise. My own gut feeling (!) is that unless one intended to do further calculations from the figure, "0.448%" would be adequate for any descriptive purpose: that the human brain is such that no psychologically normal person would be impressed one way or another if informed that it was 0.4485% and not 0.4484% or 0.4486%. Three figures -- here, "448" -- are all that are, well, significant.

Does my gut feeling correspond to any named psychological phenomenon? I dimly remember that in my long-ago youth I was taught that, quite aside from limits imposed by measuring inaccuracy, etc, I should almost never go beyond "three significant figures" (and thus that 0.000044854 should be rendered with no more precision than "0.0000448"): this still seems sensible to me, but "significant figure" seems to be a misnomer. -- Hoary (talk) 08:10, 20 May 2021 (UTC)
 * In terms of significant figures, when dividing a 6-sig-fig number by a 7-sig-fig number, the quotient should probably have 6 sig-figs. One reason to report with fewer (your thought about rounding it to just 0.448% in less-technical contexts) is to avoid precision bias. DMacks (talk) 09:08, 20 May 2021 (UTC)
 * The entire purpose of significant figures is that math can't make measurements better. All measuring techniques have a limit to how precisely they can reliably measure something; a ruler only has so many markings, a digital readout on a scale can only report so many digits, etc.  Any math you do with those measurements should never be reported to a greater level of precision than your initial measurements had.  The only way to know whether or not there is false precision in the numbers the OP is reporting is to go back to the original measurements and how they were made, and then preserve the precision of those measurements through subsequent calculations.  -- Jayron 32 11:19, 20 May 2021 (UTC)
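Jayron32's rule of thumb can be illustrated with a made-up measurement (the readings are hypothetical; `round_sig` is an ad-hoc helper):

```python
from math import floor, log10

def round_sig(x, sig):
    """Round x to `sig` significant figures (ad-hoc helper)."""
    if x == 0:
        return 0.0
    return round(x, -int(floor(log10(abs(x)))) + (sig - 1))

# Hypothetical ruler readings, each good to 3 significant figures:
length = 12.3   # cm
width = 4.56    # cm

area = length * width       # 56.088... -- the extra digits are artefacts
print(round_sig(area, 3))   # 56.1 -- no more precise than the inputs
```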
 * Thank you, and . I hadn't heard of "precision bias"; it's related to what I fuzzily have in mind. I think I understand the point about maintaining precision through any calculation, while not deluding oneself about the degree of precision that has been maintained. According to my older understanding, both the numbers 0.0000675421 and 675,421 × 10^6 have "six significant figures", regardless of the care that went into producing these numbers, etc; however, the article Significant figures says that "only reliable digits can be significant". As one example from a pool of thousands, the article History of Berlin tells us that the city's population in "2003" was "3,388,477", which I used to think had "seven significant figures" even though I was well aware that, in this context, the last three (or perhaps even four) of these were meaningless; I hope I can say "despite having just three or four significant figures, is presented as having seven". -- Hoary (talk) 12:32, 20 May 2021 (UTC)
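The Berlin figure is a concrete case where Python's format specifiers will do the trimming (whether three or four figures is appropriate is, of course, exactly the judgement call under discussion):

```python
pop = 3_388_477  # the History of Berlin figure for "2003"

print(f"{pop:.3g}")           # 3.39e+06 -- three significant figures
print(f"{pop:.4g}")           # 3.388e+06 -- four
print(f"{round(pop, -4):,}")  # 3,390,000 -- three figures, reader-friendly
```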


 * There's a tendency to read too much into extra digits. You read about people who memorize Pi to 100 digits or whatever. Those digits may be mathematically significant, but each one matters only about a tenth as much as the one before it. So for most practical purposes, Pi to just a few significant digits is sufficient. A more mundane example is batting averages. Say the top two batters each finish the season with a .325 average. So who wins the batting championship? You take the computation of hits / at-bats to one or more extra digits to figure it out, say .32533 vs. .32531. You could publish batting averages to ten digits, for example, but that would be excessive. Three digits works fine until you get an apparent tie. ←Baseball Bugs What's up, Doc? carrots→ 11:57, 20 May 2021 (UTC)
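Baseball Bugs's tie-breaking scenario, with invented season totals (the hit and at-bat counts here are assumptions chosen for illustration):

```python
# Two hypothetical batters whose averages tie at three digits:
hits_a, at_bats_a = 179, 550
hits_b, at_bats_b = 166, 510

avg_a = hits_a / at_bats_a
avg_b = hits_b / at_bats_b

print(f"{avg_a:.3f} vs {avg_b:.3f}")  # 0.325 vs 0.325 -- an apparent tie
print(f"{avg_a:.5f} vs {avg_b:.5f}")  # 0.32545 vs 0.32549 -- tie broken
```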


 * It's somewhat beside the point, but still germane: .448% is not even close to "just less than half"; it's just less than half of a percent. In your figures in the original post, the 0.448% values are incorrect. It's actually 44.8% (to whatever number of digits you end up finding appropriate). Matt Deres (talk) 14:16, 20 May 2021 (UTC)


 * Well of course, . Howls of derisive laughter! (At myself.) I blame my (salaried) work, which yesterday was boring me so severely that my tired brain sought sustenance here but underestimated its own tiredness. However, you might instead attribute it to straightforward senility (as hinted at by my username). Whatever the lame excuse, truly embarrassing. -- Hoary (talk) 22:33, 20 May 2021 (UTC)


 * One factor affecting what is an appropriate number of significant figures is the statistical error associated with your estimate. If you measured the same thing with a fresh sample of similar size, how much is the figure likely to vary? For instance, if you are measuring the fairness of a coin and toss it n times with a proportion b turning up heads, then the standard deviation of b is the square root of (b * (1-b)/n). If you toss the coin 100 times and find that b = 0.410, the standard deviation is about 0.049, so quoting b to 3 significant figures like I just did would make little sense and would convey a misleading sense of accuracy. But it gets more appropriate to do so if you had tossed the coin 10,000 times (sd ≈ 0.0049). It is not always so straightforward to work out the standard deviation of your estimate mathematically, but gut feeling may guide you. Jmchutchinson (talk) 14:19, 20 May 2021 (UTC)
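Jmchutchinson's formula, checked numerically (a quick sketch; for b = 0.41 and n = 100 it gives roughly 0.049):

```python
from math import sqrt

def sd_of_proportion(b, n):
    """Standard deviation of an observed proportion b over n trials."""
    return sqrt(b * (1 - b) / n)

b = 0.41
print(sd_of_proportion(b, 100))     # ≈ 0.0492: the third digit of b is noise
print(sd_of_proportion(b, 10_000))  # ≈ 0.0049: three digits now defensible
```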


 * The Weber-Fechner law and just-noticeable difference are relevant here, especially the diagram at Weber-Fechner law. If we have a number sense that goes up to maybe 100, then differences in any digits beyond the first ~2 don't make the number "feel" much larger or smaller. The extra digits are distracting, and (at least in my experience) attention is drawn to the rightmost number, which is least significant! (Of course, for other purposes like detectability or statistical significance, a lot of digits may be relevant. Measuring and calculating g-2 to more digits might help discover new physics, but it won't make the number "feel" any different to me.) --Amble (talk) 22:06, 20 May 2021 (UTC)
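A rough way to see Amble's Weber-Fechner point numerically (the scale constant k and reference stimulus S0 here are arbitrary assumptions; the law only fixes the logarithmic shape):

```python
from math import log10

def perceived(stimulus, k=1.0, s0=1.0):
    """Weber-Fechner style perceived magnitude: k * log10(S / S0)."""
    return k * log10(stimulus / s0)

# Changing a leading digit moves the "felt" size far more than a
# trailing one:
print(perceived(500) - perceived(448))   # ≈ 0.048
print(perceived(449) - perceived(448))   # ≈ 0.001
```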


 * Thank you, and . A first glance suggests that what Amble is proposing is close to or even the same as what I had in mind. I'll read the articles Amble points to carefully (or try to). If I'm baffled ... yesterday, as Matt Deres points out above (with entirely unjustified gentleness), I made a moronic cock-up of something related; so it's better if I draft any further question and poke at the draft with a stick intermittently over a period of hours to reassure myself that it isn't idiotic, and only if I succeed go ahead and post it. -- Hoary (talk) 22:33, 20 May 2021 (UTC)