Wikipedia:Reference desk/Archives/Mathematics/2016 April 12

= April 12 =

Using a trapezoidal mean as an approximate fixed average to reflect a fluid error?
Let's say you want to weigh yourself regularly but don't want to undress each time, so you decide to create a "clothes difference" ("CD"), where CD = fully clothed weight - near naked weight (since just weighing the clothes on a scale may not provide an accurate reading). So let's use a sample size of 30, which provides a sum of 159.25# and a total average of 5.308333#. The question is, given all of the different types of averaging (particularly weighted averages), one method I have created is a Lo and Hi five, where Lo_5 = 24.5#/5 = 4.9# and Hi_5 = 28.25#/5 = 5.65#, leaving a Mid_20 = 106.5#/20 = 5.325#. If you average Lo_5, Mid_20 and Hi_5 evenly, you get CD_lmh = (Lo_5+Mid_20+Hi_5)/3 = 5.291667#. The interesting twist is if you change it to a trapezoidal mean: CD_t = Lo_5/4+Mid_20/2+Hi_5/4 = 5*Lo_5/20+20*Mid_20/40+5*Hi_5/20 = (CD+CD_lmh)/2 = 5.3#. Would this (giving the Lo and Hi extremes only half the weight, no pun intended :), of the mid section) be a good choice to use as a fixed average of a quantity (CD) that has a slight error factor variation of about ±(Hi_5-Lo_5)/2 ≈ ±0.375# from 5.325#? 166.186.171.129 (talk) 01:38, 12 April 2016 (UTC)
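The arithmetic in the question can be checked with a short Python sketch. It uses only the three subtotals quoted above, since the 30 individual readings are not given; the variable names are illustrative and not from the original post.

```python
# Subtotals quoted in the question, in pounds. The raw readings are not given.
lo_sum, mid_sum, hi_sum = 24.5, 106.5, 28.25
n_lo, n_mid, n_hi = 5, 20, 5

# Group means
lo_5 = lo_sum / n_lo
mid_20 = mid_sum / n_mid
hi_5 = hi_sum / n_hi

# Ordinary arithmetic mean of all 30 readings
cd = (lo_sum + mid_sum + hi_sum) / (n_lo + n_mid + n_hi)

# Even average of the three group means
cd_lmh = (lo_5 + mid_20 + hi_5) / 3

# "Trapezoidal" mean: group means weighted 1/4, 1/2, 1/4
cd_t = lo_5 / 4 + mid_20 / 2 + hi_5 / 4

print(f"Lo_5={lo_5:.3f}  Mid_20={mid_20:.3f}  Hi_5={hi_5:.3f}")
print(f"CD={cd:.6f}  CD_lmh={cd_lmh:.6f}  CD_t={cd_t:.6f}")

# For this 5/20/5 split, CD_t is the midpoint of CD and CD_lmh.
assert abs(cd_t - (cd + cd_lmh) / 2) < 1e-12
```

The final assertion holds for any data split 5/20/5, because CD has group coefficients (1/6, 2/3, 1/6) and CD_lmh has (1/3, 1/3, 1/3), whose average is (1/4, 1/2, 1/4).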
 * I think what you have in mind is something between a truncated mean and the ordinary arithmetic mean, where instead of throwing out the highest and lowest measurements entirely you're reducing their contribution by half. But there may be a problem with the implementation in that you're using an "average of averages". If you look carefully at the contribution each data point makes you get the following:
{| class="wikitable"
|+ Weight given to each individual data point
! Average type !! Low 5 !! Middle 20 !! High 5
|-
| Arithmetic || 1/30 || 1/30 || 1/30
|-
| Lo_5 || 1/5 || 0 || 0
|-
| Mid_20 || 0 || 1/20 || 0
|-
| Hi_5 || 0 || 0 || 1/5
|-
| CD_lmh || 1/15 || 1/60 || 1/15
|-
| CD_t || 1/20 || 1/40 || 1/20
|}
 * So each extreme point contributes more than each central point in both CD_lmh and CD_t. In fact, the arithmetic mean is 1/6 Lo_5 + 2/3 Mid_20 + 1/6 Hi_5, so if you're trying to reduce the contributions of the extreme points, the coefficients of Lo_5 and Hi_5 have to be less than 1/6. If you throw out the extreme points entirely, which is the truncated mean, you're left with Mid_20. So perhaps you want the midpoint between Mid_20 and the arithmetic mean, which is 1/12 Lo_5 + 5/6 Mid_20 + 1/12 Hi_5.
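The per-point weights discussed above can be reproduced with exact fractions. This is a minimal sketch: the dictionary layout and the helper name `per_point_weights` are illustrative choices, not part of the original reply.

```python
from fractions import Fraction as F

# Number of data points in each group
group_sizes = {"low": 5, "mid": 20, "high": 5}

# Coefficients applied to the group means (Lo_5, Mid_20, Hi_5)
schemes = {
    "arithmetic": (F(1, 6), F(2, 3), F(1, 6)),
    "CD_lmh": (F(1, 3), F(1, 3), F(1, 3)),
    "CD_t": (F(1, 4), F(1, 2), F(1, 4)),
    "midpoint of Mid_20 and arithmetic": (F(1, 12), F(5, 6), F(1, 12)),
}

def per_point_weights(coeffs):
    """Weight of one data point in each group: group coefficient / group size."""
    return tuple(c / n for c, n in zip(coeffs, group_sizes.values()))

for name, coeffs in schemes.items():
    assert sum(coeffs) == 1  # every scheme is a proper weighted average
    print(name, [str(w) for w in per_point_weights(coeffs)])
```

Dividing each group coefficient by the group size shows why an "average of averages" overweights small groups: a coefficient of 1/4 spread over 5 points gives each point 1/20, while 1/2 spread over 20 points gives each only 1/40.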
 * As to whether a particular choice is good or whether one choice is better than another, it depends on the situation. An important factor is what assumptions you make on the distribution of what you're trying to measure. Typical desirable qualities of a choice include:
 * Being unbiased
 * Having low variance
 * Being able to explain the observed measurements well
 * Not being unduly influenced by outlying data points
 * Being easy to compute
 * The arithmetic mean is often used but notoriously fails when it comes to sensitivity to outliers, which is why the truncated mean was invented. But the truncated mean is more difficult to compute and may be biased depending on the distribution.
 * Since your ultimate goal is presumably to more easily get an accurate weight for a person, you should be aware that this is a much harder thing to do than most people realize; see The Hacker's Diet for more details. Seeing as even under ideal circumstances you can only be accurate to within a pound or two, to my mind you might as well just take an estimate of 5.5#. --RDBury (talk) 20:53, 12 April 2016 (UTC)
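To illustrate the outlier sensitivity mentioned in the reply above, here is a small sketch comparing the arithmetic mean with a truncated mean. The readings are hypothetical, invented for illustration and not the poster's data, and `truncated_mean` is an illustrative helper rather than a library function.

```python
from statistics import mean

def truncated_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must be non-negative and leave at least one value")
    ordered = sorted(values)  # the sort is the extra work the plain mean avoids
    return mean(ordered[k:len(ordered) - k])

# Hypothetical clothes-difference readings in pounds (assumed data);
# the last reading is a deliberate outlier.
readings = [5.0, 5.25, 5.25, 5.5, 5.5, 5.75, 9.0]

print("arithmetic mean:", round(mean(readings), 4))
print("truncated mean (k=1):", round(truncated_mean(readings, 1), 4))
```

With these assumed readings, the single outlier pulls the arithmetic mean well above the bulk of the data, while dropping one value from each end leaves the truncated mean near the typical readings.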

Wouldn't it be easier to take your clothes off and be naked than to suffer all these mathematical tortures? 175.45.116.59 (talk) 23:02, 12 April 2016 (UTC)


 * And then of course you have to cope with whether you have just eaten a meal or not. I tried weighing myself once a day at a random time for a few months and I was able to see from the results the average time I ate a meal. I was never totally satisfied with my methods though so if you want to apply mathematics to your weight there's a project for you. ;-) Dmcq (talk) 23:41, 12 April 2016 (UTC)