Wikipedia:Reference desk/Archives/Mathematics/2015 December 6

= December 6 =

Income distribution graph
I need help to create an income distribution graph from this table (preferably using MS Excel which I know quite well):

I understand the explanation of what such a graph represents and can read off values correctly but constructing it is a mystery to me. I'm afraid most of the formulas at Lorenz curve look like gobbledygook to me. Thanks Roger (Dodger67) (talk) 18:02, 6 December 2015 (UTC)


 * To clarify, you don't want a simple bar graph, do you ? You want to fit a curve through that data, correct ?  (One bit of advice is to drop the "None" income category, as it will mess up the curve fitting, and probably isn't accurate anyway, due to unreported income, etc.) StuRat (talk) 19:33, 6 December 2015 (UTC)


 * I'm not an economist but here's what I did to get a reasonable facsimile of the graph shown in the article. Start by creating a column (column C) with income cutoffs starting with 0. I somewhat arbitrarily filled in 10000000 as the top value (the last entry) since infinity isn't really an option. Next I put the percentages in column D next to it. Then I made a third column (column E) which is the product of the previous two columns, easy enough to do by creating a formula for the first cell and copying it to the others. The next step was slightly trickier, namely to make columns which are cumulative sums of the percentages and the products. I imagine MS has made a macro for this somewhere, but I find it easier to do it by hand rather than try to find it. So I entered 0's in the row above the other data, but in column G and H (leaving F blank). Then I put in formula for the sum of the previous entry in the column and the corresponding entry in the percent column and copied this down to get cumulative percentages. If you do this correctly you should end with 99.9% in the last cell. (There is a .1% error presumably from rounding.) Then I did something similar to get cumulative products. Finally I selected the data in G and H, found Insert Scatter (X, Y) in the menu, then picked Straight Lines as a subtype and the chart appeared. Your version of Excel may put this in a different location than mine but it will be there somewhere. I'm not sure what the units on the vertical axis are, but you can resolve that by dividing by the last entry in the cumulative products column, so the scale goes from 0 to 100%.


 * I agree that the article on the Lorenz curve could be better. In particular there needs to be more explanation without jargon before getting into technical stuff involving Pareto distributions etc. There are plenty of other ways to show the same data though; pie charts come to mind since it involves percentages. Btw, there is a screencast showing how to do this; I haven't viewed the whole thing so it might not be applicable but hopefully it will help. The method I described above is independent of this, so take it for what it's worth if there are major differences. --RDBury (talk) 19:42, 6 December 2015 (UTC)