Wikipedia:Reference desk/Archives/Mathematics/2012 April 30

= April 30 =

question about one sided limits
f is differentiable at all points except possibly at 0, $$\lim_{x \rightarrow 0+} f'(x) = \lim_{x \rightarrow 0-} f'(x)$$, and f is continuous at 0. Probe that f is differentiable at 0.--49.178.5.29 (talk) 00:05, 30 April 2012 (UTC)


 * And if you're not an alien with your probe handy, you can prove it instead. :-) StuRat (talk) 00:12, 30 April 2012 (UTC)


 * Hint: Use the definition of derivative. To bound $$f(\Delta x)$$, use an intermediate point $$x_0$$, bound $$f(x_0)$$ and $$f'(x)$$, and use the Mean value theorem. -- Meni Rosenfeld (talk) 08:48, 30 April 2012 (UTC)
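 * To flesh out that hint a little (a sketch, not a full write-up): let $$L = \lim_{x \rightarrow 0+} f'(x) = \lim_{x \rightarrow 0-} f'(x)$$. For $$x > 0$$, continuity of f at 0 and the Mean value theorem give a point $$\xi_x \in (0, x)$$ with $$\frac{f(x)-f(0)}{x} = f'(\xi_x)$$; as $$x \rightarrow 0+$$ we have $$\xi_x \rightarrow 0+$$, so the right-hand difference quotient tends to $$L$$. The same argument on the left gives the same limit, so $$f'(0)$$ exists and equals $$L$$.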

Calculating the 3 marks that drag a weighted average down the most
Hello,

Apologies if this is a very easy problem, I’m certainly no mathematician.

My university calculates an ‘honours mark’ for each student. The honours mark is basically an average of each student’s marks, weighted according to the number of credit points attributed to that subject. However, students may discount their 3 ‘worst’ subjects.

If a student’s 3 worst subjects are those that drag their weighted average down the most, how does one go about calculating what those 3 marks are? (Apart from trial and error).

Thanks, Joaq99 (talk) 04:16, 30 April 2012 (UTC)


 * Perhaps you could find the weighted average of all the classes, then find the 3 whose deviations from that average, when multiplied by the number of credits, are the most negative numbers ? StuRat (talk) 04:28, 30 April 2012 (UTC)
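In code, that heuristic might look like the following sketch (the grades, weights, and function name are made up for illustration):

```python
def sturat_ranking(grades, weights):
    """Rank subjects by (grade - weighted average) * credits, most negative first."""
    avg = sum(g * w for g, w in zip(grades, weights)) / sum(weights)
    score = lambda i: (grades[i] - avg) * weights[i]
    return sorted(range(len(grades)), key=score)

# Made-up grades and credit weights for four subjects
grades = [1, 1, 3, 5]
weights = [9, 1, 10, 10]
print(sturat_ranking(grades, weights)[:3])  # → [0, 1, 2]
```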


 * StuRat: can you prove that rule correct? Because I think it might not be. – b_jonas 11:20, 30 April 2012 (UTC)


 * StuRat: for example, suppose you wanted to throw away just two marks from the following four:
{| class=wikitable
! name !! grade !! weight
|-
| P || 1 || 9
|-
| Q || 1 || 1
|-
| R || 3 || 10
|-
| S || 5 || 10
|}
 * Now the weighted average is 3, so the deviations from average multiplied by the weight are, respectively, -18, -2, 0, 20, so if I understand your rule correctly, you'd throw away the marks for subjects P and Q, which would give a weighted average of 4. However, it's better to throw away subjects P and R, as that would lead to a grade average near 4.64. – b_jonas 11:30, 30 April 2012 (UTC)
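The arithmetic in this counterexample can be checked with a short script (a sketch; `avg_without` is just a helper name, and the data is the P, Q, R, S example above):

```python
def weighted_avg(grades, weights):
    return sum(g * w for g, w in zip(grades, weights)) / sum(weights)

# Subjects P, Q, R, S
grades = [1, 1, 3, 5]
weights = [9, 1, 10, 10]

def avg_without(dropped):
    keep = [i for i in range(len(grades)) if i not in dropped]
    return weighted_avg([grades[i] for i in keep], [weights[i] for i in keep])

print(avg_without({0, 1}))  # drop P and Q → 4.0
print(avg_without({0, 2}))  # drop P and R → 51/11 ≈ 4.64
```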


 * In fact it's not even true when you need to remove one mark. If A, B and C have weights 1, 2, 1 and grades 1, 0, -0.3, then the StuRat power of B is -0.35 and for C is -0.475, but the average after removing B is 0.35 and after removing C is 1/3. -- Meni Rosenfeld (talk) 11:45, 30 April 2012 (UTC)


 * Yes, my method is an approximation, which works in the real world, where you don't get a 10 to 1 ratio in the number of credits per class. To add a bit of a safety factor to it, you could try removing, say, each combination of 3 of your 4 "most negative" classes, calculated by the method I specified. StuRat (talk) 17:21, 30 April 2012 (UTC)


 * What is this "real world" of which you speak? :)
 * Anyway, you'll find that my counterexample has a 1:2 weight ratio. Your method is most accurate when there are many classes, in which case removing items doesn't have a great effect on the denominator. -- Meni Rosenfeld (talk) 18:32, 30 April 2012 (UTC)


 * Yes, and also where class grades tend to vary more than class credits. StuRat (talk) 19:07, 30 April 2012 (UTC)


 * [ec] I don't think this is a trivial problem. StuRat's suggestion is an approximation but the exact result is different, and the marks that are each optimal individually needn't be optimal together.
 * A mark that is dominated by at least $$k=3$$ other marks (they each have both lower grade and higher weight) cannot be in the optimal set, so these can be discarded (which is $$O(n^2)$$ and can reduce the effective value of n). But other than that I don't know of a better way than scanning all possibilities, which is $$O(n^k)$$ in the general case.
 * Some optimization is possible by calculating each average in $$O(k)$$ rather than $$O(n)$$. -- Meni Rosenfeld (talk) 11:36, 30 April 2012 (UTC)


 * Meni: nice example for removing a single grade. I agree that this is an interesting mathematical problem, even though with real-life students and grade averages it's feasible to do a brute-force computer solution, or a hand computation with fast runtime for typical input.
 * Now as for the actual question. If you wanted to remove just one grade, then you could compute the average without each grade, all in linear time. Would it give the correct result to just iterate this, repeatedly throwing away a grade in a greedy way?
 * Also, as a clarification, can we assume that the grades are limited to just a few values (say integers from 1 to 5)? If so, that would make this simpler. – b_jonas 12:41, 30 April 2012 (UTC)
 * Greedy doesn't work. Let A, B, C, D have grades 7, 2, 2, 0 and weights 2, 2, 2, 1. If you can remove one mark it should be D, but if you can remove two it's B and C. -- Meni Rosenfeld (talk) 12:59, 30 April 2012 (UTC)
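This failure of greedy can be checked with a sketch like the one below (function names are my own; the greedy version drops whichever single subject raises the average most at each step, while the exhaustive search is ground truth):

```python
from itertools import combinations

def weighted_avg(grades, weights):
    return sum(g * w for g, w in zip(grades, weights)) / sum(weights)

def drop(grades, weights, dropped):
    keep = [i for i in range(len(grades)) if i not in dropped]
    return weighted_avg([grades[i] for i in keep], [weights[i] for i in keep])

def greedy_drop(grades, weights, k):
    """Repeatedly drop whichever single subject raises the average the most."""
    dropped = set()
    for _ in range(k):
        candidates = [i for i in range(len(grades)) if i not in dropped]
        dropped.add(max(candidates, key=lambda j: drop(grades, weights, dropped | {j})))
    return drop(grades, weights, dropped)

def exhaustive_drop(grades, weights, k):
    """Try all k-subsets of subjects to drop; the exact optimum."""
    return max(drop(grades, weights, set(c))
               for c in combinations(range(len(grades)), k))

# A, B, C, D from the example above
grades, weights = [7, 2, 2, 0], [2, 2, 2, 1]
print(greedy_drop(grades, weights, 2))      # 4.5: greedy commits to D first
print(exhaustive_drop(grades, weights, 2))  # 14/3 ≈ 4.67: drops B and C instead
```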


 * Huh? I can't reproduce that one. If you can remove two, it should be C and D to get an average of -1, because if you removed B and C the weighted average would be -1.4. – b_jonas 14:59, 30 April 2012 (UTC)


 * Sorry, had a typo, weight of A should be 1 (2 now that I've rescaled). -- Meni Rosenfeld (talk) 15:45, 30 April 2012 (UTC)


 * Ah, indeed, it does work that way. Now I should try to understand why it works. – b_jonas 16:15, 30 April 2012 (UTC)
 * I can't say I truly get it myself. But here are two ways to think about it (which may be easier now that I've rescaled the example to use only nonnegative integers):
 * Use the StuRat approximation. D has slightly more StuRat power so if only one item is removed, it should be it. If two need to be removed, then clearly B needs to be one of them. Once that's done, the average is higher; since C has greater weight than D, this has a greater effect on its StuRat power, which now exceeds D's. (This may or may not be literally true for this example, didn't check).
 * Consider the numerator and denominator of the weighted average, and how removing an item affects them both. With one item removed, it should be D because of its effect on the numerator. The less the total weight, the more significant the effect on the denominator; so after B is removed, C is next because removing it greatly reduces the denominator.
 * -- Meni Rosenfeld (talk) 18:32, 30 April 2012 (UTC)


 * Another way to look at it: I have a backpack. I want to cram it full of grades. It can only hold so much weight, and the value per weight of each grade is not the same. How can I maximize the value? Remind anyone of an old, well-documented problem? — Preceding unsigned comment added by 128.23.112.209 (talk) 18:38, 30 April 2012 (UTC)

The invalidity of the one-at-a-time approach strikes me as somehow related to Simpson's paradox -- both involve comparing things with different but overlapping denominators.

Also, this problem of efficiently finding the ones to remove seems very similar to this problem: given a regression equation in which it is postulated that k of the n data points are outliers that should not have been included in the regression (because they may be from a different causative structure), how do you efficiently find the set of k data points whose deletion will most improve the fit of the regression? See for example Cook's distance and Outlier. Duoduoduo (talk) 19:31, 30 April 2012 (UTC)

Just an aside, but the harder the true answer is to find, the less likely that the university is actually applying the "correct" solution. Joaq99, I would be surprised if the university has really thought this out. More likely someone is doing something simple, like canceling the three lowest grades regardless of weight. While the mathematical puzzle is undoubtedly interesting, you might be more likely to get at the truth by inquiring with your university about what procedure they are actually using to remove the "worst" grades. Dragons flight (talk) 19:45, 30 April 2012 (UTC)

This paper (http://www.ics.uci.edu/~eppstein/pubs/EppHir-TR-95-12.pdf) addresses this particular problem. --Modocc (talk) 20:55, 30 April 2012 (UTC)


 * Dragons flight: in a realistic case, it's completely feasible for the university to even use a brute-force computation with a computer. We have implicitly started discussing the abstract problem where the number of grades can be large, and the allowed grade values and the allowed grade weights needn't be integers taken from a very small predetermined set. – b_jonas 21:26, 30 April 2012 (UTC)


 * Modocc: nice find! That's indeed exactly the same problem. – b_jonas 21:29, 30 April 2012 (UTC)
 * Notably, they give an $$O(n)$$ algorithm. -- Meni Rosenfeld (talk) 05:21, 1 May 2012 (UTC)


 * From the above conversation it seems that a 'brute force' calculation is the way to go. I'm assuming this refers to a computer program that calculates the weighted average after eliminating each possible combination of 3 subjects, and simply takes the maximum of all the resulting weighted averages.
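That brute-force program is indeed short. A sketch (the transcript data and the function name `best_three_to_drop` are made up for illustration):

```python
from itertools import combinations

def best_three_to_drop(grades, weights):
    """Check every way of dropping 3 subjects; keep the best weighted average."""
    n = len(grades)
    best_avg, best_drop = None, None
    for dropped in combinations(range(n), 3):
        keep = [i for i in range(n) if i not in dropped]
        avg = (sum(grades[i] * weights[i] for i in keep)
               / sum(weights[i] for i in keep))
        if best_avg is None or avg > best_avg:
            best_avg, best_drop = avg, dropped
    return best_avg, best_drop

# Hypothetical transcript: marks out of 100 with credit-point weights
grades = [65, 80, 50, 90, 72, 55, 85, 60]
weights = [6, 6, 3, 6, 6, 3, 6, 3]
print(best_three_to_drop(grades, weights))
```

For a realistic number of subjects this is only a few thousand combinations, so it runs instantly.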


 * I would like to thank everyone for their efforts (even if you enjoy it, it's much appreciated). I'm interested in this discussion even beyond its practical implications for me.


 * Dragons flight -- The university was initially removing the 3 subjects with the lowest (mark * weight). When I complained that this was clearly wrong, the university said they would change their approach to simply removing the 3 subjects with the lowest marks. I'm unhappy with this approach too but I thought that before I proceeded with the complaint I should try and figure out the correct way to do so. I soon figured out that finding such a method was beyond my high school mathematics skills. Joaq99 (talk) 01:30, 1 May 2012 (UTC)
 * Lowest mark*weight, or lowest (mark-average)*weight (StuRat power)? So a grade of 1 with weight 1 will be removed rather than a grade of 1 with weight 100? Most negative StuRat power is a handy approximation; lowest grade is a handy approximation; lowest mark*weight is just idiotic and whoever is responsible for it should be fired.
 * They should just use brute force. It takes exactly 5 minutes to write a program to do that. Is the final grade of all university students really that unimportant? -- Meni Rosenfeld (talk) 04:30, 1 May 2012 (UTC)


 * Good luck with that Joaq. I've been part of several universities, and though they all had many smart faculty, I've pretty much invariably found that the staff responsible for processing grades, transcripts, and the like had trouble understanding all but the most trivial mathematics. I'm not sure why that should be, but it certainly seems that, whatever the actual requirements for that job (mostly clerical skills, I assume), numerical literacy was not among them. The only calculations I really trust them to do and describe accurately are the very simplest imaginable. Of course that's just my experience; your experience may be better. Dragons flight (talk) 05:29, 1 May 2012 (UTC)


 * A closer look: If the sum of all weights is W and the sum of all weight*grade is T, then the average starts at $$A=\frac{T}{W}$$. If a class with weight $$w_1$$ and grade $$g_1$$ is dropped, the new average is $$\frac{WA-w_1g_1}{W-w_1} = A + \frac{(A-g_1)w_1}{W} + (A-g_1)\left(\frac{w_1}{W}\right)^2+O\left(\left(\frac{w_1}{W}\right)^3\right)$$. The second term has the StuRat power in it which is why the approximation works when W is large enough.
 * If you drop two classes, the average is
 * $$\frac{WA-w_1g_1-w_2g_2}{W-w_1-w_2}=A + (A-g_1)\left(\frac{w_1}{W}+\left(\frac{w_1}{W}\right)^2\right) + (A-g_2)\left(\frac{w_2}{W}+\left(\frac{w_2}{W}\right)^2\right)+(2A-g_1-g_2)\frac{w_1w_2}{W^2}+O\left(\left(\frac{w_1+w_2}{W}\right)^3\right)$$
 * That is, to second order you get the same terms as with dropping each individually, but with the additional term $$(2A-g_1-g_2)\frac{w_1w_2}{W^2}$$ which can give an advantage over the best individual class to drop. In particular, a class which has a low weight and low grade will greatly reduce the $$w_1w_2$$ part (because the weights are multiplied) but will only somewhat increase $$(2A-g_1-g_2)$$ (since the grades are added, and a doubling of the distance from the average is diluted by the other grade), thus it may be better to drop two classes of higher grade but also higher weight. -- Meni Rosenfeld (talk) 10:58, 1 May 2012 (UTC)
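A quick numerical check of the two-class expansion (all numbers made up; W is taken large so the dropped weights are small relative to it):

```python
# Sanity check of the second-order expansion for dropping two classes.
W, A = 100.0, 3.0    # total weight and current weighted average
g1, w1 = 1.0, 4.0    # first dropped class: grade and weight
g2, w2 = 2.0, 5.0    # second dropped class: grade and weight

# Exact new average after dropping both classes
exact = (W * A - w1 * g1 - w2 * g2) / (W - w1 - w2)

# Second-order expansion from above, with x_i = w_i / W
x1, x2 = w1 / W, w2 / W
approx = (A
          + (A - g1) * (x1 + x1 ** 2)
          + (A - g2) * (x2 + x2 ** 2)
          + (2 * A - g1 - g2) * x1 * x2)

print(exact - approx)  # small: the error is third order in w/W
```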


 * The Eppstein-Hirschberg paper cited above by Modocc, despite a promising opening paragraph, is not about the same problem as we are talking about here. That paper maximizes $$\frac{\sum _{i \in T}v_i}{\sum _{i \in T}w_i}$$ where $$v_i$$ is a grade and $$w_i$$ is a weight (see their eqs. (1) and (2)). But the grade-point averages that we are trying to maximize are $$\frac{\sum _{i \in T}v_i w_i}{\sum _{i \in T}w_i}$$. Duoduoduo (talk) 17:34, 1 May 2012 (UTC)
 * Our problem is trivially reduced to the problem in the paper by setting $$v_i=g_i\cdot w_i$$. -- Meni Rosenfeld (talk) 18:12, 1 May 2012 (UTC)
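Concretely, the reduction is a one-liner (hypothetical grades and weights; the equality holds for every subset by construction):

```python
# Hypothetical grades g_i and credit weights w_i
g = [1, 1, 3, 5]
w = [9, 1, 10, 10]

# The reduction: v_i = g_i * w_i
v = [gi * wi for gi, wi in zip(g, w)]

T = [2, 3]  # any subset of subjects kept
ratio = sum(v[i] for i in T) / sum(w[i] for i in T)        # paper's objective
wavg = sum(g[i] * w[i] for i in T) / sum(w[i] for i in T)  # weighted grade average
print(ratio == wavg)  # → True
```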