Wikipedia:Reference desk/Archives/Mathematics/2007 April 12

= April 12 =

Imaginary Interest Rate
The Martian Federal Bank of tyranny forces all citizens to deposit their savings into the federal bank. The federal bank gives all depositors a generous interest rate of 4% per annum (calculated and credited annually).

Because of the ongoing War Against Terra, it is the compulsory duty of all citizens to voluntarily donate all their interest earnings to the First Tyrant. Furthermore, the interest on interest earned is also deducted from the Citizen's Share as "account keeping fees".

Starting with a deposit of $100 in year zero, the amount in the bank account after 5 years is:

Money[year]= $100 * ( 1 + 0.04i )^year
 * Year  Total deposit       Citizen's Share   Tyrant's Share
 * 0       100          =       100                  0i
 * 1       104          =       100                  4i
 * 2       107.84       =        99.84               8i
 * 3       111.514      =        99.52              11.99i
 * 4       115.015      =        99.04              15.97i
 * 5       118.337      =        98.40              19.93i

Is this correct?

202.168.50.40 04:37, 12 April 2007 (UTC)
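The table can be checked with a short script using Python's built-in complex type (a sketch; following the question, the "Total deposit" column is taken to be the real part plus the coefficient of i):

```python
# Sketch: recompute the table Money[year] = 100 * (1 + 0.04i)^year,
# where j is Python's imaginary unit.
for year in range(6):
    money = 100 * (1 + 0.04j) ** year
    citizen = money.real          # Citizen's Share
    tyrant = money.imag           # Tyrant's Share (coefficient of i)
    total = citizen + tyrant      # "Total deposit" column in the question
    print(f"{year}  {total:8.3f}  {citizen:7.2f}  {tyrant:6.2f}i")
```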


 * I don't think the "account keeping fees" would be called interest, since it's a flat fee that's not time related. It's just a 4% processing fee calculated from the interest? --Wirbelwind ヴィルヴェルヴィント (talk) 04:55, 12 April 2007 (UTC)

I don't quite get the rules. If they forfeit all their interest, then how can there be interest on the interest ? Do you just pretend the interest was retained for that calculation ? StuRat 01:19, 13 April 2007 (UTC)


 * The citizen has to forfeit the interest. However, the interest is still kept in the bank account until such time as the Tyrant chooses to withdraw it. The citizen cannot withdraw the interest from the account; the citizen can only withdraw the Citizen's Share, if the citizen chooses to do so. 202.168.50.40 01:39, 13 April 2007 (UTC)


 * In that case, if the interest on the interest is calculated normally, the total should be 108.16 at year 2, not 107.84, because the 4% of 4%, which is 0.16, is added to the total, not subtracted from it. StuRat 01:51, 13 April 2007 (UTC)


 * Furthermore, the interest on interest earned is also deducted from the Citizen's Share as "account keeping fees". Because i * i = -1 202.168.50.40 03:35, 13 April 2007 (UTC)


 * Wait, you're saying they have an imaginary number for a value? --Wirbelwind ヴィルヴェルヴィント (talk) 05:32, 13 April 2007 (UTC)


 * If a negative amount is deducted, isn't that the same as adding a positive amount? --Lambiam Talk  08:37, 13 April 2007 (UTC)

Unfamiliar Curve
Three points on the plane: A, B, and C. A and B are fixed, and C can be anywhere that satisfies (AC) = k*(BC) with k a constant. That is, the distances between A and C and B and C are proportional. Does anyone know what this curve is called, or anything about it? The best I can come up with is that it looks ellipse-ish, except in the rare case that k=1 and it's a line. Black Carrot 06:40, 12 April 2007 (UTC)


 * When k=1 it's a line. Otherwise, it's a circle. See Circle. --Spoon! 08:39, 12 April 2007 (UTC)


 * The circles for different values of k form one half of the family of Apollonian circles for the points A and B. Gandalf61 09:47, 12 April 2007 (UTC)


 * ... and these circles are precisely the circles with respect to which A and B are inverses of each other - see this page. Gandalf61 14:44, 12 April 2007 (UTC)
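The circle claim is easy to verify numerically (a sketch, with arbitrary choices of A, B, and k; the centre and radius formulas follow from expanding |PA|² = k²|PB|² with A at the origin and B on the x-axis):

```python
import math

# Sketch: check that {P : |PA| = k * |PB|} is the Apollonius circle.
A, B, k = (0.0, 0.0), (1.0, 0.0), 2.0    # arbitrary test values

# With A at the origin and B at distance d on the x-axis, the locus is a
# circle with centre (k^2 d / (k^2 - 1), 0) and radius k d / |k^2 - 1|.
d = B[0]
cx = k**2 * d / (k**2 - 1)
r = k * d / abs(k**2 - 1)

for t in range(12):                       # walk around the claimed circle
    ang = 2 * math.pi * t / 12
    P = (cx + r * math.cos(ang), r * math.sin(ang))
    PA = math.hypot(P[0] - A[0], P[1] - A[1])
    PB = math.hypot(P[0] - B[0], P[1] - B[1])
    assert abs(PA - k * PB) < 1e-9        # distance ratio is k everywhere
```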

Statistical Significance in an Experiment with 3 Possible Results
There are two players A and B playing a game. There are three possible outcomes: A wins, B wins, or the game is a draw. A and B play n games; A wins a times, B wins b times, and there are d = n - a - b draws. The respective probabilities are x for A winning, y for B winning, and z for a draw; they are the same for all games. I want to prove that A is a better player than B, that is, that x is higher than y.

My knowledge of statistics is a little bit odd. I hope that someone on this board can tell me "that is a common problem, just use a so and so test", but so far all tests I found assume an experiment with only two possible results or are concerned with some expected sum or assume normal distributions and whatnot.

My idea so far has been to use the difference of wins (a - b) as statistic. I assume the null hypothesis (y >= x) and want to find a number c so that the probability for (a - b) > c is smaller than 0.1%. If I assume that x = y = 50% a bound for c can simply be computed using a binomial distribution. I believe that all other combinations of x and y yield smaller c values, but I cannot prove it formally.

I have transcribed the problem to look like a high school textbook version, but in reality I am working on a problem in computer science. I am trying to show that a modification I made to an algorithm improves its performance over the original version in the average case. It seemed so simple. ^^ —Preceding unsigned comment added by 84.187.46.139 (talk • contribs)


 * I think you can just discard the draws. The probability that A wins a match that does not end in a draw is p=x/(x+y). Your null hypothesis is that y>=x - in other words, p<=0.5. Your alternate hypothesis is that x>y - in other words, p>0.5. Your statistic is (number of times A wins)/(number of times A or B wins), so a/(a+b). This statistic will have a binomial distribution (assuming results of trials are independent). If you can control the number of trials, you should do as many trials as is feasible. A large number of trials will allow you to approximate the binomial distribution by a normal distribution, and will also decrease the standard deviation of your statistic, and so increase your confidence level. Gandalf61 13:01, 12 April 2007 (UTC)


 * That sounds good. It is just a computer program running on random input data and I can run it as long as I like. I could run it until it produces n non-draw results. All runs are independent, so discarding the draws will not bias my results, and I will have a sample of size n distributed according to x/(x+y) and y/(x+y). The idea to run it n times and just discard the draws seems questionable to me; I cannot immediately see that it must be correct, although it sounds sensible. I need a certain number of non-draw runs to have a chance at getting significant results anyway, so this is no problem. Thank you for the help. —The preceding unsigned comment was added by 84.187.2.57 (talk) 15:09, 12 April 2007 (UTC).
 * I think it would be correct (that is, valid) to discard draws, but you may be able to do better (that is, have greater sensitivity) by including them in a multinomial maximum likelihood context. If it indeed doesn't matter, you should see all terms relating to draws being the same in both likelihoods, and they would cancel. If not, you know you've done better.  Baccyak4H (Yak!) 15:25, 12 April 2007 (UTC)
 * Update: I have done the math, and is ignoring draws perfectly OK? The answer is...try it for yourself, it is good fun!  Baccyak4H (Yak!) 15:58, 12 April 2007 (UTC)
 * Sensitivity is not *really* a problem here. I just made a quick calculation using Chernoff bounds based on the results of the old experiment, and the significance level would have been 0.0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000967%. —The preceding unsigned comment was added by 84.187.2.57 (talk) 16:26, 12 April 2007 (UTC).
 * I would love to do your math, but I don't know how. I see that my sample is multinomially distributed, and I know that a/n is a maximum likelihood estimate for x and b/n is a maximum likelihood estimate for y, but I really have no idea what you want to tell me. My knowledge of statistics is sketchy at best. —The preceding unsigned comment was added by 84.187.2.57 (talk) 16:34, 12 April 2007 (UTC).

It's only OK to ignore draws when trying to determine which player is best, as long as you don't want a margin of error calculation or to figure out how much better one player is than another. Consider the case of player A having 1000 wins and B having 100 wins. We could reasonably say that A is a better player. However, if there were no draws that makes A much better than B, while if there were 10,000 draws, A is just a little bit better. A fairer way to compare the cases might be to assign one point to each player for a draw and two for a win. So, in the case of no draws, we would have numbers that favored A by 2000 to 200. In the case of the 10,000 draws, however, A would only be favored by 12,000 to 10,200. StuRat 16:37, 12 April 2007 (UTC)


 * This blog post, When is a "lead" really a lead?, gives a graph illustrating the effect of the proportion of draws (actually non-draws) on the margin of error. They also give a reference for the formula used, which is $$\mathrm{var} (p_A - p_B) = \frac{p_A + p_B - (p_A - p_B)^2}{n - 1}$$, where n is the number of games, and $$p_A$$ and $$p_B$$ are the proportions of games won by players A and B respectively. -- Avenue 22:30, 12 April 2007 (UTC)
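The quoted variance formula can be sanity-checked by simulation (a sketch; the win probabilities below are arbitrary, and the true x and y stand in for the estimated proportions p_A and p_B):

```python
import random

random.seed(1)

# Monte Carlo check of the formula from the linked post:
#   var(p_A - p_B) = (p_A + p_B - (p_A - p_B)^2) / (n - 1)
# x, y are assumed true win probabilities; the remainder are draws.
x, y, n, trials = 0.4, 0.3, 150, 4000

diffs = []
for _ in range(trials):
    a = b = 0
    for _ in range(n):               # simulate one series of n games
        u = random.random()
        if u < x:
            a += 1
        elif u < x + y:
            b += 1
    diffs.append((a - b) / n)        # observed p_A - p_B for this series

mean = sum(diffs) / trials
mc_var = sum((d - mean) ** 2 for d in diffs) / (trials - 1)
formula = (x + y - (x - y) ** 2) / (n - 1)
# mc_var and formula agree to within Monte Carlo noise.
```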


 * The question of how much better the algorithm is varies a lot with the population from which I draw the random instances. I would have to individually adapt the statistical test for each population to accommodate that, or use a rather small and meaningless difference that is met by all populations. I will probably just prove that the algorithm is better for every single population and print the curve to give an idea of by how much. —The preceding unsigned comment was added by 84.187.2.57 (talk • contribs).
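Gandalf61's suggestion above (discard the draws, then test against p = 0.5) can be written as an exact one-sided binomial test (a sketch; the win counts below are hypothetical):

```python
from math import comb

def binomial_tail(a, m, p=0.5):
    """P(X >= a) for X ~ Binomial(m, p): the one-sided p-value for
    observing a wins for A out of m decisive (non-draw) games, under
    the null hypothesis that A is no better than B (p <= 0.5)."""
    return sum(comb(m, j) * p**j * (1 - p)**(m - j) for j in range(a, m + 1))

# Hypothetical counts after discarding draws: A won 70, B won 30.
a, b = 70, 30
p_value = binomial_tail(a, a + b)
# Reject the null hypothesis at the 0.1% level if p_value < 0.001.
```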

Probability?
Hi. Ok, I'm not a math guy at all and I believe this is a fairly dumb question but I need help understanding why. So: Assuming 4 billion people and a given time frame, say 24 hours, will something that has a 1 in 1 million chance of happening occur to 4000 of these people (in theory)? I realize there might be problems with my hypothetical set up and all, but maybe you can see what I'm getting at, kind of? maybe?-- 38.112.225.84 10:52, 12 April 2007 (UTC)
 * You're correct, you divide the number of total people by the chance of the thing happening. This gives you the expected number of people that it will happen to. (so long as you're considering equal time-frames). It could still happen to half of them, or none of them, but the expected value will be what you calculate with this method. Capuchin 11:26, 12 April 2007 (UTC)
 * Ok, I think I see what you're saying. But I couldn't necessarily say that by definition something 1 in a million will happen to 4000 of them?-- 38.112.225.84 12:41, 12 April 2007 (UTC)
 * Definitely not. And also, if the event in question has a 1 in 1 million chance of happening at some time in a person's life, or in some long time period, it wouldn't necessarily happen to any of the 4,000 in the course of a single 24-hour period.  JackofOz 12:43, 12 April 2007 (UTC)
 * Ok, I think I'm on board so far. I did mean it to be given that the odds of the event occurring were for that single day and not a lifetime, but thanks for the supplement. Follow-up: Given my initial parameters but working backwards, if a particular event happens to exactly 4000 out of 4 billion people in a 24 hour period, by definition did that event have a 1 in a million chance of occurring (within that period)? It seems that somehow that is wrong; I'm not trying to be deliberately obtuse, but I guess I'm sort of still confused about the causality of probability here...?-- 38.112.225.84 12:41, 12 April 2007 (UTC)
 * I see where you're confused. If you split your 4 billion people up randomly into groups of 1 million, how many people in any particular group would have this event happen to them? There's no way to tell; the groups are set randomly. All you know is that on average, 1 in every group will have the event happen to them. If you take any single group, you can guess that 1 will have it happen to them, but you might have RANDOMLY picked every single person that the event happens to and put them in a single group. This is very unlikely, but very possible: 4000 of them would be in one group, and none in the other groups. Probability is not deterministic; it gives you figures for LARGE and RANDOM populations. If you took each of your 4 billion people one by one, and had to guess whether this event would happen to them (yes or no), then you couldn't, because it would be completely random; they could be all at the end, they could be all at the beginning. You would expect one of them in every million that you see, but if you're seeing them in a random order, you have no way to predict. Does this make it any clearer? I think I've babbled a bit :) 213.48.15.234 15:00, 12 April 2007 (UTC)


 * Another consideration is the independence of the events. For example, let's say that hypothetically you have a 1 in 1 billion chance of being killed in a given month by a giant asteroid hitting the planet.  The deaths related to that event are not independent of each other, so the odds are high that if one person died due to a giant asteroid then lots and lots of other people died too.  So even though in this hypothetical example you can say that 4 people die on average per month due to a giant asteroid strike, in fact the number per month is usually zero and on very, very rare occasion in the thousands or millions or billions (depending on the size of the giant asteroid). Dugwiki 15:26, 12 April 2007 (UTC)


 * Wow, I was reading through the responses and was planning on adding a comment on dependent events, and even using the same meteor example, until I saw you had it covered ! Apparently we think alike.  Unfortunately for you, this may mean you should seek psychiatric help immediately. :-) StuRat 16:41, 12 April 2007 (UTC)


 * To generalize the original question, you repeat an experiment with two possible outcomes N times. Let us call the two possible outcomes A and B. You observe the outcome A a total of k times. The outcome of each experiment is independent of the others, and the probability of outcome A being observed is the same value, p, for each experiment. What can we say about the relationship between these three quantities N, k and p?


 * In your original question N = 4·10⁹ and p = 10⁻⁶. A is the unspecified "something" happening to one of N individuals. It is a very lazy experimental situation: all you have to do is wait for 24 hours. In the follow-up question N = 4·10⁹ as before, and k = 4000.


 * Now take another experiment: you flip a coin once, and observe heads. Calling heads A, this is another instance of the same with N = 1 and k = 1. Can we now say that p = k/N = 1? In other words, is the probability of heads equal to 1 (= 100%)? Is heads almost surely the outcome? Of course not; p is supposed to be the value that k/N will approach as the experiment is repeated arbitrarily often. Here is a typical example of what you might observe if you flip a fair coin until your arm gets lame: N = 1, k = 1; N = 10, k = 4; N = 100, k = 50; N = 1000, k = 504; N = 10000, k = 5054; N = 100000, k = 49977; N = 1000000, k = 499433. What we can do is say something about how likely it is that k/N deviates from p by a certain amount. If neither the product pN nor (1–p)N is very small, then most of the time (that is, on average more than half the time) the difference will be less than √(p(1−p)/N), and very rarely more than three times that amount. For the fair coin above with N = 1000000, k = 499433, and p = 1/2, the difference between p and k/N is 0.000567, while √(p(1−p)/N) = 0.0005. See also our articles Binomial distribution and Central limit theorem. --Lambiam Talk  15:50, 12 April 2007 (UTC)


 * Note to the original questioner - don't confuse probability with causality ! There are various ways of interpreting probabilities, but perhaps the simplest is the frequency probability interpretation. In this model, assigning a probability of "1 in a million per day" to an event means that a large group of people have been observed over a long period of time and this event has been seen to occur in one millionth of the "person-days" that have been observed. Then, as long as you are sure that the events are independent (as Dugwiki has pointed out above), if you observe 4 billion people for 1 day, you expect the event to happen to about 4,000 of them - and we can give a quantitative indication of what "about" means, as Lambiam showed above. However, this 4,000 value is not an exact prediction, it is an expected value based on a huge amount of "previous experience" that is condensed into the "one in a million" probability figure. Gandalf61 16:15, 12 April 2007 (UTC)

Ah! What you are talking about is the Poisson distribution, by that French dude.


 * $$Pr(k,\lambda)=\frac{e^{-\lambda} \lambda^k}{k!},\,\!$$

Where $$\lambda = 4000 \,$$

and k is the number of people who have that rare event occurring to them. 202.168.50.40 22:54, 12 April 2007 (UTC)
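These Poisson probabilities are easy to evaluate in log space (a sketch; the ±2σ window below is just an illustration of how tightly the count concentrates around 4000):

```python
from math import exp, lgamma, log, sqrt

def poisson_pmf(k, lam):
    # Pr(k, lambda) = e^(-lambda) * lambda^k / k!, computed via logs so
    # that lambda^k and k! do not overflow for large k.
    return exp(-lam + k * log(lam) - lgamma(k + 1))

lam = 4000.0                 # expected count: 4e9 people * 1e-6 chance
sigma = sqrt(lam)            # standard deviation of a Poisson count

p_exact = poisson_pmf(4000, lam)               # exactly 4000: about 0.6%
p_window = sum(poisson_pmf(k, lam)             # within 2 sigma of the mean:
               for k in range(int(lam - 2 * sigma),
                              int(lam + 2 * sigma) + 1))  # roughly 95%
```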

Awesome, thank you all for the links, examples and explanations. I definitely see how I was confusing probability with causality in this case; led astray by my desire to believe that something 1 in a million happened to 4000 people daily. Obviously if an asteroid had calculated odds to hit the Earth in a set time period but didn't, you wouldn't then necessarily say its chance of hitting was actually zero, or 100% had it hit. 24.19.234.96 01:26, 13 April 2007 (UTC)

Whahaha? (horses and goats)
In the county fair there were 21 goats shown. This was 7 less than 4 times the number of horses shown. How many horses were shown? What the heck is this supposed to mean?

My solutions:

$$H= 21 - 7 / 4$$

$$H= 3.5$$

-

$$H= 21 / 4 - 7$$

$$H= -1.75$$

3.5 horses?

−1.75 horses?

Please help because all answers I get don't make sense. —The preceding unsigned comment was added by 70.233.146.249 (talk) 23:50, 12 April 2007 (UTC).


 * G + 7 = 4H. You know the value of G.  Take it from there...  --LarryMac 23:54, 12 April 2007 (UTC)


 * Thanks a bunch, the answer makes a lot more sense. —The preceding unsigned comment was added by 70.233.146.249 (talk) 00:18, 13 April 2007 (UTC).

A literal transcription from English to "math" might be

G = (−7) + 4H

G = 4H − 7

Sometimes breaking things down literally helps. Root4(one) 02:50, 13 April 2007 (UTC)
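Following LarryMac's equation G + 7 = 4H, the check takes two lines (a sketch):

```python
G = 21              # goats shown
H = (G + 7) / 4     # "21 was 7 less than 4 times the horses": G = 4H - 7
assert H == 7       # a whole number of horses, as it should be
```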