Wikipedia:Reference desk/Archives/Mathematics/2012 March 4

= March 4 =

Harmonic mean of exponential variables
Is there a closed form for the harmonic mean of iid standard exponential variables? That is,
 * $$a_n = \mathbb{E}\left[\frac{n}{\sum_i\frac1{X_i}}\right] = n\int\frac{\exp\left(-\sum_ix_i\right)}{\sum_i\frac1{x_i}}\ dx_1\cdots dx_n$$

We can find that $$a_1 = 1,\ a_2=\frac23$$ and $$\lim_{n\to\infty}a_n=0$$. But I don't see an obvious way to evaluate it for larger n, and even for $$n=3$$ Mathematica chokes on trying to solve it either symbolically or numerically. It returns a numerical value of 0.54145701, which might not be correct to this precision, but this has no results in Plouffe's inverter. -- Meni Rosenfeld (talk) 10:39, 4 March 2012 (UTC)
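The small-n values quoted above can be sanity-checked by simulation. What follows is only a minimal Monte Carlo sketch, not a closed form and not the poster's method. The name `estimate_a_n`, the sample count and the seed are assumptions chosen for illustration. The estimate carries sampling error of order 1/sqrt(samples), so it cannot confirm the later digits of the n = 3 value.

```python
import random


def estimate_a_n(n, samples=200_000, seed=0):
    """Monte Carlo estimate of E[n / sum(1/X_i)] for iid Exp(1) variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # expovariate(1.0) draws a standard exponential variable.
        xs = [rng.expovariate(1.0) for _ in range(n)]
        if min(xs) > 0.0:  # a zero draw would make the harmonic mean 0
            total += n / sum(1.0 / x for x in xs)
    return total / samples


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, estimate_a_n(n))
```

The n = 1 and n = 2 estimates should land close to 1 and 2/3, which gives some confidence in the setup before trusting it for larger n.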

Grouping in statistics
If you make a group of "crack addicts" + "academic philosophers" (just an example) and compare it to the average of the population, you'll just find that the first is top on many things like higher criminality, high unemployment and such. However, you do not end up with a meaningful group. You could have added the crack addicts to any group of people and screwed their average. Anyway, intuitively the whole thing is clearly wrong to me, but is there a name to this kind of mistake, when you mix up things like that? XPPaul (talk) 13:53, 4 March 2012 (UTC)


 * The addition of the crack group would be an example of a confounding variable that biases the studied results, see the Confounding article for details. --Modocc (talk) 20:55, 4 March 2012 (UTC)

weird result
let's put the pseudocode right up and center:

Pseudocode

Busts = 0
Repeat the following forever:
(
    Bankroll = 300
    timesthrown = 0
    while Bankroll > 0:
    (
        timesthrown += +1
        throw 5-sided die, cases:
            1 bankroll += +whatever (note)
            2 bankroll += +whatever (note)
            3 bankroll += +whatever (note)
            4 bankroll += +whatever (note)
            5 bankroll += -25
        (Note: search/replace 'whatever' with a number such that the
         mathematically expected value or average effect of the whole
         throw long-term is/approaches 'bankroll += +0.1' - the first
         four cases should not differ, i.e. 'whatever' has one value.)
        print "Busts:" Busts, " Bankroll:" bankroll, " (thrown " timesthrown " since given current bankroll)", newline
    )
    Busts += +1
)
(End)
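For anyone wanting to repeat the experiment, the pseudocode above could be made runnable along these lines. This is a sketch under stated assumptions, not the original script: `WIN = 6.375` is the payout that a later reply in this thread derives for an expected value of +0.1, the `safe_level` of 10,000 stands in for the manual "kill it when it looks safe" step, and all function names are hypothetical.

```python
import random

LOSS = 25.0
WIN = 6.375  # assumed value of 'whatever': 0.8 * 6.375 - 0.2 * 25 = 0.1


def play_until_bust_or_safe(bankroll, rng, safe_level=10_000.0,
                            max_throws=10_000_000):
    """Play one bankroll; return (busted, throws, final_bankroll)."""
    throws = 0
    while 0 < bankroll < safe_level and throws < max_throws:
        throws += 1
        # Five equally likely faces: face 5 loses, faces 1 to 4 win.
        if rng.randint(1, 5) == 5:
            bankroll -= LOSS
        else:
            bankroll += WIN
    return bankroll <= 0, throws, bankroll


def busts_before_survivor(start=300.0, seed=None):
    """Count how many fresh bankrolls bust before one stops without busting."""
    rng = random.Random(seed)  # seeded once, outside every loop
    busts = 0
    while True:
        busted, throws, final = play_until_bust_or_safe(start, rng)
        if not busted:
            return busts, throws, final
        busts += 1


if __name__ == "__main__":
    for run in range(5):
        print(busts_before_survivor())
```

Running it several times and comparing the bust counts with those reported in the post is one way to tell whether the long streaks come from the game itself or from the original script.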

I don't want to include it so as not to lead you to any of the same mistakes that I may have made in my methodology, but I made a script that does the following:
- Start with a 'bankroll'.
- Throw a five-sided die. One case is negative, and loses 25 from the bankroll. The other cases are positive, such that the expected value of the roll is in fact +0.1. So as not to affect you, I won't mention what I calculated the other four to reward. The rule is, make the whole bet have an expected value of +0.1. (In other words, have the same expected value as if each result paid off 0.1 equally. However, get this expected value with one of the results deducting 25 and the remaining four paying off enough to compensate to this expected value.)
- Now here's the bizarre thing. I gave it what I thought was a reasonable bankroll, around 400. That's enough for 16 losses, and I would have thought the +0.1 would be a strong enough positive expectation that you would climb out before swings made you go bust. But it did go bust. So I ran it again. And again it went bust. So I ran it again. Again, bust.
- So I put a loop around it, that counted how many busts it had gotten to, resetting each time, while printing the current amount in its bankroll. Since it printed a line after each toss, I quickly saw my window scroll to the number of busts that it had taken, and a currently safe and slowly increasing bankroll. It was an ENORMOUS number of busts, like 27 or 80, all in a row, before it got to the current iteration that didn't bust yet. (After the current iteration got high enough, I stopped the program, assuming it wouldn't bust anymore, that it had climbed out of the danger zone due to variance. This is obviously true around 10k, if you watch the bankroll increasing.)
- I ran this whole thing again, and again it quickly got to 27 or 35 or whatever number of busts before the current one that it could continue on.
- I ran it again and again, and every time, it busted an ENORMOUS number of times in a ROW, like twenty to eighty, every time. A low number of busts was a fluke.

All right then. So I decided the starting bankroll was too low. I tried with a higher one, one I thought was obviously incredibly high given the high expected value. Still I got dozens of busts in a row before it was on safe ground and could keep playing indefinitely.

I kept increasing and increasing the amount, but without mentioning how much I had to give it, it was a very large amount before I could run the program ten times and have it stay at 0 busts and climb to safe territory each time.

So, here is what I'm looking for. I would like someone to repeat my methodology, and get back to me with what they find to be the safe level of initial bankroll, such that running the meta program 20 times, it busts 0 times in all twenty cases (or maybe 1 time in one of the cases). Like: 0 busts (scrolling current bankroll). kill it and again: 0 busts (scrolling current bankroll). Kill it and again: 0 busts (scrolling current bankroll). Kill it and again: and so on, and if you do this twenty times, you get usually 0 busts and maybe the occasional one bust.

This is as opposed to what I thought was a reasonable starting bankroll, where taking an upward weighted random walk with the conditions I described, what I saw was: 27 busts (scrolling current bankroll) kill it and 37 busts (scrolling current bankroll) kill it and 42 busts (scrolling current bankroll) etc

These busts are crazy. It means that starting with a bankroll that's enough to cover a huge number of losses, playing this heavily winning game will bust you 40 times in a row!!! (if at each iteration you refill the whole of the already very high bankroll, and just keep playing until you're busted again or on safe territory).

I'm not revealing my specific numbers here but would like someone to try to do the same thing, and see if you get this same result! If you could even explain this weird thing, that would be amazing. Thank you.

I'm just trying to be very readable above. I would like to see if you get the same results I do with my bankroll, and what results (number of busts before the bankroll is on solid footing) you get with different bankrolls, or the amount of bankroll you need not to risk so many busts in a row, but have some certainty of making it out with 0 busts. Thanks.

You will have to manually terminate this program whenever it has built a bankroll that is bust-proof in your opinion, this way you can watch how it increments, let it keep going until you're quite sure you see what's happening.

Please note that I don't have much of a background in statistics, but do vaguely get standard deviations and sigmas and confidence levels, so if you want to give me an answer in these terms it may be OK. But I am more interested in the pragmatic terms as described in the methodology above... Also an intuitive answer for "why" this happens. If a robot kept playing the above game for me for a year and then I got the results, but I had to stake the robot and couldn't refill the robot if it got busted, how much should I give it to have different levels of confidence that it wouldn't lose my money? (i.e. half a chance of losing my stake and getting nothing, 20%, 10%, etc.) Obviously if the robot normally busts 30-80 times in a row, that's near certainty that I will lose my money despite the positive expectation of each throw, given that the robot just keeps playing.  So, I'd like to know what certainty I'd have of getting money at the end of the year instead of losing my stake for nothin', at different stake levels.  (Or, alternatively, the required stake level to have different percent confidences that the robot won't squander it by random-walking down to zero.) Thanks.  --80.99.254.208 (talk) 18:59, 4 March 2012 (UTC)


 * I do realize that over an "infinite" number of throws everyone busts (even an upwardly weighted random walk, however highly rated, will cross the zero from any initial value given infinite time). That's why I specified one year, this is a pragmatic question (hence the script) rather than a philosophical one. The timesthrown should be up to on the order of hundreds of thousands or millions or tens of millions or billions - not infinite. I think we can all agree that if the robot has made a billion dollars since it got its bankroll, it won't bust anymore, despite the theoretical possibility and even certainty - probability 1 - over infinite time.  My questions are, by contrast, extremely practical.


 * You should consider a simpler scenario. For example, have only two equally likely outcomes one which is +$25.2 and one that is -$25.  After 1000 throws, such a system busts if you lose 510 times or more.  That's actually quite likely.  Or similarly, 5028 losses in 10000 throws.  With a positive expectation of only $0.1 versus a value per throw of about +/-25, you don't grow away from zero very quickly, so random fluctuations that are only slightly more tilted to the loss side can easily do you in.  Dragons flight (talk) 20:15, 4 March 2012 (UTC)
 * I don't think that's equivalent at all, because 4 out of 5 outcomes pay off! It can't possibly be the same effect there as when one out of two outcomes pay off... --80.99.254.208 (talk) 20:25, 4 March 2012 (UTC)
 * Fine, 20% of the time you lose $25, and 80% of the time you gain $6.375 (net expectation +$0.1). Then in a game of 1000 throws, you bust if you lose 216 times or more (21.6% on an expected loss rate of 20%).  Similarly, with 10000 throws, you would bust if you lose 2045 times or more.  Losing 2045 times, when you expect to lose 2000 times on average, is not so unlikely as to be rare.  Dragons flight (talk) 21:21, 4 March 2012 (UTC)
 * In the simple 80 / 20 game, the odds of busting before you get to 10000 throws is roughly 54% even though the expectation is positive and you started with $400. The actual rate of busting in your more complicated game would be influenced to some degree by the size of the prizes in the four categories.  For example if the outcomes were 20%: +$25.5, 20%: -$25, and 60%: $0, then the expectation is still +$0.1, but the rate of busting during 10000 throws grows to 65%.  Dragons flight (talk) 21:39, 4 March 2012 (UTC)
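The loss-count thresholds and bust rates discussed in the replies above can be checked with a short sketch. It assumes, as the replies do, a $400 starting bankroll and a $25 loss. `min_losses_to_bust` and `bust_rate` are illustrative names, and the simulated rates are Monte Carlo estimates that will move by a few percentage points with the trial count and seed.

```python
import math
import random


def min_losses_to_bust(throws, win, loss=25.0, bankroll=400.0):
    """Smallest loss count leaving the bankroll at or below zero after `throws`.

    Only the final balance is considered, not the path taken to reach it.
    """
    # bankroll + win * (throws - L) - loss * L <= 0, solved for L
    return math.ceil((bankroll + win * throws) / (win + loss))


def bust_rate(outcomes, bankroll=400.0, throws=10_000, trials=2_000, seed=0):
    """Fraction of simulated games that hit zero within `throws` throws.

    `outcomes` is a list of (probability, payoff) pairs.
    """
    rng = random.Random(seed)
    payoffs = [pay for _, pay in outcomes]
    weights = [prob for prob, _ in outcomes]
    busted = 0
    for _ in range(trials):
        money = bankroll
        for pay in rng.choices(payoffs, weights=weights, k=throws):
            money += pay
            if money <= 0:  # bust as soon as the bankroll is gone
                busted += 1
                break
    return busted / trials


if __name__ == "__main__":
    print(min_losses_to_bust(1_000, 25.2), min_losses_to_bust(10_000, 25.2))
    print(min_losses_to_bust(1_000, 6.375), min_losses_to_bust(10_000, 6.375))
    print(bust_rate([(0.8, 6.375), (0.2, -25.0)], trials=500))
    print(bust_rate([(0.2, 25.5), (0.2, -25.0), (0.6, 0.0)], trials=500))
```

Unlike the threshold arithmetic, the simulation stops a game the moment the bankroll reaches zero, which is why its bust rate is higher than the chance of merely finishing below zero.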


 * (ec) Not an answer, but for a discussion of more extreme behaviour of this kind see St. Petersburg paradox. — Quondum☏✎ 20:18, 4 March 2012 (UTC)

Am I missing something here? If I had a 54% chance of busting, I wouldn't bust SIXTY TIMES IN A ROW before I made it out to where I had a vanishingly small chance of busting. Can someone repeat my experiment? --80.99.254.208 (talk) 08:57, 5 March 2012 (UTC)
 * I'm afraid I haven't worked through the actual mechanics of what's going on, but you should be aware that many systems have a libc with a terrible implementation of rand() that is prone to large non-random effects, depending on how you slice the result. Many implementations of interpreted languages simply delegate to libc's rand(), and inherit its behavior. What language (and implementation, and random number library, if applicable) are you using? Often, there's a better pseudorandom number generator available, typically based on the Mersenne twister. Paul (Stansifer) 15:11, 5 March 2012 (UTC)
 * Looking at the actual problem under discussion, though, I think you may be underestimating how easy it is to cover distance in a 1D random walk. According to the article, the expected distance covered is proportional to the square root of the time taken. So if the number of iterations is large compared to the square of the starting advantage, you should expect to bust reliably. How many iterations are you talking about? It's not possible to start computing anything without knowing that. Paul (Stansifer) 20:16, 5 March 2012 (UTC)
 * Note, this was sorta cross posted at WP:C. Shadowjams (talk) 21:48, 5 March 2012 (UTC)
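The square-root point above can be made concrete by comparing the expected gain after n throws with the typical size of the random swings. This is a rough sketch. It assumes the 80/20 game with a $6.375 win and a $25 loss from the earlier replies, and `drift_and_spread` is an illustrative name.

```python
import math


def drift_and_spread(outcomes, throws):
    """Return (expected gain, standard deviation) of the total after `throws`."""
    mean = sum(p * x for p, x in outcomes)
    var = sum(p * x * x for p, x in outcomes) - mean ** 2
    # The mean grows linearly in the throw count, the spread only as its root.
    return mean * throws, math.sqrt(var * throws)


GAME = [(0.8, 6.375), (0.2, -25.0)]  # assumed payoffs giving +0.1 per throw

if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000, 1_000_000):
        gain, spread = drift_and_spread(GAME, n)
        print(f"{n:>9} throws: expected gain {gain:10.1f}, std dev {spread:10.1f}")
```

Under these assumed payoffs the per-throw standard deviation is 12.55 against a drift of 0.1, so the expected gain only overtakes one standard deviation after roughly 15,750 throws. Before that point a bankroll of a few hundred is well within reach of ordinary fluctuations.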


 * Just a guess, but... If you are using a language like C or C++, you probably used the system clock as a seed, which only has second precision, so if you call it 5 times in a second, all 5 times will give the same number. If so, you should seed your random generator outside of the loop. KyuubiSeal (talk) 20:03, 6 March 2012 (UTC)
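The seeding pitfall described above can be illustrated as follows. The sketch uses Python's `random` module as a stand-in for the C pattern of calling `srand(time(NULL))` inside the loop. The function names are hypothetical, and a fixed `seed_value` plays the role of a clock that has not ticked between calls.

```python
import random


def throws_reseeding_each_time(seed_value, count=5):
    """Reseed before every throw with the same value, as a coarse clock would."""
    results = []
    for _ in range(count):
        random.seed(seed_value)  # same seed gives the same generator state
        results.append(random.randint(1, 5))
    return results


def throws_seeded_once(seed_value, count=5):
    """Seed a single generator once, outside the loop."""
    rng = random.Random(seed_value)
    return [rng.randint(1, 5) for _ in range(count)]


if __name__ == "__main__":
    print(throws_reseeding_each_time(1234))  # five identical throws
    print(throws_seeded_once(1234))          # an ordinary sequence of throws
```

If the original script reseeds inside its loop, every throw within the same clock tick would come up the same, which could produce long artificial runs of losses.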

calculating the value of 'whatever'
is this correct:

0.2 * x + 0.2 * x + 0.2 * x + 0.2 * x + 0.2 * (-25) = 1.1

0.2x + 0.2x + 0.2x + 0.2x - 5 = 1.1

4 * 0.2x - 5 = 1.1

0.8x - 5 = 1.1

0.8x = 6.1

x = 6.1 / 0.8 = 7.625

Thus the value of whatever should be "7.625" and not six-something as calculated by another poster. Who is correct?--78.92.82.6 (talk) 11:10, 5 March 2012 (UTC)


 * another way to do it: four whatevers and minus 25 all divided by five equals 1.1, i.e: (4x - 25)/5=1.1 thus 4x-25=5.5 thus 4x=30.5 thus x=7.6250.... 78.92.82.6 (talk) 11:12, 5 March 2012 (UTC)


 * That's correct for an expected value of +$1.1; your earlier post specified +$0.1. Dragons flight (talk) 15:47, 5 March 2012 (UTC)
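The two candidate values of 'whatever' can be reproduced by solving the expected-value equation for either target. A minimal sketch, with `winning_payout` as an illustrative name and exact fractions used to avoid rounding noise:

```python
from fractions import Fraction


def winning_payout(target_ev, loss=25, sides=5):
    """Payout on each of the `sides - 1` winning faces for a given expected value."""
    # ((sides - 1) * x - loss) / sides = target_ev, solved for x
    ev = Fraction(str(target_ev))
    return (sides * ev + loss) / (sides - 1)


if __name__ == "__main__":
    print(float(winning_payout("0.1")))  # 6.375, for an expected value of +0.1
    print(float(winning_payout("1.1")))  # 7.625, for an expected value of +1.1
```

So 6.375 matches the +0.1 game described in the original question, while 7.625 corresponds to an expected value of +1.1 per throw.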