User talk:Gill110951/Archive 4

From MHP to TEP
I am sure that you have both (Richard and Nijdam) heard the statement that every statistic is the answer to a question; the problem is, which question? Once you have a clearly defined, well-posed question the answer is much easier. Whitaker's question is not such a question. Martin Hogbin (talk) 08:38, 8 July 2011 (UTC)
 * And 80% of statistics are just made up. Seriously, you hit the nail on the head, Martin. And MHP is *both* a popular brain-teaser *and* a popular vehicle for teaching conditional probability in the probability class, *and* a splendid vehicle for teaching game theory, too. It challenges our conceptions of probability and it challenges the "experts" in communicating with ordinary folk. And that's why it is so fascinating. Richard Gill (talk) 19:19, 8 July 2011 (UTC)
 * And that is why the MHP is logically separated into two parts: the first covers the popular brain teaser, without too much fretting about detail, and the second is the vehicle for teaching and learning, in which all the bases are covered. Martin Hogbin (talk) 08:48, 11 July 2011 (UTC)
 * Yes! Right now I am enjoying two envelopes problem where we have the same dual nature and hence the same problems (together with the problem of a possessive editor). And, enjoying the latest discoveries of Sasha Gnedin on MHP. Now this really is amazing, that some completely new mathematics could be done on this simple and old problem. It is really brilliant. Richard Gill (talk) 13:48, 11 July 2011 (UTC)
 * There are strong similarities between the two articles. I am trying to present the subject matter in a way that makes it accessible to most people. Of course, we want the mathematical detail as well.


 * Many people will only find the TEP interesting because it has some relation to reality. They imagine being perpetually perplexed on being presented with two envelopes. For these people anything that is impractical is purely theoretical. The problem could be presented to mathematicians in a purely abstract way, but most people would find such a problem uninteresting. Martin Hogbin (talk) 08:42, 13 July 2011 (UTC)


 * Most people *should* find TEP uninteresting! It is a conundrum about logic. It is only interesting if you are interested in logic and possibly also in semantics and mathematics. There is no vast popular literature on it; there is only a vast technical literature. And interestingly, there is almost no *secondary* or *tertiary* literature on TEP. It is almost all purely research articles, each one promoting the author's more or less original point of view, and each one criticising earlier "solutions". And so it goes on. The three papers which cite those two young US philosophy PhDs do so in order to criticize their solution and to propose an alternative. According to Wikipedia guidelines on reliable sources, the article on TEP should be very, very brief and just reproduce the comments in a couple of standard (undergraduate) textbooks on TEP, e.g. David Cox's remarks in his book on inference. I have no idea if there is a standard philosophy undergraduate text which mentions TEP. Our friend iNic hasn't mentioned one. We should take a look at other encyclopedia articles on TEP. I think I will write one for StatProb, and then Wikipedia editors can use it. Survey papers are admissible secondary sources for Wikipedia provided they do not promote the author's own research; they are a primary source for the latter. Richard Gill (talk) 16:14, 13 July 2011 (UTC)


 * Ordinary people won't be perplexed. They know by symmetry that switching is OK but a waste of time (if you don't open your envelope). They don't really understand probability calculations anyway, so they know there is something wrong with the argument, but don't care what. Regarding Smullyan's version, they also know the answer (it doesn't matter whether you switch or not) so they know which argument of Smullyan's is correct. As writer after writer has stated, the problem of both original TEP and of TEP without probability is using the same symbol (original) or the same words (Smullyan) to denote two different things. It's a stupid problem and has a simple resolution. Well, and if we are allowed to look in our envelope, then everything is different. But no longer very interesting for laypersons. It turns out to be a fact of life that there are "theoretical" situations (but I can simulate them on a computer for you, they are not that theoretical!) where according to conditional expectation value you should switch, whatever. OK, and this is just a fact about a random variable with infinite expectation value: if I give you X you'll always be disappointed compared to its expectation. But to get its expectation I would have to give you the average of millions and millions of copies of X. Eventually the average will be so large that you'll prefer the swap whatever the value of the X I gave you first. Then there are all kinds of nice calculations about whether or not you should switch given a prior distribution of X and there are cute things about what to do if you don't want to trust a prior ... then you should randomize. It's all rather technical, isn't it. Only interesting for specialists. By the way this is *not* theoretical since I can approximate a distribution with infinite expectation with one with very very very large finite expectation. I can create approximately the same paradox without actually using infinite values. Syverson does that. 
It's very technical and hardly interesting for ordinary folk. Richard Gill (talk) 16:26, 13 July 2011 (UTC)
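The "you'll be disappointed compared to the expectation" effect Richard describes is easy to see in a simulation. Below is a minimal sketch (my own illustration, not from the discussion above) using a St. Petersburg-style distribution, P(X = 2^k) = 2^(-k), which has infinite expectation: almost every draw falls well below the sample mean.

```python
import random

random.seed(0)

def st_petersburg():
    # X = 2^k with probability 2^(-k), k = 1, 2, ...
    # E(X) = sum over k of 2^k * 2^(-k) = infinity
    x = 2
    while random.random() < 0.5:
        x *= 2
    return x

samples = [st_petersburg() for _ in range(100_000)]
mean = sum(samples) / len(samples)
below = sum(s < mean for s in samples) / len(samples)
print(f"sample mean {mean:.1f}, fraction of draws below it {below:.3f}")
```

The sample mean keeps drifting upward as the sample grows, while the bulk of the draws stay small: the expectation is not representative of a typical value.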

Two Envelopes Problem - open letter to Schwitzgebel and Dever
Dear Eric Schwitzgebel and Josh Dever

I am very puzzled by the logic of your claimed resolution of the two envelopes problem, published in Sorites (2008).

You essentially claim to be the first to have exposed the flaw in the reasoning of the original TEP. For instance, others "have advocated constraints on the use of variables in the expectation formula that ... are considerably more restrictive than necessary". And so on.

Now as a mathematician (in fact a mathematical statistician and probabilist) it is clear to me that the *first* error in the reasoning in TEP is the claim that because our envelope is equally likely to contain the smaller or the higher amount, the same should also be true "whatever is in our envelope". Calling the amounts in the two envelopes A and B, and the smaller and larger amounts X and Y, we are told that Y=2X>0, and that (A,B)=(X,Y) or (Y,X) *independently* of X (and the two possibilities are equally likely). It does not follow logically from this, that given A=a, B is equally likely to be a/2 or 2a. Because otherwise logic and probability theory would be in conflict. But probability theory is built in (or on) mathematics, and mathematics is built in (or on) logic. And there are a hundred and one different ways to see that for any random variables X, Y, A and B satisfying Y=2X>0 and (X,Y)=(A,B) or (B,A), then it is impossible to have simultaneously "A=X or 2X with probability half, independently of X", and "B=2A or A/2 with probability half, independently of A".

For ordinary folk perhaps the easiest way to explain this is to mention that if there is a maximum, say m, to the (smaller) amount of money X, then if A=2m we know for sure that B=A/2! B isn't equally likely to be A/2 or 2A, whatever the value of A. Mathematical analysis just shows that boundedness of X isn't a necessary condition for the impossibility theorem.
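The bounded case can be checked by brute force. Here is a minimal sketch (the two-point distribution for X is my own toy choice, not from the letter) that computes P(B = 2a | A = a) exactly with rational arithmetic:

```python
from fractions import Fraction

# Toy example: the smaller amount X is 1 or 2, each with probability 1/2;
# envelope A holds X or 2X, each with probability 1/2, independently of X.
half = Fraction(1, 2)
joint = {}  # (a, b) -> probability
for x in (1, 2):
    for a, b in ((x, 2 * x), (2 * x, x)):
        joint[(a, b)] = joint.get((a, b), 0) + half * half

def p_other_is_double(a):
    # P(B = 2a | A = a)
    p_a = sum(p for (av, _), p in joint.items() if av == a)
    p_ab = joint.get((a, 2 * a), Fraction(0))
    return p_ab / p_a

print(p_other_is_double(2))  # 1/2 in the middle of the range...
print(p_other_is_double(4))  # ...but 0 at the top: B must be a/2
```

At the boundary value a = 4 = 2m the conditional probability of doubling is 0, exactly as the letter argues.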

You could say, as philosophers, that TEP is a problem in philosophy or logic, and that there is no obligation to try to translate probability-like reasoning into standard (post Kolmogorov) probability theory. A philosopher can call into question, whether the standard (elementary) probability theory of present-day mathematics is indeed a faithful representation of logically correct probability-like reasoning in common discourse. But as a mathematician I can answer that the very reason why modern probability theory was developed was to free us from logical paradoxes resulting from lack of clarity about what are the rules of probability. But more strongly, I can also object that your own "solution" of TEP relies on embedding the problem in classical probability theory! So you do not have recourse to the claim that you are solving TEP at a pre-formal-probability-theory level.

What you actually do in your paper is the following. You establish a little theorem (Section: "The Proof") of classical probability theory. Its conditions are satisfied by a number of TEP-like problems for which intuitive (pre-probability-theory) reasoning gives the right answer. Its conclusion is this "right answer". Its conditions are not satisfied by TEP itself (obviously, they could not be satisfied if your theorem is true, since for TEP, intuitive reasoning gives an answer which is wrong: we know in advance that though exchanging closed envelopes is harmless, it does not improve our position).

Your theorem is not an "if and only if" theorem. It contains sufficient conditions, not necessary conditions, for its conclusion. Hence the failure of its conditions to be satisfied for TEP is not a logical explanation of the failure of the (intuitive, wrong) conclusion for TEP!

On the other hand, since your "solution" depends on embedding TEP within classical probability theory, how can you object to the solutions which have been previously found by that embedding? It is not for nothing that in modern probability theory we are very careful to distinguish a random variable from possible values it can take. We take special care to define E(Y|X) in a slightly round-about way: first define E(Y|X=x) by computing the conditional expectation of Y according to the conditional probability distribution of Y given X=x (for which we only need to know the usual definition of conditional probability); secondly define E(Y|X) as the same function of the random variable X, as E(Y|X=x) is of the mathematical variable x.

Don't philosophers know about this? Do they have an objection to this? Apparently not, since you are implicitly defining conditional expectation in the same way!

From my point of view, the paradox of TEP comes from using the same symbol X to denote two different things: the random variable itself, and a possible value of the random variable. And it comes from confusing conditional and unconditional probabilities, something it is easy to do if your notation is already ambiguous.

I realise it sounds arrogant, but it seems clear to me that philosophers won't like to be told that TEP is only a problem for philosophers because most philosophers don't know probability theory. That is the main reason why they spend a lot of time looking for "the real reason" behind the TEP paradox but never succeed in finding it. They never will, because the real reason behind TEP is that the faulty reasoning contradicts the logic of probability theory.

Anyway, I will be delighted to hear your response to these comments.

We could consider converting an eventual email-exchange into a conversation on our blogs or websites. I think that could be useful to both mathematicians and philosophers who remain perplexed by the strange ways "the others" have of solving TEP. Would you be interested in pursuing that option?

Yours sincerely Richard Gill (talk) 18:39, 17 July 2011 (UTC)

http://www.math.leidenuniv.nl/~gill

Schwitzgebel and Dever's simple solution
These two authors also have a website containing what they call a simple version of their solution. In my opinion their simple solution is different from the version which they give in their paper. However, the simple solution is essentially correct. It could be made even simpler!

The paradoxical argument of TEP correctly computes

E(B) = 0.5 E(2A | X=A) + 0.5 E(A/2 | X=B) = E(A | X=A) + E(A | X=B)/4

but then proceeds to substitute E(A|X=A)=E(A) and E(A|X=B)=E(A).

This would be correct if A were independent of the event X=A, but no argument is given to justify that.

We would, on the contrary, expect E(A|X=A) < E(A) and E(A|X=B) > E(A): "you would expect less in Envelope A if you knew that it was the envelope with less than you would if you knew it was the envelope with more" (SD-simple).

But suppose for a moment we do have both these equalities. Now if E(A|X=A)=E(A) and E(A|X=B)=E(A) then we get E(B)=5E(A)/4, whereas by symmetry and by a simple calculation, E(B)=E(A)=3E(X)/2. But it is possible to have E(B)=E(A) and simultaneously E(B)=5E(A)/4! Namely, there is the unique solution E(A)=E(B)=E(X)=infinity.

More directly, and in any case, we have E(A|X=A)=E(X|X=A)=E(X) and E(A|X=B)=E(2X|X=B)=2E(X). Thus E(A|X=B)=2E(A|X=A).

So in general one can say: you would expect twice as much in Envelope A if you knew it was the envelope with more than you would if you knew it was the envelope with less!

Thus SD-simple's explanation of the *reason* for the paradox: "you would expect less in Envelope A if you knew that it was the envelope with less than you would if you knew it was the envelope with more" is completely correct. In fact, you would expect twice as much in Envelope A if you knew it was the envelope with more than if you knew it was the envelope with less. And simultaneously we have the converse, that the only way you could expect the same amount in both situations is when one (and hence all) expectations are infinite. Twice infinity is infinity; it is simultaneously more and the same.

In fact if you knew envelope A was the one with less then there's X in it. And whether X or 2X is in envelope A is independent of X. So you expect E(X) in it. Similarly, if you knew it was the one with more then there's 2X in it, and we expect E(2X)=2E(X). Which is more, unless E(X)=2E(X)=infinity.

In fact, as is well known in mathematics, given that X is independent of the event {X=A}, it is simply not possible for A also to be independent of the event {X=A}. But it is possible to get it to a very good approximation! And the closer this is to being true, the more the distribution of X is heavily skewed to the right; its expectation value gets larger and larger compared to where the bulk of the probability is. The expectation value gets less and less representative of typical values of X. Almost whatever you get, you'll be disappointed by getting X, if you compare it to E(X). Almost whatever is in envelope A, you'd be disappointed when you compare it with the expected value in envelope B. Richard Gill (talk) 15:06, 18 July 2011 (UTC)
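The identities E(A | A=X) = E(X) and E(A | A=2X) = 2E(X) can be verified exactly for any concrete distribution. A small sketch (the three-point distribution for X is an arbitrary choice of mine, purely for illustration):

```python
from fractions import Fraction

third = Fraction(1, 3)
half = Fraction(1, 2)
# X uniform on {1, 2, 4}; envelope A holds X or 2X, each with
# probability 1/2, independently of X.
outcomes = []  # (probability, amount in A, A-is-smaller flag)
for x in (1, 2, 4):
    outcomes.append((third * half, x, True))       # A = X
    outcomes.append((third * half, 2 * x, False))  # A = 2X

def cond_mean_A(is_smaller):
    # E(A | A is the smaller/larger envelope)
    tot = sum(p for p, a, s in outcomes if s == is_smaller)
    return sum(p * a for p, a, s in outcomes if s == is_smaller) / tot

e_x = Fraction(7, 3)  # E(X) = (1 + 2 + 4) / 3
print(cond_mean_A(True), e_x)       # E(A | A = X)  equals E(X)
print(cond_mean_A(False), 2 * e_x)  # E(A | A = 2X) equals 2 E(X)
```

The conditional expectation given "A is the envelope with more" is exactly twice the one given "A is the envelope with less", as the text says.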

TEP, Anna Karenina, and the Aliens movie franchise
The Anna_Karenina_principle: "Happy families are all alike; every unhappy family is unhappy in its own way." Here it's the same: a logically sound reasoning can be logically sound in only one way, but a reasoning which leads to a "paradox" can be wrong in many.

Schwitzgebel and Dever not only published a paper in the philosophy literature, but also have a little web page with a simple solution to TEP. Let me summarize their simple solution here. It's quite good. First let me repeat the original TEP paradox, which I'll call TEP-I:

TEP-I

 * 1) I denote by A the amount in my selected envelope.
 * 2) The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.
 * 3) The other envelope may contain either 2A or A/2.
 * 4) If A is the smaller amount the other envelope contains 2A.
 * 5) If A is the larger amount the other envelope contains A/2.
 * 6) Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2.
 * 7) So the expected value of the money in the other envelope is (1/2) 2A + (1/2)(A/2) = 5A/4.
 * 8) This is greater than A, so I gain on average by swapping.
 * 9) After the switch, I can denote that content by B and reason in exactly the same manner as above.
 * 10) I will conclude that the most rational thing to do is to swap back again.
 * 11) To be rational, I will thus end up swapping envelopes indefinitely.
 * 12) As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.

For a mathematician it helps to introduce some more notation. I'll refer to the envelopes as A and B, and the amounts in them as A and B. Let me introduce X to stand for the smaller of the two amounts and Y to stand for the larger. I think of all four as being random variables; but this includes the situation that we think of X and Y as being two fixed though unknown amounts of money. It is given that Y=2X>0 and that (A,B)=(X,Y) or (Y,X). The assumption that the envelopes are indistinguishable and closed at the outset translates into the probability theory as the assumption that the event {A=X} has probability 1/2, whatever the amount X; in other words, the random variable X and the event {A=X} are independent.

Schwitzgebel and Dever seem to assume that steps 6 and 7 together are intended to form a computation of E(B).

In that case, the mathematical rule about expectation values which is being used in step 7 is
 * E(B)=P(A=X)E(B|A=X)+P(B=X)E(B|B=X).

This means that we can get the average value of B by averaging over the two complementary situations that A is the larger of A and B and that it is the smaller, and then weighting those two so-called conditional averages according to the probabilities of the two situations. Now the two situations have equal probability 1/2, as mentioned in step 6, and those probabilities are substituted, correctly, in step 7. However, according to this interpretation, the two conditional expectations are screwed up. A correct computation of E(B|A=X) is the following: conditional on A=X, B is identical to 2X, so we have to compute E(2X|A=X)=2E(X|A=X). We are told that whether or not envelope A contains the smaller amount X is independent of the amounts X and 2X, so E(X|A=X)=E(X). Similarly we find E(B|B=X)=E(X|B=X)=E(X).

Thus the expected values of the amount of money in envelope B are 2E(X) and E(X) in the two situations that it contains the larger and the smaller amount. The overall average is (1/2)2E(X)+(1/2)E(X)=(3/2)E(X). Similarly this is the expected amount in envelope A.
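The conclusion E(A) = E(B) = (3/2)E(X) is easy to confirm by simulation. The sketch below (a four-point distribution for X, chosen arbitrarily by me) shows both envelopes averaging 1.5 E(X), while the paradoxical "5/4" relation between them fails:

```python
import random

random.seed(1)

n = 200_000
a_vals, b_vals = [], []
for _ in range(n):
    x = random.choice([1, 2, 4, 8])   # E(X) = 15/4 = 3.75
    if random.random() < 0.5:         # envelope A equally likely X or 2X
        a, b = x, 2 * x
    else:
        a, b = 2 * x, x
    a_vals.append(a)
    b_vals.append(b)

ea = sum(a_vals) / n
eb = sum(b_vals) / n
print(ea, eb)  # both close to 1.5 * 3.75 = 5.625, and E(B) != (5/4) E(A)
```

By symmetry the two sample means agree (up to Monte Carlo noise), so E(B) is nowhere near (5/4)E(A).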

Schwitzgebel and Dever's executive summary is "what has gone wrong is that the expected amount in the second envelope given it's the larger of the two is larger than the expected amount in the second envelope given it's the smaller of the two". Indeed. It's rather obviously twice as large!

This is perfectly correct, and very intuitive. But it's not the only thing that goes wrong, in that case. They take no note at all of the fact that the writer finishes with a solution expressed in terms of A, not in terms of E(A). Their explanation of what went wrong is seriously incomplete.

However there is another way to interpret the intention of the writer of steps 6 and 7, and it is also very common in the literature.

Since the answers are expressed in terms of the amount in envelope A, it also seems reasonable to suppose that the writer intended to compute E(B|A). This conditional expectation can be computed just as the ordinary expectation, by averaging over two situations. The mathematical rule which is being used is then
 * E(B|A)=P(A=X|A)E(B|A=X,A)+P(B=X|A)E(B|B=X,A).

In step 7 the writer correctly substitutes E(B|A=X,A)=E(2X|A=X,A)=E(2A|A=X,A)=2A and similarly E(B|B=X,A)=A/2. But he also takes P(A=X|A)=1/2 and P(B=X|A)=1/2; that is to say, the writer assumes that the probability that the first envelope is the smaller or the larger doesn't depend on how much is in it. But it obviously could! For instance, if the amount of money is bounded then sometimes one can tell for sure whether A contains the larger or smaller amount from knowing how much is in it.
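The dependence of these conditional probabilities on the observed amount is easy to exhibit. In the toy sketch below (X uniform on {1, 2}, my own choice), E(B | A=a) = 5a/4 holds only for the middle value of a; at the boundary values envelope B is surely larger, respectively surely smaller:

```python
from fractions import Fraction

quarter = Fraction(1, 4)
# X uniform on {1, 2}; each side of the pair equally likely to be A.
joint = {}  # (a, b) -> probability
for x in (1, 2):
    for a, b in ((x, 2 * x), (2 * x, x)):
        joint[(a, b)] = joint.get((a, b), 0) + quarter

def e_b_given_a(a):
    # E(B | A = a), computed exactly from the joint distribution
    p_a = sum(p for (av, _), p in joint.items() if av == a)
    return sum(p * bv for (av, bv), p in joint.items() if av == a) / p_a

print(e_b_given_a(1))  # 2   : B is surely the larger amount
print(e_b_given_a(2))  # 5/2 : here E(B | A=a) = 5a/4 does hold
print(e_b_given_a(4))  # 2   : B is surely the smaller amount
```

So P(A=X | A=a) is 1, 1/2 and 0 for a = 1, 2, 4: it plainly depends on a when the amounts are bounded.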

Note that some of the literature focusses on the first explanation (the expected value of the amount in the other envelope is different when it is the larger, than when it is the smaller) while some of the literature focusses on the second explanation (the chance that the amount in the first envelope is the larger cannot be assumed to be independent of the amount in that envelope). The original writer was simply mixed up between ordinary expectations and conditional expectations. He is computing an expectation by taking the weighted average of the expectations in two different situations. Either he gets the expectations right but the weights wrong, or the weights right but the expectations wrong. Who is to say? Only one thing is clear: the writer is mixed up and he is using the same symbol to denote different things! Depending on what we imagine he is really trying to do, we can give different analyses of what are the things he is giving the same name, which are actually different. Is he confusing random variables and possible values they can take? Or conditional expectations and unconditional expectations? Who's to say?

Anyway, this suggests to me that original TEP does not deserve to be called a paradox (and certainly not an unresolved paradox): it is merely an example of a screwed-up calculation where the writer is not even clear what he is trying to calculate, hence there are different ways to correct his derivation, depending on what you think he is trying to do. The mathematics being used appears to be elementary probability theory, but whatever the writer is intending to do, he is breaking the standard, elementary rules. Whether we call "explaining" the "paradox" an exercise in logic or in mathematics is unimportant. Whether it is performed by philosophers, mathematicians or logicians is irrelevant too. It seems to me that original TEP belongs to elementary probability theory.

The idea that there should be a unique solution to this paradox is a wrong idea. The paradox is not a paradox, it's a mess. There are a lot of ways to clean up a mess. Richard Gill (talk) 16:51, 18 July 2011 (UTC)


 * The job is not to clean up the mess, that is easy (and can be accomplished much faster than what you have done above). Instead, the job is to pinpoint the erroneous step in the presented reasoning, and to be able to say exactly why that step is not correct and under what conditions it's not correct. And of course when it's correct. We want to be absolutely sure we won't make this mistake again in a more complicated situation where the fact that it's wrong isn't this obvious. In other words, a correct solution should present explicit guidelines on how to avoid this trap once and for all, in any imaginable situation. This is the true two envelope problem, and it is still unsolved. S&D have at least understood this--what TEP is all about--while you together with some of the authors of TEP papers haven't. iNic (talk) 01:26, 19 July 2011 (UTC)


 * The job is to say which if any steps are wrong. There are several wrong steps. They have all been identified in the past. S&D claim no novelty in their results, only novelty in their focus. No mistake could have been made if the author had distinguished random variables from their outcomes, probabilities from conditional probabilities, expectations from conditional expectations. And if he'd known the standard rules for computing expectations. So there is no danger of making the same mistake in more complex situations. Philosophers do not do actual calculations for hard real problems. Those are left to the professionals, who do their calculations professionally. Philosophy has a big problem as long as it ignores the fact that standard probability calculus and standard notation and standard concepts were introduced precisely in order to avoid such mistakes (Kolmogorov, 1933, solving Hilbert's so-manyth problem, to provide an axiomatic mathematical foundation for probability theory). TEP is not a paradox, it's a problem for the student. A finger-exercise for the beginner in probability theory! Anyway, I'm having a nice correspondence with S&D at the moment as well as with two other philosophers / logicians and with several mathematicians. I think some nice new results are coming out but obviously they are not for Wikipedia for the time being. Richard Gill (talk) 01:45, 19 July 2011 (UTC)


 * OK but I don't get it. If "TEP is not a paradox" and only "a problem for the student" and a "finger-exercise for the beginner in probability theory," why on earth are you, four philosophers and several mathematicians writing on some new results about TEP? You are bringing even more contradictions to the table. iNic (talk) 02:07, 19 July 2011 (UTC)


 * Because it's what I'm paid for, and because it's fun. You should think of TEP, or rather TEP-1, as a kind of joke. After all jokes are built on getting a surprise. Then new people who like that joke and are creative, create new jokes in similar spirit. The whole TEP franchise should be seen as a running gag. Richard Gill (talk) 14:49, 19 July 2011 (UTC)


 * I have never thought of TEP as a joke, but it never surprised me either. What has surprised me, however, are all the crazy and contradictory ideas people have come up with to try to explain what is surprising to them. They are often quite amusing. I look forward to reading your forthcoming paper! iNic (talk) 23:59, 19 July 2011 (UTC)


 * And I look forward to hearing your comments and criticisms! I'll be posting something on my home page soon, and I'll put a mention of it on the TEP talk page. Richard Gill (talk) 01:50, 20 July 2011 (UTC)

TEP-2
Just like a great movie, the success of TEP led to several sequels and to a prequel, so nowadays when we talk about TEP we have to make clear whether we mean the original movie TEP-I or the whole franchise.

The first sequel was based on the discovery that there exist distributions of X such that E(B|A)>A. Necessarily, these distributions have to have E(X) infinite. The resolution of TEP-2 is that you will always be disappointed when you get X if you were expecting E(X). In practical terms, you don't expect the expected value at all.
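A concrete example of such a distribution, standard in the TEP literature (often attributed to Broome), gives the envelope pair (2^n, 2^(n+1)) probability 2^n / 3^(n+1) for n = 0, 1, 2, .... The sketch below computes E(B | A = a) exactly and shows it exceeds a for every possible a, even though E(X) is infinite:

```python
from fractions import Fraction

def pair_prob(n):
    # Broome-style prior: pair (2^n, 2^(n+1)) with probability 2^n / 3^(n+1)
    return Fraction(2 ** n, 3 ** (n + 1))

def e_b_given_a(k):
    # Condition on A = 2^k: A is either the larger member of pair k-1
    # or the smaller member of pair k, each side chosen with probability 1/2.
    a = 2 ** k
    half = Fraction(1, 2)
    p_larger = half * pair_prob(k - 1) if k >= 1 else Fraction(0)
    p_smaller = half * pair_prob(k)
    num = p_smaller * (2 * a) + p_larger * Fraction(a, 2)
    return num / (p_smaller + p_larger)

print(e_b_given_a(0))                                  # 2: surely gain
print([e_b_given_a(k) / 2 ** k for k in range(1, 6)])  # all 11/10 > 1
```

So the conditional expectation says "switch" whatever you would see, yet by symmetry switching cannot help; the catch is exactly that E(X) = sum of 2^n · 2^n/3^(n+1) diverges.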

Going back to TEP-I, obviously the expected value of what is in the second envelope is bigger when it's the larger of the two than when it's the smaller of the two. But at the same time it's the same if all these expected values are infinity. One can call this decision theory if one likes, but to me this is just more elementary probability.

Actually there is a kind of "return of TEP-1" side to this part of the story. Maybe the original writer of TEP was a Bayesian and was representing his ignorance about X using the improper prior distribution with density (proportional to) 1/x. Argument behind this: we know nothing at all about the smaller amount x except x>0. Therefore we also know nothing about 1 billion times x, or x divided by pi ... in general, we know nothing about y=cx, for any positive c. So the distribution of X which represents our uncertainty must be unaltered when we transform it to the distribution of cX. There is only one such distribution. Anyway, the remarkable feature of this distribution is that it is now the case that given A=a, B is equally likely to equal a/2 or 2a, so it is indeed the case that E(B|A=a)=5a/4, or E(B|A)=5A/4>A. The expectation value does tell us to exchange envelopes. But this is not a paradox: what is in the other envelope has expectation value infinity. Whether we finally get X or 2X, we'll be disappointed when we compare it to its expectation value.

Improper distributions are a bit tricky, but all these things are approximately true if we restrict the amount x to any range, say from some tiny epsilon (positive) to some enormous M, and take the density proportional to 1/x on this range. Our knowledge about x is roughly the same as that about cx provided c is not extremely large or small. The expectation value of X is then huge (within a logarithmic factor of M), but X itself is almost never anywhere near as large as its expectation. For most a, given A=a, B is equally likely a/2 or 2a.
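This truncated 1/x prior is just the log-uniform distribution on [epsilon, M], which is easy to sample: X = epsilon (M/epsilon)^U with U uniform on (0,1). A simulation sketch (the endpoints and the window are my own arbitrary choices) confirms that for a well inside the range, B is equally likely 2a or a/2 given A = a:

```python
import random

random.seed(2)

eps, M = 1e-6, 1e6
n = 400_000
mid = []  # for A in a mid-range window: did B = 2A?
for _ in range(n):
    u = random.random()
    x = eps * (M / eps) ** u          # density proportional to 1/x on [eps, M]
    if random.random() < 0.5:         # envelope A equally likely x or 2x
        a, b = x, 2 * x
    else:
        a, b = 2 * x, x
    if 1.0 <= a <= 100.0:             # well away from both endpoints
        mid.append(b == 2 * a)

frac = sum(mid) / len(mid)
print(round(frac, 3))  # close to 1/2
```

Near the endpoints epsilon and M the conditional probability tips over to 1 or 0, which is exactly why the "paradoxical" 5a/4 only holds in the middle of the range.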

Seen this way the paradox is not really a paradox about infinity, since one can see the paradoxical things happening without actually going all the way to infinity.

Also this story shows that the writer of TEP-1 need not actually have been making a mistake in steps 6 and 7 - he was a subjectivist using the completely logical and conventional but improper prior 1/x to represent *complete* ignorance about x>0. The "wrong" step was step 8 - he jumped to the conclusion that whether or not you should exchange can be decided by looking at expectation values. But if your ignorance about x is complete, its expected value according to your beliefs is infinite. And in that case you don't expect the expectation value. Richard Gill (talk) 15:04, 19 July 2011 (UTC)

TEP-3
Next we start analysing the situation when we do look in envelope A before deciding whether to switch or stay. If there is a given probability distribution of X this just becomes an exercise in Bayesian probability calculations. Typically there is a threshold value above which we do not switch. But all kinds of strange things can happen. If a probability distribution of X is not given, we come to the randomized solution of Cover, where we compare A to a random "probe" of our own choosing. More probability. Richard Gill (talk) 17:12, 18 July 2011 (UTC)
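Cover's randomized strategy is straightforward to simulate: open envelope A, draw a random threshold T from any distribution with support on the whole positive axis, and switch exactly when A falls below T. The sketch below (exponential probe and fixed amounts are my own illustrative choices) shows the probability of ending with the larger envelope is strictly above 1/2:

```python
import random

random.seed(3)

def play(x):
    # Envelopes contain x and 2x; we open one at random,
    # draw a random threshold T, and switch iff the observed amount <= T.
    a, b = (x, 2 * x) if random.random() < 0.5 else (2 * x, x)
    t = random.expovariate(0.1)   # any full-support probe distribution works
    final = b if a <= t else a
    return final == 2 * x         # did we end with the larger amount?

n = 200_000
wins = sum(play(x=10) for _ in range(n)) / n
print(round(wins, 3))  # strictly above 1/2
```

The trick is that with positive probability T lands between x and 2x, in which case the rule switches exactly when it should; otherwise it does no worse than a coin flip.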

TEP-Prequel
This is of course Smullyan's "TEP without probability". The short resolution is simply: the problem is using the same words to describe different things. But different resolutions are possible depending on what one thinks was the intention of the writer. One can try to embed the argument(s) into counterfactual reasoning. Or one also can point out that the key information that envelope A is chosen at random is not being used in Smullyan's arguments. So this is a problem in logic and this time an example of screwed up logic. There are lots of ways to clean up this particular mess. Richard Gill (talk) 17:12, 18 July 2011 (UTC)

Conclusions
The idea that TEP is not solved, that the "paradox" is still not resolved, is wrong. There are a number of variant problems and each problem has a number of solutions. The different solutions to the same problem do not contradict one another. Original TEP is an example of a chain of reasoning which leads to an obviously false conclusion, hence there must be something wrong with the reasoning. The problem is to identify what is wrong. But there are several things wrong, and there are even several levels at which mistakes are being made. As soon as you have identified any of these mistakes you have given "a" solution. Original TEP was solved long ago, and then on the ruins new paradoxes were invented. There is no reason why that process should ever come to an end, either. It does not mean that TEP is *controversial*. It means that it is fruitful, stimulating. Richard Gill (talk) 01:29, 19 July 2011 (UTC)


 * It would be very interesting if you could show (in a published paper) how all the different ideas presented so far are logically equivalent. But until we have such a result we must confess that the proposed solutions are very different in character indeed. iNic (talk) 11:24, 20 July 2011 (UTC)


 * The different ideas here are *not* logically equivalent. That is the whole point. We are given what appears to be a sequence of logical steps. But if it is logic (or mathematics) it is informal logic (or informal mathematics). Definitions are not given. Hence assumptions are only implicit. The writer does not say at each step which theorem of probability theory, or rule of logic, is being used. The context is missing. Hence one can give a different diagnosis of "what went wrong" by construing different intentions (background assumptions...) on the part of the writer. Different contexts. So indeed: the proposed solutions are different in character, and they correspond to philosophers, practical statisticians, Bayesian fundamentalists or whoever (educationalists, cognitive scientists, economists, ...), looking for a context for the argument, each supposing it to be the context most familiar to themselves and their intended readers, and showing "what goes wrong" if one takes TEP as an argument intended to take place in "their" context.

One should compare TEP to the Aliens movie franchise, where each successive movie had a different director who each brought a very personal touch to their take on the basic story line. Richard Gill (talk) 13:56, 20 July 2011 (UTC)


 * iNic, are you saying that you believe that TEP is an open problem, that is to say, that the paradox has not been fully resolved? If you want to support this assertion, you must come up with a precise definition of the problem that results in a genuinely paradoxical situation (this comment apparently placed by Martin Hogbin).

Richard - you mentioned we are welcome to join your dropbox - can you let me know how to do this? Dilaudid (talk) 07:35, 27 July 2011 (UTC)


 * Send me an email so I have your email address, and I will "invite" you to share the folder. I have a gmail account, name gill1109. Via my university home page you can find some alternative addresses. Richard Gill (talk) 15:28, 27 July 2011 (UTC)


 * Richard, I have no clue, and didn't watch ... Just one simple question: seeing that $$B = {5 \over 4}A$$, which likewise means $$A = {5 \over 4}B$$, I can see that both of these cannot be correct.
 * I have no idea where the fault could be located, but obviously something (what?) is simply wrong. Question: does it make sense to just avoid that dilemma by saying: there are two envelopes, X and 2X. Not knowing which one I got, I should assume that I am likely to hold the average of $${3 \over 2}X$$.


 * And in the case that I should change, I will get either X or 2X: $${1 \over 2} X + {1 \over 2} 2X = {3 \over 2}X$$, exactly the same amount that I am already supposed to hold. So no incentive to change the envelope. But I am sure that this is no news at all, and I know this never is helpful to explain the "fault", so I wish you good luck in locating the (obvious) error.


 * But I really fear indeed that the theorem $${1 \over 2} 2A + {1 \over 2} {A \over 2} = {5 \over 4}A$$ is sloppily formulated. It lacks the condition under which each event is given: either I "have the small amount", or I "have it twice".
 * 2A is only correct if I possess the envelope with the small amount, while $${A \over 2}$$ is only correct if I am in possession of the envelope with "twice the small amount". Can I "say" that within the theorem? Please excuse my throwing in my peanut. Kind regards, Gerhardvalentin (talk) 12:24, 29 July 2011 (UTC)
 * A=5B/4 and B=5A/4 has three solutions: A=B=0, A=B=+infty and A=B=-infty. In this case, since we're told the amounts are positive: +infty. You are of course talking about expectation values, not actual values. The writer is so sloppy that there are at least three ways to try to make sense of his argument, none of them resulting in "switch". Richard Gill (talk) 12:42, 29 July 2011 (UTC)

Seriously, the favoured interpretation of most probabilists is the following.

Since the answers are expressed in terms of the amount in envelope A, it also seems reasonable to suppose that the writer intended to compute E(B|A). This conditional expectation can be computed just as an ordinary expectation, by averaging over two complementary situations. The mathematical rule which is being used is then
 * E(B|A)=P(A=X|A)E(B|A=X,A)+P(B=X|A)E(B|B=X,A).

Note the probability weights are both conditional probabilities. We condition throughout on A=a. The two conditional probabilities and two conditional expectations on the right hand side all, in general, depend on a. The one on the left, too. The notation E(B|A) is shorthand for: compute E(B|A=a), think of it as a function of a, and then take this same function of the random variable A. In words: the expected value of B given that A takes on the value it actually does have.

In step 7 the writer correctly substitutes E(B|A=X,A)=E(2X|A=X,A)=E(2A|A=X,A)=2A and similarly E(B|B=X,A)=A/2. But he also takes P(A=X|A)=1/2 and P(B=X|A)=1/2, that is to say, the writer assumes that the probability that the first envelope is the smaller or the larger doesn't depend on how much is in it. But it obviously could do! For instance, if the amount of money is bounded then sometimes one can tell for sure whether A contains the larger or smaller amount from knowing how much is in it.

Just like MHP, it helps to have followed a first course in probability theory. This provides the concepts (the distinctions) needed to understand the problem, as well as a common language in which to discuss, and work on, the problem. A language invented precisely with the purpose of clearing up this kind of logical mess, so as to be able to move on to serious problems. Which we probabilists did, leaving the philosophers behind in the 19th century. Richard Gill (talk) 13:47, 29 July 2011 (UTC)
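The bounded-amounts point can be checked numerically. Below is a minimal Monte Carlo sketch (an editor's illustration, not part of the original discussion; the four-point prior on X is purely hypothetical): with a bounded prior, P(A=X|A=a) is forced away from 1/2 at the extreme values of a.

```python
import random

# Hypothetical bounded prior: X is 1, 2, 4 or 8, each with probability 1/4.
# The envelopes hold (X, 2X); A is the one picked at random.
random.seed(0)
counts = {}  # a -> (times A was the smaller amount, times A = a)
for _ in range(200_000):
    x = random.choice([1, 2, 4, 8])
    a, b = random.choice([(x, 2 * x), (2 * x, x)])
    smaller, total = counts.get(a, (0, 0))
    counts[a] = (smaller + (a < b), total + 1)

for a in sorted(counts):
    smaller, total = counts[a]
    print(a, round(smaller / total, 3))
# a = 1 is certainly the smaller amount, a = 16 certainly the larger;
# only for the interior values is P(A = X | A = a) close to 1/2.
```

So knowing a can indeed tell you for sure whether A is the smaller or larger amount, exactly as claimed above.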


 * Thank you, and please don't reply to this:
 * One envelope contains 1, the other one 2. Both are equally likely to contain the average of 1.5, say 0.5 difference to the smaller amount resp. same 0.5 absolute difference to the bigger amount.
 * "To doubling or to halving your amount" pays no regard to this view. Your envelope is likely to hold 1.5 – so no doubling of 1.5 nor halving of 1.5 can be expected.
 * By changing the envelope you will gain 0.5 or you will loose the same absolute amount of 0.5 and are expected to still have 1.5 then, the same as you had before. I would like the theorem to express this fact if possible, and to show this result. Please don't reply, you see more than me. Regards,  Gerhardvalentin (talk) 19:33, 30 July 2011 (UTC)

Of course. Let's calculate the expected amount in the second envelope. We don't condition on A. (There would only be a point in conditioning on A if we looked in the envelope and could use the observed value A=a to inform our decision). Now we use E(B)=P(A=X)E(B|A=X)+P(B=X)E(B|B=X). Next we make the substitutions P(A=X)=P(B=X)=0.5, and B=2X when A=X, while B=X when B=X, getting: E(B)=0.5E(2X|A=X)+0.5E(X|B=X). Next we observe that X (the smaller of the two amounts) is independent of which envelope gets the smaller amount. This gives us: E(B)=0.5E(2X)+0.5E(X) and hence E(B)=1.5E(X). No, it is not difficult. It is called a paradox by people who do not know probability calculus. Which was invented in order to avoid this kind of mess in the future. Richard Gill (talk) 06:55, 31 July 2011 (UTC)
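The unconditional calculation above is easy to verify by simulation (again an illustrative sketch; the uniform prior on X is an arbitrary choice, not from the discussion):

```python
import random

# Pick X from a hypothetical prior (uniform on 1..100), assign the envelopes
# at random, and compare E(A), E(B) and 1.5 * E(X) empirically.
random.seed(1)
n = 200_000
sum_a = sum_b = sum_x = 0.0
for _ in range(n):
    x = random.randint(1, 100)
    a, b = random.choice([(x, 2 * x), (2 * x, x)])
    sum_a += a
    sum_b += b
    sum_x += x

print(round(sum_a / n, 2), round(sum_b / n, 2), round(1.5 * sum_x / n, 2))
# All three agree up to simulation noise: E(B) = E(A) = 1.5 E(X),
# so there is no incentive to switch.
```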

Theorem: (1/2)2A + (1/2)(A/2) = (5/4)A to me looks like a sloppy formulation. Question: "What exactly is A?" – And I said in contrast: Theorem: (1/2)2X + (1/2)X = (3/2)X. Either I hold the SMALL amount X or I hold the BIG amount of 2X. Having only X, I will double it to 2X, and if I already hold 2X then I will halve it to X. You can suppose to already hold this average of (3/2)X, and by changing you will get the average of (3/2)X again. For me, this makes sense.
 * please pardon, Richard, once again: Above I said that the

and I read "5/4" of some (just for me) obviously undefined "A" (what is "A"?). I miss the said restraints. But maybe, they're not needed anyway, I have no clue. Kind regards, Gerhardvalentin (talk) 14:14, 3 August 2011 (UTC)
 * But for me, it makes no sense to say "A is the small amount" that could be doubled, "but at the same time A is the BIG amount" that could be halved. I do not understand that "A" can represent the small amount and likewise the opposite, the BIG amount, at the same time.
 * Above I said:  2A is only correct if I possess the envelope with the small amount,  while (A/2) is only correct if I am in possession of the envelope with "twice the small amount". –  And I asked: Can I "say" that within the theorem?"  –  I meant:
 * Theorem:  (1/2)2A [restraint: in case only that "A" represents the SMALL amount, but never "twice the small amount"] + (1/2)(A/2) [restraint: just only in case that "A" represents the BIG amount but never the small amount] = 5/4A


 * I don't understand you. A is the amount in the first envelope. It is random, because it is created as follows: a positive amount X is drawn from some probability distribution (known or unknown, degenerate or not - I don't care; and I don't care whether the distribution of X is supposed to be interpreted as subjectivist personal belief or a frequentist physical model). Independently of this, we toss a fair coin. Heads: A,B = X,2X. Tails: A,B = 2X,X. Now it seems that at Step 6 the writer is supposing that, given A=a, whatever a may be, A=X or A=2X each with (conditional) probability 1/2. That is to say: not only is the coin toss independent of X, but also A is independent of the coin toss. If so, then the conclusion E(B|A=a)=5a/4, whatever a might be, is justified. Averaging over a we get EB=5EA/4. This appears to contradict the symmetry of the problem, but it doesn't, because the *extra* assumption that A is independent of the coin toss implies that the integer part of log_2 X is uniform on the set of all integers. And that implies EX=infinity. So it all ties together perfectly. Either the writer is confusing a conditional and an unconditional probability at Step 6, or the writer is assuming total ignorance of X>0, expressed probabilistically as: log X is uniformly distributed over the whole real line. In the first case, his conclusion E(B|A=a)=5a/4 cannot be true for all a, so there is no argument for switching without looking at the value of A; in the second case, his conclusion is correct but then EA=EB=infty and expectation values are no guide to decision making - you'll always be disappointed. Richard Gill (talk) 15:43, 3 August 2011 (UTC)
 * Thank you, and you are right. Item 6 says
 * Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2
 * and – for my perceptivity – it does so by ignoring that the one is solely possible if A is the smaller amount, whereas the other can only be true if A is the greater amount – accepting as a consequence that the given size ratio is no longer 1:2 resp. 2:1, but is, without foundation, pretended to be 1:4 resp. 4:1. Gerhardvalentin (talk) 19:40, 3 August 2011 (UTC)
 * 0.5 x 2a + 0.5 x a/2 = 5a/4. This is the expected amount in Envelope B, if Envelope A contains a, and B is equally likely twice or half a, given that Envelope A contains a. For instance: let the smaller amount of money X be uniformly distributed on all integer powers of 2 with exponents from -M to N, where M and N are very large numbers. Next, let A = X or 2X with equal probabilities, independently of X; simultaneously B = 2X or X. Then given that A=a=2^k, B=2a or a/2 with equal probability, except for the extreme cases k=-M and k=N+1. Richard Gill (talk) 18:09, 5 August 2011 (UTC)
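A small simulation of this powers-of-2 example (an editor's sketch, illustrative only; M and N are kept tiny rather than "very large" so that every value of a is visited often):

```python
import random

# X uniform on 2^k, k = -M..N; A = X or 2X by a fair coin, B the other amount.
# M and N are tiny here just to keep the simulation cheap.
random.seed(2)
M, N = 3, 3
stats = {}  # a -> (times B == 2a, total, running sum of B)
for _ in range(400_000):
    x = 2.0 ** random.randint(-M, N)
    a, b = random.choice([(x, 2 * x), (2 * x, x)])
    doubled, total, sum_b = stats.get(a, (0, 0, 0.0))
    stats[a] = (doubled + (b == 2 * a), total + 1, sum_b + b)

for a in sorted(stats):
    doubled, total, sum_b = stats[a]
    print(a, round(doubled / total, 2), round((sum_b / total) / a, 2))
# Interior a: P(B = 2a | A = a) comes out near 0.5 and E(B | A = a) near 1.25 a.
# At a = 2^-M, B = 2a for sure; at a = 2^(N+1), B = a/2 for sure.
```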

TEP Synthesis
The following discussion creates a synthesis between the solution of Schwitzgebel and Dever, representing the philosophers, and the solution of myself, representing the probabilists. We need to be aware of the symmetry of (statistical) dependence and independence. In particular, random variable A is independent of event {A<B} if and only if event {A<B} is independent of random variable A.

S&D's diagnosis of what goes wrong was that the amount A in the first envelope is different, namely smaller, when it is the smaller of the two ("equivocation"). I strengthened this to the observation that on average, it's twice as small as what it is, on average, when it is the larger.

This observation proves that the amount A is statistically dependent on the event {A<B}.

My diagnosis was that step 6 assumes that the fact whether or not A is smaller is independent of the actual amount A. In other words: step 6 assumes that the event {A<B} is independent of A.

By symmetry of statistical independence, this is the same thing as independence of A from {A<B}.

There's one tiny speck of dirt in these arguments. If E(X) is infinite then twice infinity is infinity, we can't necessarily conclude that what's in the first envelope is smaller, on average, when it's the smaller of the two. And there's an extreme case where our uncertainty about X is so large that whether or not it's twice as large makes no difference. Knowing we have the larger of the two amounts doesn't change our information about it, at all. Consider the uniform distribution on all integer powers of 2: ..., 1/8, 1/4, 1/2, 1, 2, 4, 8, ... Yes I know this probability distribution doesn't exist in conventional probability theory, but arbitrarily good approximations to it do exist: you might like to think of it as a "real number" addition to the set of rational numbers.

Working with this distribution (or with close approximations, and then going to the limit), it *is* the case that the probability distribution of 2X is the same as that of X, both have infinite expectation values, E(B|A=a)=5a/4 for all a (integer powers of 2), and P(B=2a|A=a)=0.5=P(B=a/2|A=a) for all a. So at the boundary of reality, steps 6 and 7 of the TEP argument do go through. The reason now why switching only gives an illusory gain is that when expectation values are infinite, you will always be disappointed by what you get. They are no guide to decision making.

Richard Gill (talk) 06:44, 1 August 2011 (UTC)

I have written a paper entitled Anna Karenina and The Two Envelopes Problem. It expounds my explanation (the Anna Karenina principle) of why there is so much argument about TEP and contains some modest new results. Comments are welcome. It is not finished yet; when it is I'll submit it to a suitable journal. Richard Gill (talk) 15:10, 9 August 2011 (UTC)


 * Hi Richard, I have read your draft now. Here are 3 quick comments in case they can be of any value to you:


 * 1) Steps 6 and 7 are not inconsistent taken together, as you state on page 4. It is easy to find scenarios where they are both trivially true, without having to invoke Bayesian models of ignorance and improper priors as you do on page 5.
 * 2) I totally agree with you that it's a cheap way out (and hence incorrect) simply to point to the improper nature of the prior as the cause of the problem (p 5). However, your own explanation of TEP-2 is an extension of the argument you criticize! You say that the cheap reasoning is valid even in some bounded cases as well. "Expectation values are no guide to action" (p 9) in some cases, both unbounded and bounded. But then, when can we use E(X) as a guide? Sometimes or never? If we never can trust E(X), what should be used instead? If we can trust E(X) sometimes, how do we know when we are in a "trusted" scenario? And when we are in an untrusted scenario, what measure should we use instead of E(X)? You neither state nor answer any of these obvious questions. Your only candidate for a general guideline is "one should be wary of infinities" (!) (p 10). Well, this is an extremely vague guideline, and exactly the same guideline the authors you criticize have. And why only watch out for infinities? Your point (or rather Syverson's) was that infinities aren't always needed to get into trouble, right? So I guess you should have stated that one should be wary of "big numbers" as well as infinities, right? But then, how big is a big number? And besides, remember that it's very hard to see that the original TEP scenario has anything to do with infinities or very big numbers at all. So given an arbitrary probabilistic reasoning, how can we know in advance if we are allowed to compute and trust expectations in that case? Without a clear answer to this and the other questions above, your solution to TEP-2 doesn't deserve to be taken seriously.
 * 3) I can't see that you have a single plausible candidate for a solution to any of the variants of TEP. The irony is that you set out to say that this problem was very, very easy and that it's bizarre that so many articles have been written about it. The only explanation for this would be that philosophers, according to you, are trained to see problems everywhere, even where there are none, and that they get paid writing stupid articles about non-problems. Now you are about to add a new "stupid" article to the list of written articles about TEP. And yet another article that doesn't provide even a hint of a decent solution to the problem(s). iNic (talk) 17:13, 14 August 2011 (UTC)


 * Thanks! I have responses to all your criticisms. Will write them later. And will improve paper. Richard Gill (talk) 08:04, 16 August 2011 (UTC)

Short reactions to iNic: Looking forward to your counter arguments. Richard Gill (talk) 08:19, 16 August 2011 (UTC)
 * 1) I believe that the only way to keep steps 6 and 7 together is to assume the (improper) prior on X: fractional part and integer part of the base 2 logarithm independent, the integer part being uniform on the integers. And if you'd like steps 6 and 7 correct (subject to obvious arithmetic corrections) simultaneously with "2" replaced by any other number, you are forced to assume the prior: logarithm of X uniform on the whole line. If you disagree, please give me a counterexample or a reference. Of course, a theorem has assumptions. I am assuming some minimal background probability interpretation of the intention of the TEP author.
 * 2) Yes, we must be wary of infinities. And as I show, we must be wary of heavy tailed, skew distributions. For such distributions the expectation value is no guide to behaviour. You have to look at the whole distribution, or perhaps at a number of "interesting" quantiles. Therefore there is a solution to the problem, and statisticians and applied probabilists know about it. It's a situation where life is too complicated for a single number, or just a couple of numbers, to be a universal summary of the distribution (by universal, I mean a useful summary for a wide range of questions).
 * 3) As far as I can see, I have surveyed known understanding and added some small extras to all the known variants. But I agree, I am not finished yet. I have some new insights on the 2Neckties which I am going to add in a new section, which will tie together the philosophers' solution "sin of equivocation" and the mathematicians' solution "just do the probability theory carefully". You can get a preview in the newest section of the Arguments page of 2Neckties. Comments welcome.


 * 1. Step number 6 only states that an envelope contains 2A with probability 1/2 and A/2 with probability 1/2. This can be realized in a number of simple ways. And if statement 6 is made true it is of course also true that the expected value of money in that envelope is as described in step number 7. These statements (6&7) are for sure not incompatible with each other. On the contrary, one even follows logically from the other.
 * 2. Well, I assume you mean that life is too complicated for you to answer my simple questions. Your new guideline for life, "we must be wary of heavy tailed, skew distributions", is of no help unless you also give us a definition of exactly how "heavy tailed" and "skew" a distribution has to be to have us running away from it, screaming in anguish and fear. Or what else should we do? You are clueless. You don't have any "guide to behavior" when facing such monsters. What is perhaps the most frightening thing is that when reading the TEP story I can't see any heavy tailed, twisted monsters at all. And yet they must be there, lurking between the lines, all the time. Scary! Your "solution" is so extremely vague and incomplete that it doesn't deserve to be called a solution at all.
 * 3. You ought at least admit now that your initial view of this problem as something extremely simple was utterly wrong.
 * iNic (talk) 23:49, 16 August 2011 (UTC)


 * 1. Step 6 perhaps was intended to say that conditional on envelope A containing a, envelope B contains 2a or a/2 with equal probability. There is only one way to make this true whatever a might be: entier(log_2(A)) uniform and independent of frac(log_2(A)). In this case, step 7 is correct but step 8 is wrong, since in this case expectation values have no relevance to decisions. Or, step 6 was perhaps intended to say that the contents of the second envelope is equally likely twice or half that of the first. Well, we were told that in advance! Then step 7 is wrong. The unconditional expectation value of what is in envelope B is a number, not a random variable. Sorry this is not more simple. That's the point of Anna Karenina.


 * 2. I am not clueless. In such situations the amateur should consult a professional. I do not run away screaming. The whole point is that in such situations care is needed and the right questions have to be asked and further data will be needed.


 * 3. Sure, the problem turns out to have (for me) unsuspected richness. But at the end of the day it remains extremely simple (for me, at least).


 * How do you like my unified analysis of Kraitchik's 2Neckties problem, Schrödinger's 2SidedCard problem, Littlewood's 2SidedCard problem, and Gardner/Nalebuff's 2Envelope problem? And how do you like the history of the problems? They are carefully crafted jokes by professional mathematicians, designed to tease laypersons. Seems like they did a good job! Richard Gill (talk) 08:39, 17 August 2011 (UTC)


 * BTW, you are partly right in your characterization of the problem, iNic. It is indeed a problem in "subjective Bayesian decision theory". But it is a mathematical problem, not a philosophical problem. If you want to solve decision problems by subjective probability, you had better understand the calculus thereof: Kolmogorov Rules! *And* have a rich enough language and conceptual framework in order to make the necessary distinction: that between an unknown thing on the one hand, and your rational subjective beliefs about that thing on the other hand. Between a and A. From this point of view, TEP *is* bloody simple. Richard Gill (talk) 08:59, 17 August 2011 (UTC)


 * 1. I toss a coin and if heads show up I put 2A in an envelope, if tails shows up I put A/2 in the same envelope. This is a simple way to realize what is stated in statement number 6. If number 6 is true it follows deductively, by ordinary logic, that statement number 7 is true as well, as this is just a valid instance of the definition of expected value. In logic notation: "statement 6" -> "statement 7".
 * Remember that we started with someone putting X in one envelope and 2X in another, then you chose one of them at random? I do hope you understand that this is the basic background context for all these discussions. Richard Gill (talk) 12:17, 17 August 2011 (UTC)


 * I just mentioned one possible and very simple way to realize statement 6. There are of course others. The important thing is that however you imagine to realize statement 6 it will always in a valid way imply statement 7. You can't escape this. iNic (talk) 01:32, 18 August 2011 (UTC)


 * So what? We are talking about TEP. There is a well defined context. Did you want me to add "..., in the present context" to every sentence I write? Richard Gill (talk) 03:02, 18 August 2011 (UTC)


 * In mathematics it's crucial to state all assumptions explicitly. If I write, for example, an equation like $$a^2+b^2=c^2$$ then I have to specify explicitly whether I intend only integer values for the parameters a, b and c, or whether I allow the parameters to be any real number. In your draft you have not specified explicitly all the restrictions you have in mind for statement 6. The words "in the present context" added at the end are way too vague. Your draft's claim that steps 6 and 7 are incompatible is, as stated, simply not correct mathematically. Put these two sentences on a piece of paper and send them to any mathematician in the world, and you will get it confirmed that they are not contradictory. You have to be much more precise when you do math than you are here. iNic (talk) 13:00, 18 August 2011 (UTC)


 * 2. OK, please consult a professional then! This is just another non-answer to some of the valid questions posed by TEP.
 * No, *you* should consult a professional when you next meet one of these problems! I'm happy to give you free advice. Richard Gill (talk) 12:17, 17 August 2011 (UTC)


 * OK Richard, this is getting a bit silly now. But don't worry, no one can answer the tricky questions I stated above, so no need to feel embarrassed just because you can't answer them. This is precisely why TEP is still an open problem. If you or someone else could satisfactorily answer these questions this would not be an open problem anymore. iNic (talk) 01:32, 18 August 2011 (UTC)


 * I didn't notice any tricky questions. Certainly, not regarding TEP. Richard Gill (talk) 03:02, 18 August 2011 (UTC)


 * I count to 10 question marks that you haven't answered. And yes, they are all related to your suggested solution to TEP. iNic (talk) 13:00, 18 August 2011 (UTC)


 * 3. I haven't found any unified solution to all these problems by you. Or is your unified solution that they are only jokes and should therefore not be taken seriously? It's interesting to note that you are contradicting yourself. Now you say that TEP is a joke made by professionals to tease lay persons. Previously (a week ago or so) you had the strong conviction that TEP is only a problem interesting for professionals and that it doesn't attract any interest from lay persons. Your support then for this view was that no account of TEP could be found in books for lay persons, only a lot of research articles by professionals. Did you shift your opinion 180 degrees since last week? iNic (talk) 11:51, 17 August 2011 (UTC)


 * The unified solution was either on the Talk/Arguments or the Talk page for TEP. Didn't you notice it?


 * A week ago I read all the original TEP problems, in books written by mathematicians with the deliberate and explicit purpose to tease ordinary people. (Littlewood 2SidedCards, Schrödinger 2SidedCards, Kraitchik 2Neckties, Gardner/Nalebuff 2Envelopes). The problem never became famous and popular among lay people, unlike for instance Monty Hall Problem. It produced a lot of academic interest by philosophers, economists, decision theorists, probabilists and statisticians. Richard Gill (talk) 12:17, 17 August 2011 (UTC)


 * I have read your "unified solution" below over and over now but I don't understand it. Can you please apply this theory to the different problems you claim you can solve using this tool? Can you solve all different variants of TEP with this theory as well? That's impressive in that case! You say that this theory only fails when E(A) = E(B) = infinity. But what happened to our friends, the twisted monster distributions with meaty tails? Are they too complicated (or scary?) for a Bayesian subject to even imagine? iNic (talk) 01:32, 18 August 2011 (UTC)


 * Application to Littlewood's, Schrödinger's, and Kraitchik's problems: see TEP talk and talk/argument pages, bottom. Application to TEP: let a and b be the (unknown) amounts in the two envelopes. Because we are rational, our subjective beliefs, and how our beliefs should be updated on receiving new information, are encapsulated in a (joint) probability distribution (satisfying Kolmogorov rules - by theorems of de Finetti, Savage, and others...). I define A and B to be two Kolmogorov-style random variables having this joint probability distribution. Because we picked an envelope at random and called it Envelope A, our beliefs about a and b are symmetric on exchange of a for b. The theorem now tells me that if the expectation value of A is finite then the amount in Envelope A is statistically dependent on whether it is smaller or larger, and vice versa. Thus if the expectation value is finite, steps 6 and 7 are (in the context of the problem) invalid. If on the other hand the expectation value of A is infinite, expectation values are no guide to behaviour anyway, and it is step 8 which is invalid. What don't you understand? I'm going to add a section to my paper, "Return of the Two Neckties". So it's useful for me to know how/if the idea comes across. As you said, it is a problem in subjective Bayesian probability and decision theory. It's a mathematical problem and I've given you a mathematical answer. At a higher (perhaps philosophical) level, you can say that - with finite expectation values - the problem is not distinguishing between our beliefs about a with or without further information, e.g., the information that it's the smaller or larger of the two. When you add that information you should (to be self-consistent or rational) modify your beliefs. The distinction of the philosophers is taken care of in careful mathematics by using different symbols for random variables and their outcomes.
And keeping track of probability distributions (conditional or unconditional, for instance). I don't see what is not solved in the infinite expectation case. Exchanging the two envelopes (unopened) does not change your beliefs about what is in your envelope, so exchange is useless anyway. We already know that that's the case whether or not the expectation value is infinite. We have completely described what goes wrong in the argument. Moreover the explanation can be copied to all the known variants of the problem (2SidedCards, 2Neckties, ...). It takes care of TEP-1 and TEP-2 (TEP-2 is the family of infinite expectation variants, without improper priors). I'm not scared of fat-tailed distributions. They occur all over the place in the real world: climate change and meteorology, seismology, astrophysics, finance, safety of complex industrial systems, self-organised criticality, phase transitions. Usually paired with long-range time dependence. Wherever you see fractals and self-similarity, and more. So a wise Bayesian probably also uses them for their personal beliefs about such phenomena. Richard Gill (talk) 03:02, 18 August 2011 (UTC)


 * I have to think about this a little bit more. I'm not trained in Bayesian probability so that might be the problem with me. iNic (talk) 13:00, 18 August 2011 (UTC)


 * Take your time. I'm really grateful for your persistent criticism. A problem with Bayesian probability? I'm using the fact that Bayesian probability satisfies the same rules as frequentist probability. So mathematically, we can think of anything unknown as being a random variable. Well: there are maybe some Bayesians who would drop countable additivity, or who would allow improper priors. Schrödinger's 2SidedCards involves a highly improper distribution. Littlewood finds that enough to rule it out. Richard Gill (talk) 13:26, 18 August 2011 (UTC)

Unified Solution
Considering the two scenarios that your envelope contains the smaller or the larger amount does not change what is actually in it; but intuitively, it should change your beliefs about how much is in it. In particular, it should change your *expectation value* of how much is in it.

This can very easily be made completely rigorous. For it is trivially true that E(A-B|A-B > 0) > 0. So it follows (provided the second, "smaller", term is  finite) that  E(A|A > B) > E(B|A > B). By symmetry, E(B|A > B) = E(A|B > A). Which gives me what I want: E(A|A > B) > E(A|B > A).

Because the conditional expectations in the two situations differ from one another, they must differ from the unconditional expectation, and the corresponding conditional probability distributions P(A=a|A > B) etc. must be different from one another and from the unconditional distribution.

Moreover, because of the symmetry of statistical (in)dependence, P(A > B|A=a) can't be independent of a. Knowing A changes your chances that it's the smaller or larger of the two amounts.

Summary:

1. Our beliefs about A are necessarily different when we know if it is larger than B, or smaller, or know nothing.

2. Our beliefs as to whether A is larger or smaller than B must be different when we know what A is than when we don't. More precisely, they will depend on how much it is and not always be the same.

The complementary nature of 1. and 2. is an expression of the symmetry of statistical dependence and independence. One could say, it's just "the other side of the coin"!

The only exception to these rules might occur if E(A) = E(B) = infinity. But then expectation values are no guide to decision anyway.

The central assumption here is that the probability distribution of (A,B) is the same as that of (B,A), and the two amounts are different. And I'm talking about the probability distribution which encapsulates our prior beliefs about the two numbers. So in other words, our prior beliefs about the two unknown amounts are unchanged on exchanging them. This is also the central assumption in 2Neckties, Littlewood2SidedCards, and Schrödinger2SidedCards. It is true for Nalebuff's TEP-1 and it remains true for TEP-2 (that's the one where we show that prior beliefs are possible such that E(B|A) > A). Richard Gill (talk) 12:05, 17 August 2011 (UTC)
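The inequality E(A|A > B) > E(A|B > A) is easy to check numerically for any exchangeable filling of the envelopes. A minimal Monte Carlo sketch; the particular filling (smaller amount Exponential(1), larger amount twice that) is an arbitrary illustrative choice, not anything from the discussion above:

```python
import random

# Monte Carlo check of E(A | A > B) > E(A | B > A) for an exchangeable pair.
# The filling is an arbitrary illustrative choice: smaller amount x is
# Exponential(1), larger amount 2x, then the pair is shuffled into (A, B).
random.seed(0)
big, small = [], []
for _ in range(200_000):
    x = random.expovariate(1.0)
    a, b = random.sample([x, 2 * x], 2)  # random assignment to envelopes
    (big if a > b else small).append(a)

e_big = sum(big) / len(big)        # estimate of E(A | A > B), here about 2
e_small = sum(small) / len(small)  # estimate of E(A | B > A), here about 1
print(e_big, e_small)
```

With this filling the two conditional means separate clearly, as the general argument predicts.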


 * You have said that it's a cheap trick to blame it on infinity, and yet you have that as a special case in your "unified solution." Is it not a cheap trick anymore, or what? Along with Syverson you have earlier said that infinity is not significant at all and that exactly the same problems can occur with distributions having finite expectations. In other words, you don't have to have an infinite expectation to have decision theory break down. Well, I can't see that observation anywhere represented in your unified solution. You have only two cases: either E(X) is finite and then some rules apply, or E(X) is infinite and then some other rules apply. You say that you are not afraid of fat tailed distributions, and that's great. But how about showing that? In this unified solution of yours they are not mentioned with a single word. If you're not afraid of them and you think they are very important both for this problem and otherwise, as you do, you have to mention them. iNic (talk) 09:25, 20 August 2011 (UTC)


 * As I understand the central idea in your "unified solution," it is the observation that if you know which envelope contains the larger amount then your expectation of what to find there is larger than for the other one. Well, eh? If I know which envelope contains the larger amount I would simply take that envelope and skip all stupid calculations. Who needs any calculations at that point? If you want you can stay and try to calculate my "expectations" while I will go and buy something nice for myself using the money in the envelope. But I really hope I have totally misunderstood your central idea! If not I'm afraid I have to confess that I think this is the most stupid proposed solution to date. Please publish it and see what your colleagues will think about it. iNic (talk) 09:25, 20 August 2011 (UTC)


 * Yes, you misunderstand, completely. We don't know a. But we imagine what we would believe about b if, counterfactually, we were informed of the value a. Our initial (joint) beliefs about a and b would have to be updated. According to the subjective Bayesian decision theory of Savage, de Finetti, ..., if we are rational, we must update our beliefs in the Bayesian way. (It's a problem in subjective Bayesian decision theory, right?)


 * As far as I know, only real, actual observations and other actual new information require Bayesians to update their set of beliefs and their corresponding probabilities. Mere assumptions, or theoretical conjectures, don't have the mandate to update any belief system. iNic (talk) 22:00, 20 August 2011 (UTC)


 * Here I think you are wrong. We are deciding whether or not to switch envelopes. The argument in TEP requires us to consider whether or not that would be wise, given the amount in our envelope a. And steps 6 and 7 appear to show that whatever a might be, we have a positive expected net gain on switching. And therefore we need not look in the envelope to inform our decision, but should switch anyway. We don't actually get any new information and we don't actually update our belief system. But we consider what our beliefs would be about the contents of Envelope B,  if we were informed of the amount in our Envelope A. Richard Gill (talk) 08:20, 21 August 2011 (UTC)


 * OK, so if I understand you correctly now you are saying that your calculation is the only true Bayesian calculation that should be applied in this situation, right? But how come all Bayesians before you did the calculation in the wrong way? And how does your new way of updating Bayesian beliefs fit within the broader Bayesian philosophical picture? Does it already maybe fit nicely within some existing Bayesian system? Which one in that case? If this is the only correct way to update Bayesian beliefs, what is wrong with the other Bayesian proposed solutions and adaptations of Bayesian systems to account for TEP? In particular, why are the three suggested adaptations of Bayesian systems proposed by Jeffrey, Cox and Lindley not correct? In what way(s) is your solution superior to their solutions? Can you give some other examples of where your principle should be used, where more usual ways of doing Bayesian belief updating lead to demonstrably wrong or absurd results? In short, to be convincing you need to contrast your new Bayesian theory with the ones already published and list the pros and cons of your theory versus theirs. If you successfully can accomplish this you really have a true final solution of TEP! iNic (talk) 19:14, 22 August 2011 (UTC)


 * Jeffreys, Cox and Lindley not correct? David Cox is not a Bayesian. Jeffreys and Lindley see a role for improper priors within statistical inference, where one is faced with the task of communicating results to users who do not necessarily have your prior. Decision theorists already understand the problems of improper priors. A few do not demand countable additivity (Dubins). Within the intended context, there is no controversy about all this. And I don't see the relevance to TEP. I do not have a new Bayesian theory. I'm not aware of any Bayesian who disagrees with me. Richard Gill (talk) 05:18, 23 August 2011 (UTC)


 * OK, so this means that everyone is happy now? Great! You all solve TEP in different ways but that doesn't mean that anyone is wrong. All solutions are correct at the same time. Amazing. Anna Karenina. I thought your Unified Solution meant "Unified Unique Solution" but if you wanted to claim that you would of course have called it that. My bad, sorry. I don't think I have any more comments. I sincerely wish you Good Luck with your paper! iNic (talk) 09:33, 23 August 2011 (UTC)


 * I've been approached by two journal editors urging me to submit the paper, when it's finished, to their journal. And had positive feedback, along with some useful suggestions for improvements and clarifications, from them and from other colleagues. Including philosophers.


 * Great! I look forward to reading your final paper. iNic (talk) 22:00, 20 August 2011 (UTC)


 * Are you a mathematician? Do you know probability theory? Richard Gill (talk) 10:27, 20 August 2011 (UTC)


 * No and yes. iNic (talk) 22:00, 20 August 2011 (UTC)


 * OK, thank you for confirming that I haven't understood your idea at all. The Unified Solution is probably too hard for me to grasp but let's make a final try. Let's take a concrete example: I'm offered one of two envelopes where the contents have been determined using the Broome distribution, by flipping a coin. You can flip a coin say 30 times a minute and we need to finish our flipping during one day. So this is the theoretical limit on the series of flips. (Of course, our actual flipping ended very soon after we started.) So the Broome distribution will be theoretically truncated and will thus have a finite expectation. I pick one of the two envelopes offered. Now I know, even before I look inside the chosen envelope, that whatever I will see I will want to have the other envelope instead. I look into my envelope and sure enough, I want to have the other one instead. How does your Unified Solution resolve this paradoxical situation? iNic (talk) 22:00, 20 August 2011 (UTC)


 * Very good question! I will think about this. Richard Gill (talk) 08:21, 21 August 2011 (UTC)


 * Thought over. So I am to flip a coin with 2/3, 1/3 chances of tails, heads, till I get the first head; max 30 tosses a minute, max one day long. And then, if I had to toss the coin altogether n times, I'm to put 2 to the power n dollars in one envelope, half that amount in the other. Then you'll pick an envelope at random. I would like to know if you are willing to pay me in advance, to play this game with you. And I would like to know what the sanction will be if I can't or won't pay up. This is the St. Petersburg paradox all over again. Richard Gill (talk) 09:36, 21 August 2011 (UTC)


 * I guess that if you don't look in your envelope, you are neutral about switching or not. Let's consider the game where I stop my coin tossing at one day (24 hours) at the latest, and you do take a look. I could have had to toss the coin 43200 times. The expectation value of the amount of money I should put in the two envelopes is roughly of the order of 10 to the power five thousand four hundred (that's a 1 followed by 5400 zeros) dollars. At the end of the game, one of us gets one third of this, the other gets two thirds. So if you want to play it with me, I'm going to ask you something like that amount up front in advance, in order to play. This might remind you of that story about the guy who asked for one grain of rice on the first square on a chess board, two on the next square, four on the next, and so on (up to 2^64). If we truncate at 43200 biased coin tosses, or equivalently at 10 to the 5400 dollars, then I can tell you that if you are allowed to look in your envelope, and if your envelope contains 2^n dollars, neither the smallest nor the largest that is possible, then the other envelope contains twice this with probability 2/5 and half this with probability 3/5. At the two extremes the other envelope is either certainly larger or certainly smaller. Obviously we can easily play this game with numbers written on slips of paper in the two envelopes. You can easily simulate it on a computer. Go ahead and take a look (you're a C programmer, right!?). But to turn it into a real life gamble, we are going to have to associate those numbers, or getting the larger or the smaller of the two, with financial gains and losses; with wagers and pay-offs. It won't be possible in the real world to do that "linearly". Casinos have finite capital and restrict the maximum size of bets. Note that in the Broome game, if your envelope doesn't contain the smallest possible amount, the other envelope is more likely half than twice yours.
If your envelope contains a rather large number, you should start to worry whether or not I really have the resources to actually pay out twice. And anyway, the amount of money is so large, there's no point in going for twice. All of this says that if we play something like the Broome game in the real world, the expectation values for the game truncated at 24 hours are pretty irrelevant. Extremely strong deviations from linearity will have set in much, much sooner. Richard Gill (talk) 13:50, 21 August 2011 (UTC)
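The 2/5 versus 3/5 figures quoted above can be verified with exact arithmetic. A sketch, assuming the setup just described (P(heads) = 1/3, first head on toss n puts 2^(n-1) and 2^n dollars in the envelopes, envelope A picked at random, and n away from the truncation points):

```python
from fractions import Fraction

# Broome filling: toss a coin with P(heads) = 1/3 until the first head;
# first head on toss n puts 2**(n-1) and 2**n dollars in the envelopes.
def p_tosses(n):
    return Fraction(2, 3) ** (n - 1) * Fraction(1, 3)

def p_other_is_double(n):
    # Envelope A holds 2**n either because N = n (A is the larger amount)
    # or because N = n + 1 (A is the smaller); A was picked at random, so
    # the 1/2 factors cancel in the conditional probability.
    return p_tosses(n + 1) / (p_tosses(n) + p_tosses(n + 1))

print(p_other_is_double(5), 1 - p_other_is_double(5))  # 2/5 and 3/5
```

The ratio P(N = n+1)/P(N = n) = 2/3 is what makes the answer (2/3)/(1 + 2/3) = 2/5 for every interior n.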


 * This is not the St Petersburg paradox, and you should know that. This is TEP-2 using a distribution with a bounded support, making the expectation finite. TEP-2 should according to you be solvable using your Unified Solution strategy. Please show me how that is done! This is what I asked you to show me. So far you haven't given a single hint on how to do that. If you can't solve TEP-2 using your Unified Solution please don't say that it can be solved with your Unified Solution! I'm actually more confused than ever about what your opinions are regarding TEP. Your previous sacred principle, coined the Anna Karenina principle, said that TEP can't have one solution but many different ones. Now you are instead suddenly defending this Unified Solution that purportedly should solve all TEP variants in one blow, as well as some other old puzzles. This is exactly what you said before over and over was totally impossible to even imagine. So what happened to your Anna Karenina principle? You are so full of contradictions that you are becoming a puzzle yourself... iNic (talk) 00:07, 22 August 2011 (UTC)


 * Anna Karenina principle says that there needn't be *one* solution to TEP-1. It depends what the writer had in mind, and we don't know what he or she had in mind. But the Broome example belongs to TEP-2. My little theorem says that in truncated Broome, the expectation value of the gain from switching depends on how much is actually in your envelope. So, if you *really* care about expectation values, you should not switch regardless: you should look in your envelope first. If it contains the largest possible amount, don't switch!!! Otherwise, do. On the other hand in this example the expectation value is so close to infinite, we are very close to the situation with infinite expectation. Expectation values are no guide to behaviour since the long run is never ever going to kick in, even if we repeat the game every day for the rest of the life of the universe. So in actual fact, we shouldn't care about expectation values at all. In fact, from the point of view of trying to model real world economic games, this is a game which no casino will ever be prepared to play with you, at any price. Just as in the Saint Petersburg paradox. Any *realistically* truncated version of Broome (one which a bookmaker or a casino would happily set up for you) will have to be much more severely truncated than at 24 hours. And the more it is truncated, the greater the need to look in your envelope before deciding whether to switch or not. So in my opinion it all fits together nicely. Nice example. Richard Gill (talk) 12:18, 22 August 2011 (UTC)
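The "look first, and don't switch at the top" rule for truncated Broome can be made concrete. A sketch under one plausible reading of the truncation (the first head is forced by toss T at the latest, with the leftover tail probability lumped at N = T); this is an editorial illustration, not necessarily the exact truncation intended above:

```python
from fractions import Fraction

# Truncated Broome: first head forced by toss T at the latest, so
# P(N = n) = (2/3)**(n-1) * (1/3) for n < T and P(N = T) = (2/3)**(T-1).
# The envelopes hold 2**(n-1) and 2**n; A is picked at random.
def p_n(n, T):
    p = Fraction(2, 3) ** (n - 1)
    return p if n == T else p * Fraction(1, 3)

def expected_gain(k, T):
    """Expected net gain from switching, given that envelope A holds 2**k."""
    a = 2 ** k
    p_larger = p_n(k, T) if k >= 1 else Fraction(0)           # A larger: N = k
    p_smaller = p_n(k + 1, T) if k + 1 <= T else Fraction(0)  # A smaller: N = k+1
    total = p_larger + p_smaller
    return (p_larger * Fraction(-a, 2) + p_smaller * a) / total

T = 10
gains = [expected_gain(k, T) for k in range(T + 1)]
print([g > 0 for g in gains])  # positive everywhere except at the top, k = T
```

The expected gain from switching is positive for every observed amount except the largest possible one, where switching surely loses half; this is exactly why the "switch regardless" argument fails once the support is bounded.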


 * OK so TEP-1 necessarily has many solutions. At the same time all different TEP versions — TEP-1 included (!) — only have one solution, the Unified Solution. Well, I still don't get the logic here but let's skip this point for now. I bet it will be very clear in your paper. I think we are finally making some progress when you now admit that even fat tailed distributions with bounded support and finite expectations can cause decision theory to break down! Please insert this observation into your Unified Solution statement! You also need to specify when a tail is so fat that E(X) can't be trusted anymore. When exactly does that happen? Little Red Cap needs to get precise instructions to be able to distinguish between grandma and the big bad wolf. Without such instructions we are as lost as Little Red Cap. iNic (talk) 19:39, 22 August 2011 (UTC)


 * "The Unified Solution" is just the mathematical kernel of the whole family of paradoxes (TEP, Neckties, 2SidedCards...). You can use it to invent new paradoxes. Little Red Riding Cap just needs to be less lazy. Look at the probability distribution, not just the expectation value. Richard Gill (talk) 05:30, 23 August 2011 (UTC)


 * OK, and being less lazy means that she has to consult a professional. You are professional and are willing to help her for free. Your only advice is that she should not be that lazy. And being less lazy means that she should consult a professional. You are a professional... Not only is the little red cap running around in circles, so are we. I don't think I can make my point more clear than what I've done so I think our debate ends here. Thank you for a fun conversation! Hope it has been of some little value to you. iNic (talk) 09:47, 23 August 2011 (UTC)


 * I didn't say that being less lazy means she should consult a professional. Consulting a professional is the lazy alternative. She should sit down and try to compute the probability distribution herself, and compare various quantiles with the expectation value.


 * OK let's say she can compute the probability distribution and the expected value. And then what? Which "various quantiles" should be compared to the expected value? And when she has compared them, then what? Let's say some of the "various quantiles" are larger than E(X) while others are smaller. What shall she do with that information? At the end of the day, can she trust grandma E(X) or not? If not, why not? You can't answer these questions but you don't want to admit that. Your way to escape the situation is to say that the little red cap should be able to figure this out on her own! If she is not too lazy... But who is the lazy one in this story, really? iNic (talk) 04:28, 24 August 2011 (UTC)


 * I think you are being lazy, refusing to think about your very own truncated Broome example. I asked some questions which you didn't answer. You can compute the expectation value of the amount in a random envelope, and you can compute the probability that envelope A contains at least that amount, can't you? Have you done it yet? Do you think that expectation value is relevant? Do you think your own truncation is realistic as a model of some real life gambling game? The point is that each specific problem is going to have to be solved using specific aspects from that particular real world set-up. There is no generic answer (except: consult with a professional, or use your own brains). Real statistical consultation is done in a dialogue. We can do it if you like, but then you have to work, too. Richard Gill (talk) 07:05, 24 August 2011 (UTC)


 * Sure I can assist you in computing a lot of things if that would help you. But what's the point of doing that if you don't know how to interpret the results anyway? Of course it is a realistic game I proposed. If you want we can play it just for fun. I'm sure you will get the feel for how real it is. If the original Broome distribution with infinite (unbounded) support is realistic (you never complained about that!), why would it not be realistic to use the same distribution but now with a finite (bounded) support? There is something inherently irrational in complaining about non-realism only when something infinite has become finite! If you don't like big numbers you should hate infinities, not love them. You can't be cool when confronted with infinities and get upset over some finite big numbers. What kind of finitism is that? What is it called? If your main argument against the bounded Broome is that it's unrealistic, why didn't you say anything about this when discussing the infinitely more unrealistic unbounded Broome? You have always since you started to comment on TEP had the opinion that infinity should not be a problem in itself. (I agree with you here 100%.) All mathematics is full of it, both in theory and for making realistic models of the world. This you said over and over. And now you have suddenly become a very harsh finitist it seems, where even modestly large finite numbers should be banned from the realm of the thinkable. But the problem remains: Decision theory claims that E(X) should be our guide when making decisions. The St Petersburg paradox, which you now suddenly refer to as relevant (you never talked about this paradox in your explanations of TEP before), brought to the table the idea that it's the expected utility, E(U), and not E(X), that is the true guide for decisions. The reason is that this idea solves some basic versions of the St Petersburg paradox. But how will that idea help you to solve TEP? That idea is useless here.
You seem to claim that situations leading to TEP should not be possible to even imagine, and things that you can't imagine don't exist, and what doesn't exist we can safely ignore. But it happens to be the case that I can imagine the real world situation where someone has two or more envelopes on a table with unknown content. And that's all you need to be able to imagine to get the ball rolling. And it happens for real all the time, over and over again. You say that "There is no generic answer..." which is nothing else than giving up the idea of finding a general guide of what to do in different situations. Do you mean this is true in general? For sure, in some situations E(X) is the correct guide for how much to bet, isn't it? Or do you really claim that E(X) is never a reliable guide? If E(X) is never reliable and there is no other concept that can replace it (other than the very strange mathematical principle "consult a professional") then what you are saying is in effect that decision theory is dead. Is this your opinion? iNic (talk) 17:03, 24 August 2011 (UTC)


 * The problem is not to find another way to calculate that doesn't lead to contradictions (that is easy), but to pinpoint the erroneous step in the presented reasoning leading to the contradiction. That includes being able to say exactly why that step is not correct, and under what conditions it's not correct, so we can be absolutely sure we don't make this mistake in a more complicated situation where the fact that it's wrong isn't this obvious. So far no one has managed to give an explanation that others haven't objected to. That some of the explanations are very mathematical in nature might indicate that at least some think that this is a subtle problem in need of a lot of mathematics to be fully understood. You are, of course, free to disagree! But I think that a rather comprehensive understanding of TEP (in its many variants) requires more intuitive understanding of probability theory than most laypersons have (in advance). But of course they can use TEP in all of its ramifications in order to gain such a feeling for the subject. For instance, take the Broome example, truncate it like you propose, and see where the expectation value is, in that distribution. Which quantile is it? Ask yourself how to implement the Broome example in a real casino. These are experiences which you need to have yourself. That requires some mental labour. Or be lazy, and trust a professional. Your choice. Richard Gill (talk) 16:18, 23 August 2011 (UTC)


 * When discussing the original Broome distribution, with its infinite expectation and unbounded support, you felt no need to talk about big numbers or unrealistic casino scenarios. But as soon as the Broome distribution was limited to become a finite distribution, in every sense of the word, you suddenly felt that you had to be carried away thinking about big numbers and how incredibly large they are. Isn't this quite funny? If you can think about infinities in your daily work without getting baffled, why get baffled by some finite big numbers that are necessarily very small compared to infinity? Laypersons may be baffled by big finite numbers, but not mathematicians. That is not credible. Moreover, mathematics is about numbers abstracted from any particular unit. Even small children know this. Every big finite number is just a scale factor away from the number one. So all your arguments to the effect that big numbers can't be realized "in the real world" are just crap. Besides, mathematics is not about the real world anyway. iNic (talk) 04:28, 24 August 2011 (UTC)


 * Oh, mathematics is not about the real world! Well, then there is no paradox. Richard Gill (talk) 21:59, 24 August 2011 (UTC)


 * But feel free to ignore these ideas if they don't help you. Wikipedia has to rely on "reliable sources," right? And we editors certainly can't use it to promote "own research"! Moreover, secondary and tertiary sources are preferred to primary. So *you* can safely ignore all recent academic research on TEP. But I hope my ideas are useful to other editors, suggesting ways to organize all the material "out there" in an intelligent way so as to serve the interests of Wikipedia readers. Richard Gill (talk) 10:38, 20 August 2011 (UTC)


 * I'm commenting on your Unified Solution because you explicitly asked me to do that. I'm happy if it can be of any help to you. Of course I know that this is not directly related to the Wikipedia article as such, as it's OR. If you don't need or want my comments anymore, please just let me know and I'll stop immediately. iNic (talk) 22:00, 20 August 2011 (UTC)


 * It is important, if you want to communicate with laypersons, to understand how they think, and for this you must ... communicate with them. So for me such discussions are a valuable learning experience. You learn the most from understanding people who disagree with you. Richard Gill (talk) 08:20, 21 August 2011 (UTC)

Mystery of Julius
I agree, it is a valuable learning experience for me too to talk with you and try to understand how you think. For example, I believe the reasoning in The Mystery of Julius: A Paradox in Decision Theory, by Charles Chihara, is quite similar to your Unified Solution ideas. In particular, I suspect that the calculations on page 11 are utilizing essentially the same idea as your Unified Solution calculations. Am I right? iNic (talk) 21:04, 5 September 2011 (UTC)


 * That's an amusing philosophy paper, thanks, I did not know it yet! Yes, on page 11 Chihara shows *in a special situation* that the expected amount in envelope A is larger and smaller respectively, when it is given that it contains the larger or smaller of the two amounts. His special situation is generated as follows. Two envelopes are either filled with k and 2k dollars, or with k and k/2 dollars, each with probability 1/2. One of the two envelopes is then chosen at random and called Envelope A, the other is called Envelope B. My analysis shows that his conclusion is true in general, however the envelopes are filled with money, as long as the two amounts are different, envelope A is equally likely to contain the larger as the smaller amount, and the expected amount is finite. Richard Gill (talk) 19:28, 6 September 2011 (UTC)
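Chihara's special situation can be reproduced by brute-force enumeration of the four equally likely outcomes. A sketch with an illustrative k = 8 (the choice of k is mine; any positive k gives the same 2 : 1 ratio between the two conditional means):

```python
from fractions import Fraction

# Chihara's special case: the pair is (k, 2k) or (k, k/2), each with
# probability 1/2; one envelope is then picked at random and called A.
# Enumerate the four equally likely outcomes. k = 8 is an arbitrary choice.
k = Fraction(8)
outcomes = []
for pair in [(k, 2 * k), (k, k / 2)]:
    for i in (0, 1):
        a, b = pair[i], pair[1 - i]
        outcomes.append((a, a > b))

larger = [a for a, a_is_larger in outcomes if a_is_larger]
smaller = [a for a, a_is_larger in outcomes if not a_is_larger]
print(sum(larger) / len(larger), sum(smaller) / len(smaller))  # 12 and 6
```

E(A | A is larger) = 3k/2 and E(A | A is smaller) = 3k/4, matching the general inequality.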

Yes, but this is the minor difference. The major difference is that he uses this calculation only in the case when no envelope is opened, while you want to use this explanation for all versions of TEP. Chihara uses a different explanation for the open envelope case, an explanation that we now know is false. I agree with you in principle that the explanation for all 'cases' should be the same. However, the calculation that you and Chihara have invented doesn't explain a thing, not even the 'closed case.' If I throw an unbiased die for example, you and Chihara are telling me that I don't think that the expected value is 3.5 points. Instead you say that I have two different ideas on what the expectation is. Either the die will land with 4, 5 or 6 points up, or else 1, 2 or 3 will come up. In the first case my expectations are higher than in the second case. With equal justice I can instead say that either an odd number of points will come up or else an even number of points will come up. In the first case my expectation is slightly lower than in the second. This means that I don't have one unique expectation but several sets of expectations, each one paired with a particular partition of the sample space. For obvious reasons this idea can't explain anything. That two men have come up with this crazy idea independently of one another is the real mystery here. iNic (talk) 14:36, 9 September 2011 (UTC)


 * I don't see a problem here. My "expectations" about the outcome of a toss of an unbiased die are not encapsulated in a single number "7/2" but in six probabilities 1/6, 1/6, ..., 1/6. So my beliefs also entail how I would transform them under learning further information, e.g. whether or not the outcome is even. I can imagine in advance how my beliefs would change, whether or not I am given further information. Richard Gill (talk) 07:50, 10 September 2011 (UTC)

OK, so the six probabilities of an unbiased ordinary die are not necessarily 1/6 each?!? Gosh. And you can transform these probabilities at will to become something else by just thinking about the idealized die... Oh my. Your statements are really becoming weirder and weirder. Is this what you teach your students in probability classes? I hope not. If you do I feel sorry for them because your occult concept of probability is in direct conflict with the Kolmogorov axioms of probability. iNic (talk) 00:03, 12 September 2011 (UTC)


 * Please think before you write. First of all, I did not write that the six probabilities are not necessarily 1/6 each. In fact I agreed with you that talking about an unbiased die means that we are talking about six probabilities all equal to 1/6 for the outcomes 1, 2, ..., 6. Secondly, I also said that from this probability distribution we can also compute the probability distribution of the outcome given that it is even, or the probability distribution given that it is odd. I can imagine tossing the die, but not observing the outcome itself. Instead, I ask a friend to look and to tell me whether the outcome is even or odd. I know in each case how my beliefs will have to be adjusted; namely to: 0,1/3,0,1/3,0,1/3 or to 1/3,0,1/3,0,1/3,0. I can imagine how my beliefs would change whether or not I actually am given this information! By the Bayesian theory of decision making under uncertainty, if we are logically consistent, our prior beliefs can necessarily be encapsulated in a Kolmogorovian probability framework, and on receiving new information, our beliefs are necessarily updated by Kolmogorovian conditioning.
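The die example above is just ordinary conditioning, and a few lines of code make the updating rule concrete (an editorial illustration of the computation described in the post, nothing more):

```python
from fractions import Fraction

# Prior beliefs about one toss of an unbiased die: 1/6 on each face.
# Conditioning on a parity report is ordinary Kolmogorovian conditioning.
prior = {face: Fraction(1, 6) for face in range(1, 7)}

def condition(beliefs, event):
    total = sum(p for x, p in beliefs.items() if event(x))
    return {x: p / total if event(x) else Fraction(0)
            for x, p in beliefs.items()}

given_even = condition(prior, lambda x: x % 2 == 0)  # 0, 1/3, 0, 1/3, 0, 1/3
given_odd = condition(prior, lambda x: x % 2 == 1)   # 1/3, 0, 1/3, 0, 1/3, 0
print(given_even, given_odd)
```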

Your friend will see an odd number with probability 1/2 and an even number with probability 1/2. And it doesn't matter if your friend is imagined or not. Add this crucial piece of information to your equations and you will end up with the expected value Kolmogorov and I would suggest: 7/2. So what did you gain by adding a friend? Nothing. BTW, Kolmogorov was never a Bayesian, he never thought that probability theory was about psychology. So to honor him, please use "Bayesian framework" or something similar instead of "Kolmogorovian framework" in these contexts. iNic (talk) 18:57, 15 September 2011 (UTC)


 * Kolmogorov was indeed a frequentist as far as the application of probability theory to the real world is concerned. But his mathematical theory is neutral as to what the meaning of probability is. After all, it is a theory about sets, functions, and so on; the words used to label the objects and operations in the theory, like "probability", "expectation", and so on, are mere labels. Suggestive labels, to be sure. But you can replace "P of" by "the frequentist probability of" or "the subjectivist probability of" freely, as you like. That's the whole point. De Finetti, Savage and others (including modern mathematical economists) have argued that a self-consistent actor in an uncertain world behaves as though he is making decisions according to expectation values with respect to a probability measure, and updating his beliefs according to probabilistic conditioning. Kolmogorov rules the Bayesians, as well as the frequentists. Richard Gill (talk) 06:38, 16 September 2011 (UTC)


 * I don't agree at all with the view that the Kolmogorov theory is neutral to interpretations in this respect, but I admit that this is OR on my part so I rest my case. iNic (talk) 07:19, 16 September 2011 (UTC)


 * OK, and as you say that E(X) can't always be trusted that must mean that Bayesians are not always rational, right? Saving the Bayesian paradigm is what this is all about. iNic (talk) 08:27, 17 September 2011 (UTC)


 * The Bayesian paradigm has been saved. In TEP-1, a rational (self-consistent) subjective Bayesian decision maker has a proper prior distribution representing his prior beliefs concerning x, the smaller of the two amounts. If he imagines an amount a in Envelope A, his beliefs about b, the amount in Envelope B, must necessarily depend on a. So the step in the original argument where the conditional expectation is calculated is always wrong. OK, so then some smart guy came up with TEP-2. But any example having the property that E(B|A=a) > a for all a necessarily has E(X) infinite. In the real world, money is bounded, and indeed, the utility of money is bounded. The rational subjective Bayesian decision maker will certainly not have E(X) infinite according to his prior beliefs, and anyway, he will base his decision on his utility of receiving any particular amount of money, and his utility is certainly non-linear, bounded, and possibly not even monotone in the amount of money (at some point, having even more money becomes more and more unpleasant for the individual who has it).


 * Here you simply ignore the fat-tailed distributions. Again. As always. You seem to forget them all the time, even when I beg you not to forget them. You seem to have created yourself a convenient blind spot for these distributions in this context. I've always been fascinated by the human brain and how it tackles difficulties it can't handle. This is one way to do it: simply ignore the things that don't fit the mental picture the mind has created. Ignored problems don't have to be dealt with. Problem solved! Or at least moved out of sight. It has also been fascinating to observe how you, a proud mathematician not afraid of infinities of any kind, suddenly became an ad hoc finitist, as I call it. That is, a person who uses finitist arguments only when everything else fails. In all other contexts these persons are not finitists at all. I can't help finding these psychological phenomena extremely fascinating. iNic (talk) 23:40, 20 September 2011 (UTC)


 * I beg to differ. The logical paradoxes of TEP have been resolved. That's a question of logic and maths and language, not (directly) about the real world, as you said yourself. The problem of fat-tailed distributions remains alive and kicking for anyone who uses expectation values to guide decisions or to summarize the main features of probability distributions. Remember the Broome distribution. We were having a discussion about it and I recall that at some point you were no longer interested in talking about it. We need to do some concrete investigations and computations ... I am not going to do them on my own, but I am interested in doing them together with you. This is a situation where everyone concerned needs to "grow" their own understanding, their own intuition, by their own investigations and interaction with the phenomenon. Practical recommendation: compute the median and the mean and find out which quantile of the distribution is at the mean. Also look at the quartiles and look at a boxplot. Learn to understand the beast. Adapt the Broome model to the following situation: amounts of money are whole numbers of Euros; the total amount of money in the game is capped at 3000 Euro. You the player are one of many players who go to a casino to play the game. What would a casino charge you to play the game? The casino wants to make a decent but modest long-run income with a tiny chance of ever going bankrupt. They have initial reserves of, say, 1 000 000 Euro. There are at most 100 "plays" a day. Study first the closed envelopes case and secondly the more interesting case where the player is allowed to look in the envelope before deciding whether or not to switch. The same problems are solved every day all over the world by insurance companies and individual actuaries. It's big business and it's a big, complex science. And you want a two-line "solution" for the masses of Wikipedia readers? Perhaps you do not quite appreciate the complexity of the problem.
Richard Gill (talk) 07:34, 21 September 2011 (UTC)
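The casino exercise above can be sketched numerically. The 3000 Euro cap, the 100 plays a day and the 1 000 000 Euro reserves come from the text; everything else here (the Broome weights (2/3)^n for the pair (2^n, 2^(n+1)), the 10% markup, the one-year horizon, the random seed) is my own illustrative assumption:

```python
import random

# Truncated Broome game, a sketch: the pair is (2^n, 2^(n+1)) euros with
# probability proportional to (2/3)^n, and the total in the two envelopes
# must stay under the 3000-euro cap, i.e. 3 * 2^n <= 3000, so n <= 9.
N_MAX = 9
weights = [(2 / 3) ** n for n in range(N_MAX + 1)]
probs = [w / sum(weights) for w in weights]

# Expected payout per play if the player keeps a randomly chosen envelope:
# half the expected total in the two envelopes.
expected_payout = sum(p * 3 * 2 ** n for n, p in enumerate(probs)) / 2
print(f"expected payout per play: {expected_payout:.2f} euros")

# Monte Carlo: with a 10% markup as entrance fee, 100 plays a day and
# initial reserves of 1,000,000 euros, does the casino stay solvent for
# a year?
random.seed(1)
fee = 1.10 * expected_payout
reserve = 1_000_000.0
for _ in range(365 * 100):
    n = random.choices(range(N_MAX + 1), weights=probs)[0]
    reserve += fee - random.choice([2 ** n, 2 ** (n + 1)])
print(f"reserve after a year: {reserve:,.0f} euros")
```

The point of the exercise: once the cap is in place the expected payout is an ordinary finite number and the entrance fee question becomes routine actuarial work; but the answer is sensitive to exactly where the cap sits, since the mean is dominated by the largest allowed n.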


 * Gosh, so many incorrect statements all at once. 1. No, the logical paradoxes of TEP have not been resolved. Who do you mean solved it, and when? If you really think it's solved, why on Earth are you planning to publish a paper with your own (almost) unique solution of TEP? I guess you mean that the correct solution is found in your yet to be published paper, right? Otherwise there would be no reason to write yet another paper about TEP, would there? Well, let's wait and see how your paper is received before you or anyone else can claim that TEP is solved by your paper once and for all. If you mean that someone else has already solved TEP conclusively, please tell me who that is and when, in what publication, it happened. That important information needs to be added to the TEP page. There is no consensus whatsoever that, for example, Chihara solved TEP conclusively in 1995. You seem to be the only one to date who thinks along those lines. And still, even you two differ in your opinions on how and to what extent your solution can be used, which is why I guess you still have a case for publishing your paper. Conclusion: TEP is not yet solved. To claim the opposite is to lie. 2. No, I have never said that TEP isn't about the real world. What made you think that? On the contrary, I have explicitly said on this very talk page that "it happens to be that case that I can imagine the real world situation where someone has two or more envelopes on a table with unknown content. And that's all you need to be able to imagine to get the ball rolling. And it happens for real all the time, over and over again." I have even proposed to play TEP for real: "Of course it is a realistic game I proposed. If you want we can play it just for fun. I'm sure you will get the feel for how real it is." Conclusion: I have always said that TEP is about the real world. To claim the opposite is to lie. 3. No, it's not true that I lost interest in discussing the truncated Broome example.
On the contrary, you suddenly stopped talking about it. You have still not answered my ten questions from 17:03, 24 August 2011 (UTC) on this very page, all related to the truncated Broome example. That is over a month ago now. Conclusion: you suddenly lost interest in discussing this topic, not me. To claim the opposite is to lie. 4. No, to say that this is a situation "where everyone concerned needs to 'grow' their own understanding" is just nonsense. In your Unified Solution you only have two cases: either E(X) is finite or else E(X) is infinite. You don't have a third case where "everyone needs to 'grow' their own understanding" and where your Unified Solution breaks down. If you want to have that, please add that case explicitly to your Unified Solution. Right now you claim that TEP-2 can be solved using your Unified Solution, and the truncated Broome is an example of TEP-2. If you can't solve this example using the Unified Solution then please stop claiming that it can be solved using your Unified Solution, and explicitly state its restrictions. Being honest is as simple as that! 5. No, it's not the case that TEP is solved "every day all over the world by insurance companies and individual actuaries" as a "big business" because it's a "big, complex science." That's a fantastic claim in itself, but even more so as you in the same breath claim that TEP is very simple and already solved. 6. No, I have never said that I want a two-line solution of TEP. What we are dealing with here is your solution, coined the Unified Solution, that is indeed a two-line attempt to solve TEP. First line: "Our beliefs about A are necessarily different when we know if it is larger than B, or smaller, or know nothing." Second line: "Our beliefs as to whether A is larger or smaller than B must be different when we know what A is than when we don't. More precisely, they will depend on how much it is and not always be the same."
Disclaimer: "The only exception to these rules might occur if E(A) = E(B) = infinity. But then expectation values are no guide to decision anyway." So please don't accuse me of demanding a two-line solution of anything. All I want to know is how your two-line solution can handle TEP-2. To make the discussion less abstract I asked for your solution of the canonical example of TEP-2, which is the Broome example. If your two-line solution can't handle the most canonical of all TEP-2 examples, please just state that. Being honest is much more constructive than denying the shortcomings and trying to shoot the messenger. iNic (talk) 12:15, 2 October 2011 (UTC)


 * Gosh, so many misunderstandings! My opinion is that the logical fallacies of TEP-1, TEP-2, the two neckties, Littlewood's two-sided cards and Schrödinger's two-sided cards are solved by my unified solution. Something completely different is the problem that the expectation value of a probability distribution is a highly unstable functional of the distribution. This is a practical problem which does not have a universal, simple solution. You proposed a certain truncation of the Broome problem but never answered my subsequent questions. How should a real casino implement the truncated Broome game for their clients? To begin with, how much money should they have in advance in the bank, and how much should they ask their clients as "entrance fee"? What is the expectation value of the total amount of money in the two envelopes under your proposed truncation? Should the casino have something like this amount in the bank, in advance? What happens if you change your truncation by a small amount? Should the precise truncation level change how the game is played in practice? By the way, Blachman and Kilgour (2001) already gave very sensible advice about how to deal with the fat-tails problem. Note their simulations of truncated versions of the game. See the Wikipedia article on the St Petersburg paradox. On a different note, Samet, Samet and Schmeidler (2004) already presented a "unified solution". Without using symmetry, they show that it is impossible for two random variables A and B to both be statistically independent of the ordering between them, unless that ordering is fixed. My proof uses symmetry and shows that neither A nor B can be statistically independent of the ordering.
Their argument is more complicated than mine and more generally applicable (it does not assume symmetry), but it does require a small amount of further argumentation to get to the desired end-point in any of the famous applications (original TEP, the two-necktie problem, the two-sided card problem of Schrödinger, the two-sided card problem of Littlewood). So there are two quite distinct issues: (1) what went wrong in original TEP; (2) the problems of fat-tailed distributions. Issue (1) is resolved, under most people's reading of the intention of the writer, by the "unified solution", which shows that if your prior beliefs about x, the smaller of the two amounts of money, are represented by a proper probability distribution, then your beliefs as to whether or not b is smaller than a must necessarily depend on a. Issue (2) doesn't have an easy solution, but on the other hand it is nothing special to TEP, and plenty of people have written practical advice about what to do when confronted with this issue. Richard Gill (talk) 14:44, 5 October 2011 (UTC)


 * I agree completely that TEP and the St Petersburg problem (SPP) are two different problems. This is why they have two separate articles at Wikipedia, for example. However, I don't agree that TEP for some values of E(X) magically transforms into SPP. You say that I haven't answered your questions related to SPP. Well, I have. But I can repeat my answer if you didn't get it, no problem. My answer is simply that TEP is not the same problem as SPP. I'd love to discuss SPP with you, but that would be in a separate thread in that case. You have to prove that TEP and SPP are the same problem before you accuse me of not answering irrelevant SPP questions when I'm not talking about SPP but TEP. iNic (talk) 01:54, 1 November 2011 (UTC)


 * Thus it seems to me that the strict Bayesian paradigm (for decision-making under uncertainty) is not in any danger at all from TEP.


 * As you know I don't agree with you at all here, and I think that I have shown that. No need to repeat my arguments once more. But you are not alone in being wrong. I think I will write a paper where I will carefully show how and why all proposed solutions so far are wrong. Do you think that would be a good idea? iNic (talk) 11:13, 19 September 2011 (UTC)


 * Sure, I would be fascinated to read such a paper! In the meantime I will continue to write my own, when I find the time (now teaching term is at full pace so spare time is getting hard to come by). Richard Gill (talk) 11:19, 19 September 2011 (UTC)


 * Finally, one could imagine careless Bayesians who use rough approximations to the probability distributions representing their beliefs before doing any calculations, rather than doing it properly and exactly. Such Bayesians might use the improper uniform prior on log(x) as a matter of mathematical convenience. Or use the Broome distribution since it seems to them a good approximation for quite a few values of x. Well: lazy Bayesians have now been warned that such distributions may give misleading results. A good approximation to a probability distribution (in the sense of well approximating all conceivable probabilities) may give lousy approximations to expectation values. All probabilists have known this for a long time. Lazy Bayesians should become less lazy, or they should consult an expert. Richard Gill (talk) 15:53, 18 September 2011 (UTC)
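The warning that a good approximation in distribution can be a lousy approximation in expectation is easy to make concrete. A toy illustration (all numbers are my own): move probability 1e-6 of the mass of a distribution out to 1e9 and the distribution barely changes, while the mean jumps by about a thousand.

```python
# Two distributions that are close in total variation yet have very
# different means. Q agrees with P except that probability 1e-6 of mass
# is moved out to 1e9 (a "fat tail" atom).
P = {1: 1 / 3, 2: 1 / 3, 4: 1 / 3}
eps = 1e-6
Q = {1: (1 - eps) / 3, 2: (1 - eps) / 3, 4: (1 - eps) / 3, 1e9: eps}

# Total variation distance: half the sum of absolute differences of masses.
tv = 0.5 * sum(abs(P.get(v, 0) - Q.get(v, 0)) for v in set(P) | set(Q))
mean_P = sum(v * p for v, p in P.items())
mean_Q = sum(v * p for v, p in Q.items())
print(f"total variation distance: {tv:.2e}")  # tiny: the distributions agree
print(f"means: {mean_P:.2f} vs {mean_Q:.2f}")  # yet the means differ by ~1000
```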


 * Now going back to two envelopes, according to Bayesian dogma, our initial beliefs about what is in the two envelopes are necessarily described by a Kolmogorovian joint probability distribution of two random variables A and B, symmetric under exchange of the two variables, and such that both are positive and one is twice the other. This is because we were told that the envelopes are first filled with (for us unknown) amounts x>0 and y=2x, and that we then chose one of the envelopes at random, and called it Envelope A. Our initial ignorance of what x actually is, must be summarized by the probability distribution of a random variable X. I define Y=2X and I define A=X or A=Y, independently of X each with probability 1/2, and B=Y or B=X correspondingly. The joint (Kolmogorovian) probability distribution of the random variables A and B logically must describe my initial beliefs about the actual amounts a and b in the two envelopes (assuming that I am self-consistent in assignment and updating of beliefs).

"Our initial ignorance of what x actually is, must be summarized by the probability distribution of a random variable X." Not at all. This is only the case for Bayesians. Please state this requirement explicitly in your paper. iNic (talk) 18:57, 15 September 2011 (UTC)


 * Of course. I had hoped this was clear from the context. I don't repeat the blanket assumptions explicitly, every single time I use them. TEP is a problem in Bayesian decision theory, right? Richard Gill (talk) 06:38, 16 September 2011 (UTC)


 * We can therefore in principle imagine how our prior beliefs about the content b of Envelope B would logically be adjusted if we were to be informed what actually is the content of Envelope A. Or how our beliefs about the contents of Envelope B would logically be adjusted if we were to be informed which envelope contained the smaller amount. This thought experiment can be performed whether or not we actually are going to be given extra information. That's the whole point of Bayesian decision theory. Because our prior beliefs follow a Kolmogorovian law and will always be updated, on getting new information, by Kolmogorovian conditioning, we can imagine what we would think (and how we would therefore act) if we were given certain information.

But you haven't explained how this solves the canonical situation using the Broome prior distribution as described above. Chihara is honest enough to admit that his version of your solution only applies to the case when no envelopes are opened. Exactly how would my "belief system" be changed when opening my envelope so that I would not benefit from switching to the other envelope? You have never been able to explain that. I want a real-case application of your Unified Solution. You and Chihara are only talking about changing beliefs. Sure, I might change my attitude towards what's in the other envelope when I learn what I have in my envelope. But that observation is not enough. You or Chihara have to do the calculations that show that there is exactly no point at all in switching to the other envelope. You also have to explain exactly when your calculations (the ones we are waiting for, not the ones you have provided so far) should be used instead of the ordinary expected value when making decisions. This recipe must consist of explicit rules for when your calculations should be used and when the ordinary E(X) should be used. All this before any of them are actually used; it is not allowed to say something like "When E(X) seems to provide you with absurd results, try our new fancy E'(X) instead." In the general case we can't tell when a result is absurd, so we can't rely on that. Conversely, when we can't tell just by looking when a result of E(X) is absurd, we are really in need of such a rule. iNic (talk) 18:57, 15 September 2011 (UTC)


 * I have given you a short proof why our expectations (note the plural! I refer to a whole probability distribution, not a single expectation value) about the content of Envelope B are certainly different on knowing which envelope contains the smaller amount, under the assumption only of finite expectation values. In the meantime I figured out how to drop this assumption. First prove the result in the finite expectations case. Now in general, choose any 1-1 monotone and continuous transformation g from the whole real line onto the interval (0,1). Apply it to both A and B (which don't necessarily have finite expectation values). The transformed random variables, call them C and D, form a symmetrically distributed pair of random variables which are certainly different from one another and whose expectations are certainly finite. We conclude that whether C is smaller than D or vice versa is statistically dependent on C itself. But by applying the inverse of our transformation g we can conclude that whether A is smaller than B or vice versa is statistically dependent on A itself. Richard Gill (talk) 14:48, 13 September 2011 (UTC)

Please concentrate on finding a concrete case where your ideas can be applied before you start to explore generalizations. No one is in need of a general theory that can't be used in any single case. iNic (talk) 18:57, 15 September 2011 (UTC)


 * The Chihara paper seems to me to contain a lot of good sense, but also an awful lot of words. (Fortunately the writer has an entertaining style. And he too treats TEP as a recreational puzzle, a brain teaser, but in philosophy, not in mathematics.) The author is a philosopher and not a mathematician and has his hands tied behind his back regarding his use of language. The subtle distinctions which he wants to make concerning how we name things, an important topic in philosophy indeed, are suddenly not so subtle but instead quite natural and obvious when we make an attempt to convert his story into the language of modern probability. Because of the limitations he sets himself he can only study specific examples; he cannot draw general conclusions. On the other hand, if one needs to explain TEP to philosophers and one decides not to teach them elementary probability theory first, then his way of doing it is probably more or less unavoidable. Because of the large number of words needed to deal with the problem satisfactorily it is easy for other philosophers to find fault with some of his notions or to claim that he still hasn't really shown what goes wrong ... he has shown what goes wrong, but it is difficult to see the wood for the trees. And he only shows what goes wrong in a single example - much like Kraitchik only gives one specific worked example. No-one seems to be able to formulate the general result. Yet (in retrospect, and with the aid of Kolmogorov) it is so simple. Richard Gill (talk) 15:16, 13 September 2011 (UTC)

I'm a bit surprised that you like this paper. If you asked me a while ago I would have thought that you would put this paper in the "woolly" category of papers. iNic (talk) 18:57, 15 September 2011 (UTC)


 * Now I come to think of it, maybe I did glance at this paper before, but it was so painfully slow and so verbose that I never bothered to find out what was in it. Now that I understand the mathematics of these paradoxes and know the issues of the philosophers and know their jargon, I am able to read the philosophy papers rapidly, and can easily find out if they have any interesting content or not. The Chihara paper contains a lot of good sense though it does use an exorbitant number of words to get across its ideas. One picture is a thousand words. A mathematics formula is a picture. Schwitzgebel and Dever claim to be the first to explain "what actually goes wrong" in TEP but I don't think they are first at all. The mathematicians were before them, but unfortunately the philosophers can't read mathematics. Probability calculus was invented precisely so that we did not have to spend thousands of words on silly paradoxes any more, but could move on and do some real science. I am still of the opinion that philosophers who can't do elementary probability shouldn't write papers on TEP. This whole group of problems was invented by mathematicians in order to tease non-mathematicians! Richard Gill (talk) 16:02, 18 September 2011 (UTC)

If this was originally meant as a joke to tease lay persons it has backfired completely. Very amusing indeed. iNic (talk) 17:22, 19 September 2011 (UTC)


 * iNic, I would be interested to hear your views on my proposed solution User:Martin Hogbin/Two envelopes. I too would like to see a simple solution; I just think the proposed one is too simple. Martin Hogbin (talk) 15:57, 23 October 2011 (UTC)


 * Your solution is too long and complicated to be true. Moreover, what you and many others miss (Gill included) is that a correct solution is not merely to pinpoint the erroneous step(s) but to formulate some rule(s) such that when following these rules similar mistakes will automatically be avoided. Such rule(s) must be clear and objective. The Gill rule that only people with a proper understanding of probability theory are allowed to even think about TEP is an example of such a rule that is neither clear nor objective. iNic (talk) 02:09, 1 November 2011 (UTC)


 * Gill's rules: (1) distinguish between probability coming from randomness (which envelope did you select) and probability coming from (lack of) knowledge (your prior beliefs concerning the smaller and larger amounts in the two envelopes); (2) distinguish between actual amounts of money (whether known or unknown) and possible amounts of money; (3) distinguish between random variables and possible values thereof; (4) distinguish between actual values of random variables, random variables themselves, and expectation values of random variables; (5) distinguish between expectation values and conditional expectation values; (6) realize that expectation values are only useful guides to behaviour when they are not altered much by truncating "unimportant" parts of the tails of a probability distribution. It's a historical fact that probability notation and probability calculus and probability language were invented precisely in order to be able to make these distinctions. Because of the Anna Karenina principle it's a delusion to imagine that some clear and objective rules would prevent such mistakes being made in the future. So far I am aware of six different, and every one of them legitimate, interpretations of the intention of the writer of TEP, and for each of these interpretations different low-level mistakes are being made, but always one or more of the mistakes I just listed. Some time in the future someone could invent a new interpretation where possibly a new mistake is being made. First Anna Karenina bifurcation: are we using subjective probability about x as well as randomness as to whether a=x or a=2x, or only the second, "physical", randomness? Second bifurcation: are we trying to compute a conditional or an unconditional expectation value? That makes four interpretations so far, right? Third bifurcation: if we are using subjective probability, are we using a proper or an improper prior? I think that makes six interpretations now.
If you read the history of TEP you must also study Schrödinger's two-sided cards problem, Littlewood's two-sided cards problem and Kraitchik's two-necktie problem. These are all similar problems where similar mistakes are being made. Actually the oldest variants explicitly use subjective probability with improper priors. It's a fact that familiarity with probability calculus makes it easy to understand them all. I would suggest that iNic takes a class in probability theory before telling us that my advice is stupid. After all, elementary probability theory is pretty easy. If you don't have an understanding of basic probability theory you can't read half the literature on TEP. No wonder, then, that you think the problem hasn't been solved. You are putting yourself in the company of all those writers who indeed cannot solve TEP because they do not possess a language which is rich enough to make the necessary distinctions. With regards to Martin's simple solution, which iNic thinks is too complicated: it is very simple in the sense of assuming the least possible sophistication on the part of the writer of TEP. It is a solution which various writers have also expounded (chiefly the philosophers). Unfortunately, as far as I know, this solution won't help you solve the two neckties or the two-sided cards, where Bayesian priors are explicitly involved, and where conditional expectations are explicitly being taken (not unconditional). Richard Gill (talk) 13:56, 1 November 2011 (UTC)
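Rule (6) above is the one that separates the Broome-type examples from harmless ones, and it is checkable by computation. A sketch (the Broome weights (2/3)^n for the pair (2^n, 2^(n+1)) are standard; the comparison with a geometric distribution is my own choice): the truncated mean under the Broome prior never settles down as the truncation level rises, while for a distribution with finite expectation it converges quickly.

```python
# Mean of value(n) under weights weight(n), n = 0..n_max, renormalized.
def truncated_mean(value, weight, n_max):
    w = [weight(n) for n in range(n_max + 1)]
    return sum(value(n) * wn for n, wn in enumerate(w)) / sum(w)

for n_max in (10, 20, 40):
    # Broome prior: smaller amount 2^n with weight (2/3)^n; mean explodes.
    broome = truncated_mean(lambda n: 2 ** n, lambda n: (2 / 3) ** n, n_max)
    # Geometric prior: value n with weight (1/2)^n; mean stabilizes near 1.
    geom = truncated_mean(lambda n: n, lambda n: (1 / 2) ** n, n_max)
    print(f"cap n <= {n_max:2d}: Broome mean ~ {broome:14.1f}, geometric mean ~ {geom:.5f}")
```

So for the geometric prior the expectation is a trustworthy summary under rule (6); for the Broome prior it is not, which is exactly why expectation-based switching arguments derail there.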


 * If I had to keep all these distinctions in my mind every time I crossed a street I would be dead by now. And yet you demand that all, literally all, have to think about these distinctions every time they think about probabilities or use the word "probable". TEP is a very simple problem. The correct solution of a simple problem is always simple. The six distinctions above that we need to keep in our minds according to you are not a solution, let alone a simple and clear one. Let's say I'm playing poker with some friends. I have a certain hand and some of the other cards in the deck have been revealed during play. I make some quick probability calculations of my chances in my head, based on some expected values. When doing this I try to keep your six distinctions in my head simultaneously. But is this really enough to make sure I didn't commit the TEP fallacy in some new way? I can't see how these things are related at all. And by the way, where is your Unified Solution in this list? iNic (talk) 01:05, 4 November 2011 (UTC)


 * If you followed a first course in probability theory carefully, all those distinctions would come naturally to you. But we can also summarize the solution into one: make all assumptions explicit. TEP is complicated because different writers make different sets of assumptions. Philosophers, statisticians and economists each tend to imagine that the writer of TEP belongs to their world and makes the assumptions which come naturally to themselves. They then find that he derails at a different place. My unified solution: that is the mathematical fact that proper prior beliefs about the amount of money in one envelope can't remain the same when we know that it contains the smaller or larger amount, and conversely, that knowing that one envelope contains the smaller or larger amount would change your beliefs about the amount in it. This is a short explanation of what goes wrong under any reading of TEP with a proper prior on the amounts, or no prior at all (they are fixed, e.g. without loss of generality, at 2 and 4 currency units). What goes wrong with long-tailed distributions is another matter: that belongs to the interpretation of expectation values as guides to decision making. I have just been reading the three papers by (Nickerson and) Falk. She says *all* these things too. Let's face it, iNic, TEP appears a simple problem but the literature proves that it isn't. Our task as Wikipedia editors is not to solve it ourselves but to survey the literature. If we are not able ourselves to make the distinctions which it is necessary to make in order to resolve the paradox, we are not qualified to be writing about it. Can the blind lead the blind? Richard Gill (talk) 08:32, 4 November 2011 (UTC)


 * It's interesting that you say that all these distinctions are taught in a first course of probability. That is not true. The first distinction is almost never mentioned. It's always mentioned in a first course in the philosophy of probability, but very seldom in a first course in probability calculus. If mentioned at all, it is only to state the opposite: that it doesn't matter what philosophical interpretation one adopts, the probability calculus is exactly the same regardless. Incidentally, this was also for a long time your own opinion when you thought about TEP. You vigorously claimed that the philosophical interpretation of probability was totally irrelevant when thinking about and solving TEP. And then you suddenly changed your opinion about this. You even made this dramatic change of view explicit in the draft of your paper. Now it's apparently your distinction number one... Distinction number 6 is not mentioned in a first course in probability either. This is mentioned in a first course in decision theory, utility theory or mathematical economics, not probability theory. Probabilists usually only consider two cases: either E(X) is finite/defined or else E(X) is infinite/undefined. You are yourself not an exception to this rule. Modern probabilists don't attach any special meaning to E(X), like it would be morally good to maximize E(X) or even that E(X) describes a 'fair betting value.' They see it as just one of the functions that can be part of the characterization of a distribution, together with the variance, moments of higher order, the characteristic function or moment generating function, and so on. Some classes of distributions don't have expectation and variance defined, but so what? This is not a problem for the probabilist, as these distributions can be characterized in other ways. Distinctions 2 to 5 are indeed taught in a first class of probability calculus, but they are unfortunately irrelevant for solving TEP.
This is easily seen by noting that TEP can be formulated without using a formal mathematical language at all. Smullyan did that in the most succinct way. iNic (talk) 11:32, 5 November 2011 (UTC)


 * In the courses I teach myself and in the course-books I use, distinction 1 is always mentioned. As far as the probability calculus is concerned, the distinction is irrelevant: the calculus is the same, at least for proper Bayesians, as most serious Bayesians are nowadays. In order to interpret the results in practice, one needs to be able to make a translation between mathematical model and reality. The frequentist and the Bayesian translation books are different. Unless of course one wants to allow improper priors - then one steps outside of the standard Kolmogorov framework. Distinctions 2 to 5 are essential for all the standard formulations of TEP. Smullyan's paradox is different, since by definition it is a version of the problem which does not refer to probability at all. Still, the solutions of Smullyan's TEP all revolve around the sin of equivocation. It seems you want there to be one magic principle which would enable one to solve any new variant of TEP which anyone ever invented or will invent. It would be: clear thinking! Be careful to distinguish things than need to be distinguished! Then the problems of infinite expectations or more generally of fat tailed distributions. These are important to anyone who wants to use probability in the real world.  NIckerson and Falk (2005) also cover this issue. The connection with Saint Petersburg paradox has been known for years. Wikipedia editors have got to summarize the literature out there, in an accessible way. The literature about TEP is all about resolving problems of equivocation. Probability calculus, concepts, and language were developed to resolve problems of equivocation. In my opinion, once we bear in mind the Anna Karenina principle, it is not so difficult to harmonize all the different solutions out there. This was also one of the main points of Nickerson and Falk (2005). Have you read it? 
It is not superseded by the two short papers by Falk and by Falk and Nickerson, which both focus on one common interpretation in the context of teaching elementary probability. Nickerson and Falk has a much wider range. Richard Gill (talk) 12:54, 5 November 2011 (UTC)


 * It's really great if distinction 1 is often mentioned in first course undergraduate texts, and not just in the statistics literature (or the 'translation books' as you call them). Can you please give a reference to one of the text books you have in mind here? I would love to see how this topic is presented. I for sure know what my duties as a WP editor are. To distinguish between personal opinions and what is stated in the literature is one of the most important ones. This is why I never ever reveal or discuss my own opinions at WP. However, I have made my sole exception to this rule here at your talk page only to be able to answer questions where my opinion is asked for. So yes, I do actually believe there is a 'magic principle' that solves all variants of TEP in one blow. I will explain that idea in my forthcoming paper. Yes, I have read the Nickerson and Falk paper in 2006 when it was published. I have read all the papers listed at the sources page, usually soon after their publication. iNic (talk) 02:12, 6 November 2011 (UTC)


 * At the moment I'm teaching from John Rice's book "Introduction to mathematical statistics and data analysis", and Wall and Jenkins's book "Practical statistics for astronomers". These are books on statistics which are both strongly grounded in applications, and for them the issue of translation is important. I don't call them translation books because statistics is much more than probability. It uses probability, sure, but also a whole lot more. These are books about Statistical Science, not books about (parts of) mathematical statistics. Notice the "and" in the title of Rice's book. I look forward to learning your "magic principle". Richard Gill (talk) 08:44, 6 November 2011 (UTC)


 * Thank you, but these are texts for statistics courses and not for "a first course in probability theory." That distinction (1) is mentioned when dealing with statistics is no news to me, as I also stated. What you promised me was a reference that backed up your claim that this distinction is always mentioned in a first course in probability theory, not statistics. Remember that you said "If you followed a first course in probability theory carefully, all those distinctions would all come naturally to you"? But never mind. Do you happen to have a good contemporary reference on how to handle the problems of fat-tailed distributions? Didn't find much here. iNic (talk) 11:18, 6 November 2011 (UTC)


 * I suspect that most people's first course in probability is based on the first half of an introductory book on statistics, which is typically devoted to a "user's" course in probability. And the user needs to learn something about where the tools he or she is learning can be used. On the other hand, courses in pure probability theory for mathematicians are based on other books, which are usually only interested in the formal (Kolmogorovian) theory, which is neutral as to whether one wants to use it for subjective probability or frequentist probability. For fat-tailed distributions, I've started a new section (bottom of page). Richard Gill (talk) 13:37, 6 November 2011 (UTC)


 * Richard, where is the link to your new draft paper? Martin Hogbin (talk) 15:57, 23 October 2011 (UTC)


 * All over the place! But in particular: . Richard Gill (talk) 19:19, 23 October 2011 (UTC)

Quantum Two Envelopes Problem
Alice and Bob are located in Amsterdam and Beijing. Caspar, at Cape Town, can send them both information and indeed envelopes or packages (FedEx). I'll call them envelopes.

Alice, Bob and Caspar are playing a game against Xanthippe and Yolanda, who are also located in Amsterdam and Beijing, respectively.

For a thousand days, Caspar is to send daily envelopes to Alice and to Bob. Every day, Xanthippe and Yolanda are going to toss coins and report "H" or "T" to Alice and Bob. Alice and Bob are then each allowed to inspect the contents of their daily envelope, write a number on it, and surrender it to Xanthippe and Yolanda.

Alice, Bob and Caspar win against Xanthippe and Yolanda if every time that both coins fall heads, Alice's number is larger than Bob's, but every other time, Alice's is smaller than Bob's. Otherwise, Xanthippe and Yolanda win.

By saying that Xanthippe and Yolanda toss coins I did not mean to imply that these would result in random, fair coin tosses. Actually all Xanthippe and Yolanda have to do is provide a daily "H" or "T".

Xanthippe and Yolanda are even allowed constant phone contact. Alice, Bob and Caspar are allowed daily telephone conferences. But they are totally isolated from one another between the time they receive their daily coin-toss, and the time they commit their numbers to the envelopes.

Q: Can they win?

A: Yes, see Richard Gill (talk) 09:37, 16 August 2011 (UTC)
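The win condition can be pinned down as a small referee check. This is only a restatement of the rules above in code (the function names are mine, and nothing here hints at how Alice, Bob and Caspar actually manage to win):

```python
def abc_win_day(coin_amsterdam: str, coin_beijing: str,
                alice_number: float, bob_number: float) -> bool:
    """One day of the game: if both reported coins are heads, Alice's
    number must exceed Bob's; in every other case it must be smaller."""
    if coin_amsterdam == "H" and coin_beijing == "H":
        return alice_number > bob_number
    return alice_number < bob_number

def abc_win_game(days) -> bool:
    """Alice, Bob and Caspar win only if they win every single day;
    otherwise Xanthippe and Yolanda win."""
    return all(abc_win_day(*day) for day in days)
```

Any proposed strategy for the trio can be tested against this referee over a thousand simulated days.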

Mathematical Typography
I'm playing with MathJax; it seems to me to be the way of the future, maybe. Here are two little examples.

$$\sqrt{\vphantom{I}} n \bigl(\hat\theta_{\text{MLE}}-\theta_0\bigr)~\Rightarrow ~ \mathcal N\,\bigl(\, 0\,, \mathcal I(\theta_0)^{-1}\bigr)$$

$$ \Pr(T_E\gg t_E)~=~\prod_{A\subseteq E}\,\,\prod_{s_A\in(0_A,t_A]}\, \Biggl(\prod_{B\subseteq A}\Pr\Bigl(T_{A\setminus B}\gg s_{A\setminus B}\Bigm|T_A\ge s_A\Bigr)^{(-1)^{|B|}}\Biggr) $$

The same formulae are on my university home-page, where they should automatically render in MathJax. Here on Wikipedia, that will only happen for you if you are a Wikipedia user, are logged in, and have previously set up your Wikipedia skin (preferences...) in the right way. It's not difficult, but it is hard to find out how to do it! I hope to add the instructions soon. Richard Gill (talk) 18:05, 22 August 2011 (UTC)

http://en.wikipedia.org/wiki/Haplogroup_J1_(Y-DNA)
Please review the map, the work of Tofanelli et al, Hassan et al, and comment in the discussions:

http://en.wikipedia.org/wiki/File:HG_J1_(ADN-Y)

http://hpgl.stanford.edu/publications/AJHG_2004_v74_p1023-1034.pdf http://ychrom.invint.net/upload/iblock/94d/Hassan%202008%20Y-Chromosome%20Variation%20Among%20Sudanese.pdf

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC384897/figure/FG1/ http://en.wikipedia.org/wiki/Haplogroup_J1_(Y-DNA)

http://en.wikipedia.org/wiki/Talk:Haplogroup_J1_(Y-DNA)

Essentially, the issue is whether J1 dominates in Sudan and the Caucasus at over 60%. John Lloyd Scharf 10:04, 23 August 2011 (UTC)
 * Well, what do reliable sources say? Richard Gill (talk) 16:37, 23 August 2011 (UTC)

Two envelopes again
Richard, a discussion has started concerning Falk's 2008 paper on this subject and how this should be represented in the article. I would be interested to hear your opinion. Martin Hogbin (talk) 11:53, 15 October 2011 (UTC)


 * Thanks, I will have a look. Have been extremely busy with other things for quite a while. This is on the main TEP article talk page? Richard Gill (talk) 08:59, 16 October 2011 (UTC)
 * Yes. Martin Hogbin (talk) 11:48, 16 October 2011 (UTC)

Heavy tailed distributions and TEP
This is a continuation of a discussion with iNic above ("mystery of Julius")

Try Googling "fat-tailed distributions in finance", or "fat-tailed distributions in climate research", to find contemporary references on how the problems of such distributions are handled in different fields. Do it also with "heavy-tailed" instead of "fat-tailed". Why should there be one solution for all applications? Maybe you simply have to study each particular case on its own merits.

Suppose you and I play the TEP game. I fill the two envelopes by secretly tossing a coin with success chance 1/3 as long as it takes to get the first success. I'll put two to the power of the number of failures before the first success (0, 1, 2, ...) Euros in one envelope, twice that in the other, and let you pick an envelope at random. Suppose I have a capital of 1 million Euros and I want to offer this game to lots of people. How much should I charge people to play the game? And how much should I charge for the opportunity to look in your envelope and decide whether or not to switch on the basis of what you see? These prices should be fixed so that on the one hand lots of people are attracted to play the game (like lots of people buy lottery tickets) while I make a steady profit. Should I take out insurance against being forced to pay out more money than I have? The first problem is the basic problem of every actuary: how to fix insurance premiums. The second problem is a problem of re-insurance.
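The filling procedure can be sketched directly in code; this is just my reading of the description above (coin with success chance 1/3, amounts 2^k and 2^(k+1) where k counts the failures before the first success):

```python
import random

def fill_broome_envelopes(p=1/3, rng=random):
    """Toss a coin with success chance p until the first success;
    with k failures beforehand, the envelopes hold 2**k and 2**(k+1) euros."""
    k = 0
    while rng.random() >= p:
        k += 1
    return 2 ** k, 2 ** (k + 1)
```

The player then picks one of the two returned amounts at random.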

Exercise: compute the first 20 probabilities (i.e. that the smaller of the two amounts is 1, 2, ..., 2^19 Euros). Note that 2^19 is a bit more than half a million, and the chance that the smaller envelope contains this amount is 15 in a hundred thousand. Small, but ... there is a 3 in ten thousand chance that the smaller amount would be even more, ie, more than 1 million Euros! It looks to me as though I cannot offer the game at all, without truncating it quite heavily.
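A sketch of the exercise (nothing here beyond the geometric distribution with success chance 1/3):

```python
p, q = 1/3, 2/3

# P(smaller envelope holds 2**k euros) = q**k * p, for k = 0, 1, 2, ...
probs = [q**k * p for k in range(20)]

print(probs[19])   # about 1.5e-4: 15 in a hundred thousand for 2**19 euros
print(q**20)       # about 3.0e-4: the chance the smaller amount is larger still
```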

Now investigate the feasibility of offering this game at a casino in some truncated form. For instance, let's investigate truncating it at 2^15 Euros (smaller amount), by which I mean that if I need more than 16 coin tosses to get the first success, I start again, tossing this coin and waiting for the first success. And if that attempt also fails within 16 tosses, I start over again....

The maximum prize is now 2^16 Euro, ie 66 thousand Euro. The expected amount of money in the envelope with the smaller of the two amounts is 100 Euros, in the other is 200, so I could charge say 300 Euros for playing the game once and make an expected profit of 150 Euro's per game. In view of tax and overheads and bonus and shareholders etc that's not unreasonable. But I think the game could look attractive to players. The ratio of maximum prize to entrance fee is about 200 and the chance of getting that maximum prize is much better than in a usual lottery. This could be a cool game in a high class casino.
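The quoted figures can be checked directly; this is my own computation of the conditional expectations under the restart rule described above:

```python
p, q = 1/3, 2/3
N = 16   # at most 16 tosses, i.e. k = 0..15 failures; smaller amount at most 2**15

norm = sum(q**k * p for k in range(N))                    # chance of success within 16 tosses
e_small = sum(2**k * q**k * p for k in range(N)) / norm   # roughly 99 euros
e_large = 2 * e_small                                     # roughly 198 euros

# a 300-euro entrance fee then leaves an expected profit of about 150 euros
profit = 300 - (e_small + e_large) / 2
```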

OK, so now we have a truncated version which is more or less realistic. Now change the truncation level, say two steps up, and then again, but now two steps down. You'll find that this produces dramatic changes in the charges which the bank should ask to play the game, in the attractiveness of the game to gamblers, and indeed in whether or not we can come up with a game which is interesting for the casino because it is interesting to players while generating a good income to the casino. For all these truncated games it's the case that unless the player's envelope contains the largest possible amount, it's to his advantage - in terms of expected value - to switch to the other envelope. And the advantage is by a fixed ratio (except when the amount in the envelope is the smallest). Not as big as 5/4 but still clearly bigger than 1.
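The sensitivity to the truncation point is easy to exhibit numerically (same model as above, truncation two steps down and two steps up):

```python
p, q = 1/3, 2/3

def expected_smaller(n_tosses):
    """Expected smaller amount when the game restarts whenever
    the first success takes more than n_tosses tosses."""
    ks = range(n_tosses)   # k failures => smaller amount 2**k
    norm = sum(q**k * p for k in ks)
    return sum(2**k * q**k * p for k in ks) / norm

for n in (14, 16, 18):
    print(n, expected_smaller(n))   # roughly 55, 99 and 177 euros respectively
```

Each two-step change in the truncation level scales the stakes (and hence the fair entrance fee) by a factor of nearly two.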

Moral: the untruncated game cannot be played. The results of analysis of the truncated game - for practical application - depend sensitively on where it is truncated. (And probability theory tells us that these features are not special to the particular example, but are generic.) Richard Gill (talk) 13:40, 6 November 2011 (UTC)


 * Your reasoning is not correct, for a number of reasons. You will understand why if you try to answer the questions I have posed to you that you never dared to answer. Anyway, I still haven't found the standard reference on how to handle fat-tailed distributions, despite your excellent instructions on how to use the Internet. If this is a multi-million-dollar industry, as you claim, there must be a standard text printed somewhere, right? iNic (talk) 00:26, 8 November 2011 (UTC)


 * I don't recall that there were any unanswered questions remaining from our earlier exchanges. At least, there was nothing new I could say beyond what I had said before. If my answers didn't satisfy you then that's too bad. And regarding those fat tailed distributions, my Google search produced several textbooks. So yes there are a number of standard printed texts. I suggest we drop this conversation now. I am looking forward to seeing your own ideas worked out. Richard Gill (talk) 10:13, 8 November 2011 (UTC)

OK, I can help you remember. Below is what I wrote. I have made the questions bold so that you can locate them more easily.
 * Sure I can assist you in computing a lot of things if that would help you. But what's the point doing that if you don't know how to interpret the results anyway?


 * I think I would have no trouble interpreting the results. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * OK but why don't you just tell us how you think then? All you have said is that E(X) should be compared to "various quantities." You haven't revealed which quantities you have in mind nor what the conclusion would be if these quantities are larger or smaller than E(X). This is one of the other questions I have posed that you never answered. iNic (talk) 21:32, 9 November 2011 (UTC)


 * Of course it is a realistic game I proposed. If you want we can play it just for fun. I'm sure you will get the feel for how real it is. Want to play?


 * With real money? Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * Sure. iNic (talk) 21:32, 9 November 2011 (UTC)


 * If the original Broome distribution with infinite (unbounded) support is realistic (you never complained about that!), why would it not be realistic to use the same distribution but now with a finite (bounded) support?


 * It could be realistic. Depends very strongly on how you truncate. As I just illustrated. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * OK so a narrow truncation leads to a realistic game. A truncation beyond a certain magical point throws the game into wonderland. And no truncation at all (or infinitely far away) and we are back in reality again? Is that how you think? If so, isn't this ontology quite strange? iNic (talk) 21:32, 9 November 2011 (UTC)


 * A narrow truncation leads to a realistic game. The further out it is truncated the less realistic it becomes. Infinitely far out is equally unrealistic to extremely far out. The important point I want to make is that narrow truncations at different points lead to quite different games. I illustrated that with actual computations on the Broome game. When we start deciding the rules of the real game we are going to play, the choice of truncation point is going to make a huge difference to the game. Richard Gill (talk) 09:47, 15 November 2011 (UTC)


 * OK, so no truncation leads to an unrealistic game and that is the reason to ignore this case. But why didn't you say that when explicitly dealing with this case in your Unified Solution? Instead you said this: "The only exception to these rules might occur if E(A) = E(B) = infinity. But then expectation values are no guide to decision anyway." Not a word about this case being totally unrealistic and that it therefore can safely be ignored. Instead you say that in this case expectations are not reliable. But if this situation can never ever happen who cares if E(X) is reliable then or not? Nor did you mention anything about the finite case becoming gradually more and more unrealistic as the expectation becomes larger and larger. Not a word. I really hope you will handle this in a more honest way in your article. iNic (talk) 04:33, 20 November 2011 (UTC)


 * Just because a case is unrealistic doesn't mean it can be ignored. It depends whether we are interested in solving mathematical problems within some standard mathematical framework, or whether we are interested in solving real world problems. Since TEP is a brain teaser about a sequence of apparently logical steps which leads to an obviously nonsensical conclusion, it should be solved by dissecting the logic, not by complaining about lack of real-world realism.


 * Your view here is very interesting. It was a quarrel about exactly this view that sparked off the development of the theory of probability itself, over 350 years ago. Philosophical quarrels have a tendency to be very old so that isn't the interesting thing here. What is interesting is that at that time it was the mathematician, Blaise Pascal, that had my view while the non-mathematician, Chevalier de Méré, defended more or less your position. Pascal had been provoked by de Méré's view that mathematics has nothing to do with reality and that mathematics, in itself infinitely beautiful and perfect, could not in principle be applied to the finite, ugly and imperfect world we call reality. So when applying mathematical results to reality, said de Méré, we have to be very cautious, in particular when the mathematics we use involve infinities. To show that mathematics can go astray even in finite cases he referred to two problems involving games of chance, one of them invented by him. To prove that de Méré's philosophical view regarding the relationship between mathematics and reality was wrong Pascal solved both of the problems. His and Fermat's correspondence regarding these two problems became known and made other mathematicians interested in probability calculations in general, and the rest is history. Now, since the time of Pascal something very peculiar has happened to this philosophical view as mathematicians today oddly enough defend de Méré's position: we have to make a sharp distinction between "real world problems" and problems in "pure mathematics," and any application of mathematics to reality must be done cautiously, in particular when infinities are involved. Just as you say above. iNic (talk) 12:16, 26 November 2011 (UTC)


 * That's an amusing story indeed! I believe that de Méré was not such a good pure mathematician. And I thought that Pascal was mostly concerned with doing the mathematics well. But at least the Chevalier's scientific good sense seems to have been sound. Richard Gill (talk) 18:38, 26 November 2011 (UTC)


 * Yes it's funny how history repeats itself, but now with the roles swapped! Sure, de Méré wasn't a good pure mathematician (he never claimed he was) but he was an experienced gambler, and he was a far better mathematician than Pascal and Fermat were gamblers. Pascal failed to solve, and even to grasp, the philosophical problem de Méré wanted to illustrate in his first example with the dice. And both Pascal and Fermat solved the second problem about the fair stake in the wrong way, because they were both math heads and not gamblers. We still live in the aftermath of these errors today. iNic (talk) 03:30, 6 December 2011 (UTC)


 * And I am sorry that you did not have the imagination yourself to come up with the pretty obvious observations which you just made, and instead were angry with me for not mentioning them earlier. It seems to me that you ought to be taking some time out studying elementary probability theory in order to build up your own intuition about these things. Do that first and come back to TEP later. You still haven't said what you think is wrong with my calculations on the truncated Broome game. Do you not agree with the numerical results? Do you have problems with the conclusions I draw about them? Do your homework first, complain after, supporting your complaints with proper documentation. Richard Gill (talk) 20:23, 20 November 2011 (UTC)


 * Your numerical results are fine, but are they relevant? I don't think so, and my plan is to try to prove it by simply playing the Broome game with you. My hope is that a direct real-world confrontation with the game will show you that your calculations are irrelevant. To take an analogy from history: when Galileo Galilei tried to convince the men of the Church that Jupiter had moons, he invited them to have a look in his telescope so that they could see the moons with their own eyes. But they refused. Why? According to the Bible it was clearly impossible for Jupiter to have moons, so there was absolutely no point in looking in the telescope. What is true a priori can't be refuted by observation. In the same way you claim that it's impossible to observe a Broome game for real because that would contradict the theory. iNic (talk) 10:46, 27 November 2011 (UTC)


 * I do not claim that it is impossible to observe a Broome game for real. What we will see will depend on the rules we agree on, and also very much on how many times we play. Just playing once (whether or not the game is truncated) will prove nothing, with very high probability. With tiny probability we will see that one of us has to break the rules we had agreed on. Unless we set the rules with such heavy truncation that there is no way either of us has to default. In which case it is hardly worth playing the game at all since a computer simulation is just as good. When we do it lots of times, we will start to learn empirically whether a particular set of rules can be played for real. Richard Gill (talk) 11:17, 27 November 2011 (UTC)


 * OK, so now suddenly we have to play many times? Yet again a new parameter thrown in just like that: the number N of games to play. You never talked about that before. The expectation E(X) is exactly the same no matter how many times we play. The same goes for utility E(U). E(•) is not a function of N, which you ought to know. You seem to have some other concept in mind here. But which one? Decision theory in general, as well as TEP, is explicitly concerned with the case when we have a unique situation. Most decisions we have to make are unique, and after the decision reality is changed so much that a repetition is impossible or not even imaginable. Take the case of deciding whether to go to war or not. Not a situation that can be repeated even twice. Anyway, even putting that big misunderstanding aside, your extra requirement is inconsistent unless you claim that distributions with infinite support can't be used in any single case, and that we have to truncate the tail(s) and repeat the situation so many times that we can detect the truncation statistically. Is this what you teach your students when applying for example a normal distribution to a real case? If you do, that is just crazy. If you don't, then you are inconsistent. iNic (talk) 03:30, 6 December 2011 (UTC)


 * There is something inherently irrational in complaining about non-realism only when something infinite has become finite! If you don't like big numbers you should hate infinities, not love them. You can't be cool when confronted with infinities and get upset over some finite big numbers. What kind of finitism is that? What is it called?


 * I don't get upset and I don't complain. This is nothing to do with "finitism". Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * True, this is not finitism at all. But what is it? This view for sure deserves a name. An appropriate name would be "Expectation Finitism." This is the view that when E(X) is infinite we must restrict mathematics itself to some arbitrary but finite part of reality, in such a way that E(X) becomes finite under those restrictions. The rationale for this is that "reality is finite". But in all other cases, when E(X) is finite from the outset, no restriction to any arbitrary finite part of reality is required. Now suddenly reality isn't finite at all. This is thus a view that is inherently inconsistent. Despite this, it's a quite common view, even today, when discussing SPP. iNic (talk) 21:32, 9 November 2011 (UTC)


 * This is not my point of view. Richard Gill (talk) 21:06, 20 November 2011 (UTC)


 * If your main argument against the bounded Broome is that it's unrealistic, why didn't you say anything about this when discussing the infinitely more unrealistic unbounded Broome?


 * I don't waste time saying things which are completely obvious. And I don't waste time answering silly questions. So silence is an answer. It means: do a bit of thinking for yourself, and reconsider the question. It's the answer of a Zen master when his disciple asks a foolish question. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * What is completely obvious to you might not be completely obvious to me, and vice versa. That your lack of answers to my questions meant that you actually did answer my questions as a true Zen master was for sure not obvious to me. For me this is not a silly question; on the contrary, it's the heart of the matter. So you say that it's completely obvious that the unbounded Broome is unrealistic? Well, in that case all distributions with infinite support are unrealistic for the same reason. Most of the distributions here have infinite support, including most of the distributions specifically made for use in physical theories. You still think it's a silly question? iNic (talk) 21:32, 9 November 2011 (UTC)


 * Whether or not a probability distribution with infinite support is useful in practice depends on what kind of use you want to make of it in practice. Presumably your aim is to deduce some practical consequences. Do those consequences depend heavily on what is going on in the extreme tail of the distribution or not? Are they totally changed when you truncate the distribution somewhere far out in the tails? Do the conclusions depend very much on where exactly you truncate, far out in the tails of the distribution? This is something which can be investigated anew for each new application. You could call it sensitivity analysis. Do the conclusions of mathematical modelling depend strongly on features of the model about whose real-world counterparts you know nothing? Sometimes it matters, sometimes it doesn't. If the property of the probability distribution which is important for you is its expectation value, and if the theoretical distribution has infinite expectation, then different truncations will lead to distributions with vastly different but finite expectations. You seem to have difficulty appreciating this rather elementary mathematical fact. Richard Gill (talk) 20:34, 20 November 2011 (UTC)
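Such a sensitivity analysis can be illustrated with a distribution other than the Broome one: a Pareto distribution with tail index 0.8, whose mean is infinite, truncated at various points (the closed-form conditional mean below is standard; the function name is mine):

```python
def pareto_truncated_mean(T, alpha=0.8, xm=1.0):
    """Mean of Pareto(xm, alpha), which is infinite for alpha <= 1,
    after conditioning on X <= T."""
    num = alpha * xm**alpha * (T**(1 - alpha) - xm**(1 - alpha)) / (1 - alpha)
    den = 1 - (xm / T)**alpha
    return num / den

for T in (1e3, 1e6, 1e9):
    print(T, pareto_truncated_mean(T))   # the mean keeps growing, roughly like T**0.2
```

Every thousand-fold extension of the truncation point roughly quadruples the mean, so any conclusion that rests on the expectation value depends entirely on where you truncate.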


 * But this is exactly the Expectation Finitism I describe above, a view you said wasn't your view. This inconsistent philosophical view is in no way an "elementary mathematical fact." iNic (talk) 02:17, 28 November 2011 (UTC)


 * I call this common sense. It is not a philosophical view. My philosophical view is something completely obvious: models are never true. At best they are useful approximations. Whether or not a model is a useful approximation depends not only on the model itself and on reality itself, but on the purpose you use it for. "Fit for purpose". Nowadays the trendy name for finding out if a model is "fit for purpose" (itself a trendy phrase) is "sensitivity analysis". That's what I'm talking about. It does not reflect any philosophical view of the meaning of infinity at all. It reflects a practical view of using mathematical models in practice. Understanding long-tailed distributions: how about the following paper: Vol. 26, issue 3, 1972 of Statistica Neerlandica, "Understanding some long-tailed symmetrical distributions" by W. Rogers and J. W. Tukey, published there on pp. 211-226. My general recommendation is to draw the distribution, marking the locations of the mean and of interesting quantiles. You want to find out where the mean lies in relation to where the bulk of the probability is. I recommend use of R (www.R-project.org). Rogers and Tukey give a lot more wise advice on the same lines. Richard Gill (talk) 11:00, 28 November 2011 (UTC)
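Following that advice for the truncated Broome distribution discussed earlier on this page (my own numbers, just locating the mean relative to the bulk of the probability):

```python
p, q = 1/3, 2/3
N = 16
w = [q**k * p for k in range(N)]
total = sum(w)
pmf = [x / total for x in w]    # smaller amount is 2**k with probability pmf[k]

mean = sum(2**k * pk for k, pk in enumerate(pmf))

# median: smallest amount 2**k whose cumulative probability reaches one half
cum, k = 0.0, 0
while cum + pmf[k] < 0.5:
    cum += pmf[k]
    k += 1
median = 2 ** k

# fraction of the probability mass lying strictly below the mean
below = sum(pk for k, pk in enumerate(pmf) if 2**k < mean)
print(mean, median, below)
```

The mean (about 99 euros) sits far above the median (2 euros), with roughly 94% of the probability mass below it — exactly the kind of picture Rogers and Tukey recommend drawing before trusting an expectation.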


 * To me it's not common sense at all to treat one distribution with infinite expectation in a completely different manner from another distribution with finite expectation, despite the fact that the difference between the two distributions can be made as small as one desires. In practice there is no difference at all between two distributions like that, if turned into games. This inconsistency can't be explained away by the generic assertion "all mathematics is just models." If all math is just models, why then this inconsistent difference in approach towards models that are for all practical purposes the same? iNic (talk) 03:30, 6 December 2011 (UTC)


 * Ever since you started to comment on TEP you have had the opinion that infinity should not be a problem in itself. (I agree with you here 100%.) All mathematics is full of it, both in theory and for making realistic models of the world. This you said over and over. And now you have suddenly become a very harsh finitist, it seems, where even modestly large finite numbers should be banned from the realm of the thinkable. But the problem remains: decision theory claims that E(X) should be our guide when making decisions. The St Petersburg paradox, which you now suddenly refer to as relevant (you never talked about this paradox in your explanations of TEP before), brought to the table the idea that it's the expected utility, E(U), and not E(X), that is the true guide for decisions. The reason is that this idea solves some basic versions of the St Petersburg paradox. But how will that idea help you to solve TEP?


 * Switching from money to bounded utility puts us in the situation of elementary TEP which has already been resolved. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * No, this is not correct. You need bounded support to claim that TEP has become something "which has already been resolved". But bounded utility does not imply bounded support. On the contrary, utility theory started off as a mere mathematical way to solve SPP, without having to rely on arbitrary ad hoc truncations of the support. Utility doesn't solve a single version of TEP. Utility theory doesn't even solve all versions of SPP for which it was made. Quite a useless theory if you ask me. iNic (talk) 02:18, 10 November 2011 (UTC)


 * Bounded support makes the resolution of TEP easy. The realisation that utility is not linear in money, and that in fact utility is bounded, makes the resolution of TEP again easy. Richard Gill (talk) 09:47, 15 November 2011 (UTC)


 * Bounded utility is a desperate attempt to solve SPP, but it can't solve TEP. In TEP the situation is totally different. You are not asked for the fair price for a certain game. Instead, money (or whatever) is given to you and all you have to do is to decide which gift to take. What's given away doesn't matter. Be it money, utilities, neckties, bananas, ... iNic (talk) 04:33, 20 November 2011 (UTC)


 * Not true. In the real world people do make daily choices between getting bananas, neckties or money. According to economic theory they do this according to their personal utilities for the various goods and according to their subjective probabilities of the consequences of their actions. And on the whole economists agree that for practical application of economic theory utility can be taken to be bounded. Bounded utility is not a desperate attempt to solve SPP. It's an eminently sensible way to resolve it. Bounded utility is an eminently sensible way to solve TEP when considering it within the framework of decision theory. For instance, suppose we play the untruncated Broome game. The two envelopes contain cheques for the two amounts of money. These amounts could be arbitrarily large. At some point, when you take such a cheque to the bank they'll laugh at you and not give you the money. (And in the meantime I have disappeared.) So the value of the cheque to you is not exactly proportional to the number of dollars written on the cheque; it first increases more or less linearly and then flattens out. In fact, it might even start decreasing again and eventually fall close to zero, since all cheques written for astronomically large amounts of money are worth about nothing. Richard Gill (talk) 20:47, 20 November 2011 (UTC)


 * OK, if you think that utility theory is flawless you can't have heard of the Allais paradox or the Ellsberg paradox. TEP is just another paradox that utility theory can't handle. Utility functions are monotonically increasing functions with infinite support. This follows from the properties the inventors wanted the utility function to have. So your example with a cheque that becomes worthless is again an argument for how bounded support can solve TEP, not an argument for how utility theory could solve TEP. Utility theory (or "moral expectation," as it was called for the first 200 years) was invented to solve SPP in a way that didn't introduce any unsatisfactory, arbitrary, ad hoc limits. If I ask you how big the largest possible cheque is, you wouldn't be able to tell me. Without that information this rule and all similar rules are worthless in practice. iNic (talk) 03:30, 6 December 2011 (UTC)


 * That idea is useless here. You seem to claim that situations leading to TEP should not even be possible to imagine, that things you can't imagine don't exist, and that what doesn't exist we can safely ignore. But it happens to be the case that I can imagine the real-world situation where someone has two or more envelopes on a table with unknown contents. And that's all you need to be able to imagine to get the ball rolling. And it happens for real all the time, over and over again.


 * I never said situations leading to TEP can't exist. I think we have already resolved the problem of identifying the wrong reasoning in the paradoxical deduction that you must switch and switch and switch ... Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * OK so it's only in the situations where we don't have a solution that we say that these situations can't exist? iNic (talk) 21:32, 9 November 2011 (UTC)


 * No. Not *only* in those situations. Don't be silly. It *is* true that if A implies B, and if B is false, then A must also be false. So if some imagined situation leads to something impossible then that situation indeed cannot exist and therefore there is no point in worrying about the apparent illogical conclusion.


 * Exactly. Observing a moon around Jupiter (A) implies that the Bible is wrong (B). But as it is impossible that the Bible is wrong we know a priori that any true observation of a Jupiter moon cannot exist. iNic (talk) 01:45, 10 December 2011 (UTC)


 * But I am not saying that this principle need be applied to TEP. In my opinion, for every context which anyone has imagined for TEP, and for every interpretation of the writer's intention within such a context which anyone has imagine for TEP, there exists a well known and simple resolution. From the mathematical point of view there are only three resolutions and they correspond to three different ways of looking at my "unified solution". On wikipedia our job is merely to report what is out there in the reliable sources so I think we are now well set up to do exactly that. Richard Gill (talk) 20:57, 20 November 2011 (UTC)


 * You say that "There is no generic answer..." which is nothing else than giving up the idea of finding a general guide to what to do in different situations. Do you mean this is true in general? For sure, in some situations E(X) is the correct guide for how much to bet, isn't it? Or do you really claim that E(X) is never a reliable guide? If E(X) is never reliable and there is no other concept that can replace it (other than the very strange mathematical principle "consult a professional") then what you are saying is in effect that decision theory is dead. Is this your opinion?


 * The general guide of what to do in different situations is to use your brains. So far, mine did not let me down. Decision theory is a mathematical model for making choices under uncertainty. My utility of a cheque written by you for N dollars certainly isn't linear in N. I find decision theory useful as a mathematical framework. Sometimes it is easy to apply, sometimes hard. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * OK, as I read you, you say that utility theory is good in many situations but has its limitations, sometimes of a theoretical and sometimes of a practical nature? OK then, and welcome to the game! The game is to construct explicit rules for when decision theory is safe to use and when it can't be applied. That your brain works flawlessly in this respect is great but not of lasting value to humanity, unless you can explain how your brain is wired. iNic (talk) 21:32, 9 November 2011 (UTC)


 * That may be your game, but it is not my game, and certainly no part of the game of writing a wikipedia article on TEP. Maybe you can better spend your time by going away and developing your world-shattering new theory of decision making under uncertainty. And it will be fun to see what it has to say about TEP. In 100 years time we lesser mortals will write wikipedia articles about it. Richard Gill (talk) 21:01, 20 November 2011 (UTC)


 * It is already developed. I only have to find someone willing to publish it. iNic (talk) 03:30, 6 December 2011 (UTC)

You have ignored these questions ever since I stated them on August 24 this year.


 * I thought that the questions were stupid or the answers were obvious, so I ignored them. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * Hasn't it ever happened to you that you have put your keys somewhere but forgot where? You search everywhere, only to find, at last, that they were at the most obvious place all the time. Sometimes the greatest rewards are hiding in the most obvious places. In this case asking the "stupid" questions is like searching for the lost keys on the kitchen table, and finding them there. iNic (talk) 21:32, 9 November 2011 (UTC)


 * Maybe, maybe not. And questions are more important than answers. Good luck with your own search for your own lost keys. Richard Gill (talk) 21:07, 20 November 2011 (UTC)

So it's not the case that your answers didn't satisfy me. The case is that I never got any answers that could either satisfy or dissatisfy me. If you are interested in my ideas you should be interested in my questions. If you understand the relevance of these questions you will understand the relevance of my solution. In fact, the answers to these questions lead directly to my solution. But if you don't accept the relevance or importance of the questions above you will not understand or find my solution interesting either. Why the reluctance to reveal the title and author of a single good standard text about fat tailed distributions? Is it controversial which book to use? iNic (talk) 17:11, 9 November 2011 (UTC)


 * I don't know if I'll be interested in your ideas. I haven't seen them yet. I'm interested to find out what they are. That's a different matter. When I googled fat-tailed distributions in finance I immediately found a textbook by a guy called Rachev. It should be good. I'm not inclined to go enormously out of my way for you since you've been consistently abusive since we started communicating. But I'm always ready to let bygones be bygones. Richard Gill (talk) 18:30, 9 November 2011 (UTC)


 * Thanks for the reference, I will check it out! I think that we are making a lot of progress now! Sorry, I don't mean to be rude. But sometimes the emperor needs to be stripped naked before he's even interested in new clothes. And for the emperor to slowly realize that he has been naked all the time must feel about the same. iNic (talk) 21:32, 9 November 2011 (UTC)


 * I'm not an emperor so I wouldn't know. Richard Gill (talk) 21:09, 20 November 2011 (UTC)

OK no problem. Please take your time to ponder upon these questions. iNic (talk) 11:46, 14 November 2011 (UTC)


 * To my mind, there is no longer any mystery in TEP. It's a wrong argument within decision theory. We understand why it is wrong, we know how to correct it. We know the correct answer. A completely different question is whether or not decision theory is useful in real life, whether it is easy to apply or prohibitively difficult. I have no vested interest here, no strong opinion. So having pondered your questions, iNic, I am at a loss where you are trying to go. I don't want to continue an endless discussion on decision theory. It doesn't interest me much. I don't see anything more to say about TEP, except perhaps to play a real truncated Broome game. But I've already written out an analysis of different truncations. And shown how the value of the game depends extremely sensitively on the truncation level. That's a fact of life. Richard Gill (talk) 01:32, 16 November 2011 (UTC)

If you had solved TEP you would be able to answer my questions easily, but you are still avoiding most of them. (And I don't care if you are avoiding them as a true Zen master or not; in my view you are simply avoiding them.) Decision theory is all about real life. If you have a theory that can't be usefully applied to real life you don't have a theory at all. It's like physics. A physical theory that can't be applied to real situations is not a physical theory. Such theories are instead classified as metaphysics, theology or some other non-science, not physics. As decision theory wants to be a science it has to be applicable to real situations. On the contrary: it's a fact of life that the value of the game does not depend sensitively on the truncation level. As soon as you play the game with me you will realize that too. iNic (talk) 04:33, 20 November 2011 (UTC)


 * On your last point we disagree strongly. I know for sure that the truncation level will have a big effect on the feasibility of the game, and given its feasibility, on the amount of money I'd be prepared to pay (or to reserve in advance) in order to play it with you. I showed this explicitly by some calculations which you have ignored till now. Were you not able to reproduce them, or did you not understand them? Now you seem also to have a second theme. Let's agree with the statement that TEP is a paradox within decision theory. I claim that within decision theory it is solved. For any of the ways in which the writer has been interpreted in the past, and only a small number of interpretations keep cropping up (whether recognised by the authors to be old, or not), we understand where the reasoning breaks down and we know how to correct it. But you seem to be saying (a) that TEP is still not solved, and (b) that the reason it is not solved is some deep failure of decision theory. In that case you are pretty much a voice in the wilderness, since I don't know any other writers who say this. And you say yourself that you are not an expert in subjective Bayesian decision theory, so maybe you understand less about the theory than you think. My point of view is that decision theory as a mathematical theory is interesting, but it is neither prescriptive nor descriptive of real people's real behaviour in real life, except perhaps at such an abstract level that it is always true as a description, because we can always choose parameters so that it fits any actual behaviour. But again, that's a personal point of view, and I am not a specialist in economics or decision theory, so I do try to keep an open mind on these issues. Have you read "The Predictioneer's Game" by Bueno de Mesquita? Rather good, I think. It shows how decision-theoretic thinking can help one to analyse complex human conflicts and sometimes come up with wise advice on how to act. 
Richard Gill (talk) 10:27, 20 November 2011 (UTC)
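Richard's actual truncation calculations are not reproduced here, but the sensitivity he describes is easy to sketch. Assuming the standard Broome setup (envelope pairs (2^n, 2^(n+1)) with probability proportional to (2/3)^n, truncated at level N; this is my own choice of parameters, not necessarily his), the expected amount in a randomly chosen envelope grows roughly like (4/3)^N:

```python
# Sketch (assumed Broome-style distribution, not Richard's own figures):
# expected amount in a randomly chosen envelope when the game is
# truncated at level N.

def truncated_broome_value(N):
    # Pair n holds the amounts 2^n and 2^(n+1), with probability
    # proportional to (2/3)^n, for n = 0, 1, ..., N.
    weights = [(2 / 3) ** n for n in range(N + 1)]
    total = sum(weights)
    # A random envelope of pair n is worth (2^n + 2^(n+1)) / 2 on average.
    return sum(w / total * (2 ** n + 2 ** (n + 1)) / 2
               for n, w in enumerate(weights))

for N in (5, 10, 20, 40):
    print(N, truncated_broome_value(N))
```

Each extra truncation level multiplies the value by about 4/3, so the stake one should reserve in advance depends very sensitively on where the game is cut off.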

You are contradicting yourself (once again). Here you say that you "know for sure that the truncation level will have a big effect on [...] the amount of money I'd be prepared to pay." Further up, however, you say about exactly the same game that "Just playing once (whether or not the game is truncated) will prove nothing, with very high probability." It can't be the case that a truncation for sure has a big effect on the value of the game and at the same time can't be detected at all with very high probability. Yes, it's true that I'm a voice in the wilderness. Whether I understand more or less than others regarding this, only the future can tell. But as long as no one can answer the very basic questions I have stated here I will repeat them. I have read a sufficient number of books about Bayesianism and utility theory to know the main themes. But I'm not an expert in applying the theories. I find that utterly difficult. I do agree with you on one point. TEP is solved. But until I, or someone else who has found the true solution, make that solution public, all published solutions so far are unfortunately wrong. Thanks for the tip about the book. I will check it out. I'm reading the paper by Rogers and Tukey now. iNic (talk) 03:30, 6 December 2011 (UTC)

First or second swap of envelope?  (odd number or even number?)
Pardon for intruding again. I see now that my original understanding of the TEP was completely incorrect. I thought the paradox was a consequence of an "incorrect result of 5/4A". But now I see that this result of 5/4A is fully correct, given that you have never changed before, i.e. that you are considering a FIRST move. Only then can you expect to gain 1A or to lose 1/2A. That means that you have no closer knowledge about the contents of envelope A nor of envelope B. Only under this condition that you have no closer knowledge whatsoever will B contain 5/4A. But if you should have already swapped for the FIRST time from envelope A to envelope B (still without opening any of them, you can already bet to be holding 5/4A now), then you have gained additional knowledge about the contents of A and B, for after having already swapped from A to B you now know exactly: you can never return to envelope A hoping to increase the amount again, because now you will lose money. Any 2nd or 4th and so on "even-number swap" will start from OTHER preconditions: you know now that you either hold 1/2A, and then you can only expect to double "1/2A" to 1A again, or that you are holding 2A now, and then you will be losing 1A and in that case also return to 1A.

Odd-number-swap:

1/2(2A) + 1/2(A/2) = 2A/2 + A/4 = 5/4A can be expected.

Even-number-swap:

1/2(2A/2)+ 1/2(2A/2) = A/2 + A/2 = 1A can be expected.

So for any "even-number-swap" you will return from 5/4A to 1A, and you can only expect to get 5/4A in any "odd-number-swap". Is that right? Can you express this in a "better theorem" that regards whether you "know nothing" about the contents of the two envelopes, as is the case for the first or third or fifth (odd-number) swap, and that pays regard to the fact that for any second, fourth or sixth (even-number) swap you have entirely other presuppositions? No more "unknown" starting point, but knowing exactly then that you can only "double 1/2A" and never more, returning to 1A, while you are risking to "lose 1/1A" but never less, also returning to 1A. Can you say that in the theorem? Gerhardvalentin (talk) 01:55, 11 November 2011 (UTC)


 * If you were so colossally ignorant of the amount of money in the two envelopes that you would consider all of the doubly infinite number of amounts ... 1/4, 1/2, 1, 2, 4, 8, ... equally likely to be in the envelope containing the smaller of the two amounts, then whatever amount a (some positive or negative power of two) you would imagine being in Envelope A, you would correctly judge that Envelope B would be equally likely to contain 2a or a/2. And vice versa! You would imagine gaining an average of a/4 by switching, whatever you might imagine a to be. The same remains true after you have switched, now regarding b. (I'm assuming you don't ever look in the envelope.) It appears that your expected final gain increases on every swap. However this is not true, since it started off being infinite and remains so at every swap. At every swap your probabilities of what is in your current envelope, whether a or b, remain: equally likely ... 1/4, 1/2, 1, 2, 4, 8, .... The point is that whenever you stop switching, and actually look in whatever envelope you now have, you'll be disappointed, since the amount will be finite, but its expected value is infinite. You gain no information by switching closed envelopes. Your initial beliefs about the contents of A and B are symmetric under exchange, and hence unaltered by exchange. Richard Gill (talk) 09:05, 11 November 2011 (UTC)
 * Thank you for your comments, Richard. But sorry: you say "you gain no information by switching closed envelopes". But in advance I know that "A" will be some yet unknown but unchangeable amount, whereas "B" in any case can never be any "unchangeable amount". It forever will be either Y (2A) in 1/2 of cases, or at the same time it will be only Y/4 (A/2) in the rest of cases. That makes a great difference, and that's what I learn by switching closed envelopes without opening any of them. I learn that by switching from A to B I will gain A/4, but never vice versa. The same remains NOT true after I have switched, now regarding "B". Regards, Gerhardvalentin (talk) 16:07, 11 November 2011 (UTC)


 * I disagree. You must distinguish between the actual, unknown, amounts in the two envelopes, and the state of your knowledge about them. The amounts in envelopes A and B remain forever a and b. As you exchange one envelope for the other the amount in your hand switches from a to b then back to a and then b again, and so on. One is the smaller amount x and the other is the larger amount y=2x. Our knowledge about what is in both envelopes remains the same. It is built up from two components: our initial beliefs about what x might be, and the fact that envelope A contains x or y with equal probability 1/2, independently of what the amounts themselves might be. Richard Gill (talk) 17:12, 11 November 2011 (UTC)

First or second swap of envelope
"A" is master, "B" is slave.

Thank you, Richard. And I say: "No never ever", and "yes". Yes, it's about our "knowledge", and it's on the "relevant amounts". And it's quite simple. The question is about changing from envelope "A" to envelope "B", expecting on average to gain 1/4A and to have 5/4A then. And this question is already solved. But there's a second question that is about "changing back from envelope B to envelope A"  -  Can that 5/4-theorem be applied again? And the clear answer is "never ever". Fullstop.

Because there is a great difference, as I said above, yes it is: envelope "A" is free in any respect, to be what it might be, and to be doubled or halved, WHATEVER it ("A") might be. – Whereas "B" forever is quite restricted and bound. We know that "A" can be what it may be, and you can double "A" or halve it with equal probability by changing to "B". Whereas "B", at the first glance, also looks "as if" it could be what it may be, and can be doubled or halved with equal probability. But it's about "knowledge" and "relevant amounts", so you should be careful:

"B" cannot be "simply doubled" or " simply halved" equally likely like "A", with equal probability. No, that's the privilege only of "A".

On average, "B" will be 5A/4, let's call that B(average), like "A" on average will be 4B/5. "B" could be at max 2A, let's call that B(major) = 2A = [8B(average)/5]. And "B" could be at min A/2, let's call that B(minor) = A/2 = [2B(average)/5].


You can double "B" only in the case that it is the small amount of B(minor) resp. A/2, OTHERWISE NOT. Only if B is A/2 can you double that small amount, yes, gaining A/2 and getting A then. In other words you can double only the small amount of B(minor), B(minor) thereby becoming [4B(average)/5], but you never can double the amount of B(major). And you risk halving "B" for sure if it is the considerable amount of 2A, or B(major) of [8B(average)/5], and then lose the amount of [4B(average)/5], getting [4B(average)/5] then. A really considerable loss indeed.

''"You pick up one end of the stick, you pick up the other" – Where they slipped in, there too must they go out. The first is free, the second's slaves are we.''

In any game, only "A" can be said "to be free to be whatever it will be", whereas "B" forever will be "the slave of A". This makes the difference, as I said above. Once more:

B is not "anything". B is, and forever will be, 5/4A: either 2A, say B(major) or call it [8B(average)/5], or B will be A/2, say B(minor) or [2B(average)/5], with equal probability. Fullstop.

If A is 4, e.g., then you can halve B only in case it is a considerable 8, otherwise not, even if you might say "equally likely". And you never can double "B" if it is a considerable 8. Never. You can double "B" only in case it is 2, in this example. Once more: "B" can be doubled if and only if "B" is 2 = 2B/5, otherwise not, and you never can halve it in the case that it should be 2.

The "5/4A"-theorem doesn't need to pay regard to any statistical points of view, as it's about changing the first time in any game, from "A" to "B". Then the theorem is fully correct, but only correct for any "new game" when you have a new envelope "A" in your hands. Then switching to "B" will give you on average 5/4A. And that's the END of the validity of the theorem. Okay?

It is quite naive to believe that you can likewise say "A=5/4B". That's nonsense, as "B" is the slave of "A". "B" cannot be "doubled or halved, whatever it is". That does not apply. That's the privilege of "A". Believe it. The phrase "whatever it may be" applies only to "A", but NEVER applies to "B".

As "B" forever is the "slave of A", forever being 5/4A, either appearing as "2A" or as "A/2" with equal probability, you are bound to say: B can only "double" in the case that B is tiny "A/2", otherwise not, and B can only "halve" if it is quite "another value", say only in case that B is considerable "2A", otherwise not.

Once more: otherwise not. It's quite easy, isn't it? – "A" forever is free to double or to halve, whatever A may be, but "B" will always be the slave of A, no matter whether you like to change to "B" or not. "B never is free", it's always "A's slave", it is 5/4A.

If B is 2A, then you never can double it, even if you should like to; that's solely the privilege of "A" to be able to double in any case – '''then you only can halve it. And if B is A/2, then you never can halve B, even if you should like to; then you solely can double B. B is not free, B is a quite restricted slave of A.'''

In this respect, "B" is the "crux" of the theorem. You just have to look what the counterpart of B will be: it always will be "the master of B called A".

"B" – on average – is 5A/4, as likewise "A" – on average – is  4B/5.

Meaning that in 1/2 of cases "B" will be B(major) = 2A = 8B/5 and then in swapping a second time it will always be halved by a considerable 4B/5 to 4B/5, and in the other 1/2 of cases "B" will be B(minor) = A/2 = 2B/5 and only then by swapping again will be doubled by a tiny 2B/5, also to 4B/5. So with equal probability you'll get 4B/5 or 4B/5. That's it.

So the only correct theorem for switching from "B" back to "A" (even-number-swap) is, like I already said above:

E(A) = (1/2 × (8B/5)/2) + (1/2 × 2 × (2B/5))    or    (1/2 × 4B/5) + (1/2 × 4B/5)  =  4B/5.

So "A" forever will be 4B/5 but never "5B/4", as some careless "perpetuum theorem" mathematicians might mistakenly be saying. Kind regards,  Gerhardvalentin (talk) 17:58, 13 November 2011 (UTC)


 * I disagree. Neither of the two envelopes has any privileged status. As long as we do not look in either envelope, our simultaneous beliefs about what might be in either envelope are unchanged on switching the two. Any number of times. Remember, first of all two envelopes are prepared, containing two amounts of money, one twice the other. Then we pick one at random and call it A; we call the other B. Now switch them many times. Call the one now in your hands C, call the other one D. Our beliefs about the contents of C and D are identical to our beliefs about the contents of A and B. It doesn't make a difference whether we switched an odd number of times or an even number of times or a random number of times, for instance, whether we kept tossing a coin and switching till we first saw "heads". What is going on here mathematically is that B/A equals 2 with probability half, and it equals 1/2 with probability half. The mean value of B/A is therefore 5/4. The probability distribution of A/B is the same: it equals 2 with probability half, and it equals 1/2 with probability half. So the mean value of A/B is also 5/4. This illustrates Jensen's inequality: if g is a strictly convex function and X a non-degenerate random variable, then E( g(X) ) > g( E(X) ). Take g to be the function g(x)=1/x and X to be the random variable A/B. I am not a perpetuum mobile mathematician. We already understand why the TEP argument for switching breaks down, under all the ways in which it has so far been interpreted in the literature (context and intention of the writer). We understand that from symmetry there is no reason to switch. You seem to be saying that you improve your position on switching once even though you don't look in Envelope A. I say that this is nonsense. If you don't look in Envelope A and switch once, you are in exactly the same position as if you had not switched at all. Richard Gill (talk) 09:58, 14 November 2011 (UTC)


 * Thank you so much. And I think it would not be wise to establish a gambling casino based on your words. Please tell me what is "wrong" here. If you let envelope "A" contain a cheque of "any" amount, say from $1 to $1'000'000'000 at random, and in 1/2 of cases at random let the cheque in envelope "B" amount to 1/2 of A, and in the other 1/2 of cases at random let the cheque in envelope "B" amount to twice A, then in any ten thousand games of one single Excel list (100 x 100 games showing the contents of A as well as the contents of B) you just hit the delete button in some empty field near the "totals of A and of B", and all 10'000 games are instantaneously "renewed". So pressing the "del" button, say, 100 times in sequence, you can control 1 million games at once. And the result will always be exactly the same. The sum of all "A" will always be 80 % (about 78.36 % to 80.43 %) of the sum of all "B" envelopes, meaning all "B" envelopes always contain 125 % (about 121.63 % to 127.61 %) of all "A" envelopes. That's the average, and that's a fact. On average you always stand to win exactly twice the amount that you can lose. The total of all "B"s will always be 125 % of the total of all "A"s. So you don't even need to "open" the envelopes, you know the result in advance. So changing back from B to A will be on average 5:4. What about the casino offering the chance to double your every bet, with the risk to lose only 1/2 of the amount of your bet? Where is my error? I am really clueless. Thank you once more. Gerhardvalentin (talk) 11:48, 14 November 2011 (UTC)


 * You are now talking about a different problem. In your problem, call it Problem 1, Envelope A is first filled with some amount of money. Next a coin is tossed, independently of the amount of money in Envelope A, to decide whether Envelope B should contain half or double that amount. No symmetry. I was talking about the problem, call it Problem 2, where first of all two amounts of money are chosen, one is twice the other. These two amounts go into two envelopes. Next, and independently of these two amounts of money, a coin is tossed to decide which of the envelopes will be called Envelope A, and which is called Envelope B. Symmetry. The whole point of TEP is that in Problem 2, it is impossible, given what is in envelope A, for envelope B to be equally likely to contain half or double that amount, whatever it may be. (Except in the extreme case of an improper prior distribution when the smaller of the two amounts is equally likely to be any of the infinitely many powers of 2 (positive and negative together), which many people exclude as being ludicrous, and which in any case is outside of ordinary probability calculus). The difference between Problem 1 and 2 is patiently explained in many of the articles on TEP. The real TEP is Problem 2. The solution to the paradox is to realise that this is not the same as Problem 1. Richard Gill (talk) 13:03, 14 November 2011 (UTC)
 * Wow, you are right, Richard. Don't know what happened, but you are right again. Total of all A's and all B's suddenly is 1:1. Still clueless, but thank you! Gerhardvalentin (talk) 14:14, 14 November 2011 (UTC)
 * As a corollary to what Richard has said, I believe that it is important that solutions to the TEP (problem 2) should be demonstrated to fail for problem 1, where you should swap once. I think this test is currently under-applied. Martin Hogbin (talk) 10:28, 20 November 2011 (UTC)
 * This is indeed a useful diagnostic tool. Richard Gill (talk) 20:11, 20 November 2011 (UTC)
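Martin's diagnostic can be sketched directly. In the hypothetical simulation below (my own illustration, echoing Gerhard's Excel experiment; the uniform range is an arbitrary choice), Problem 1 fills A first and lets a coin set B, while Problem 2 fixes the pair first and randomizes only the labels:

```python
import random

# Sketch of the two setups Richard distinguishes. In Problem 1 the coin
# acts on an already-fixed A, so B really is worth 5/4 of A on average
# and you should swap once. In Problem 2 the pair of amounts is fixed
# first and only the labels are random, so there is no gain in switching.

def problem1(rng):
    a = rng.uniform(1, 100)                     # fill envelope A first
    b = 2 * a if rng.random() < 0.5 else a / 2  # coin then sets B
    return a, b

def problem2(rng):
    x = rng.uniform(1, 100)                     # smaller of the two amounts
    return (x, 2 * x) if rng.random() < 0.5 else (2 * x, x)

def b_over_a(setup, trials=200_000, seed=2):
    rng = random.Random(seed)
    sa = sb = 0.0
    for _ in range(trials):
        a, b = setup(rng)
        sa += a
        sb += b
    return sb / sa

r1 = b_over_a(problem1)
r2 = b_over_a(problem2)
print(r1)   # about 1.25: switching gains in Problem 1
print(r2)   # about 1.00: symmetry in Problem 2
```

A proposed TEP solution that would also forbid switching in Problem 1 fails this test, which is the under-applied check Martin describes.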

Trying to recruit an expert in statistics
Hi Richard. I'm involved in a content dispute about potential original research on Wikipedia. There is a sub-dispute about whether statistical methods have been applied correctly, and the debate is going in circles. My feeling is that if we can settle the sub-dispute we could break this loop. It could really help us move forward.

The dispute is probably trivial: it is about whether a median is a proper way to summarize multiple statistics collected from different sources, and whether the sources have been collected in such a way that it is proper to calculate and communicate the median.

The article/table in question is this one: OS market share. The issue is whether the median, given the different populations the sources have been drawn from and the fact that they have been selected by wp editors, is a proper way to "summarize" the data in the table.

The debate is currently taking place here: Mediation Cabal/Cases/13 November 2011/Usage share of operating systems. If you do decide to help us out I would appreciate it if you introduce yourself as an expert on the subject (obviously) and also inform the debaters there that you have entered at the request of user:Useerup. I believe that a recognized expert on the subject really can help us break the impasse. --Useerup (talk) 18:01, 15 November 2011 (UTC)
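For readers following the sub-dispute: the practical issue is that a median treats each source as one equally weighted vote, regardless of the population behind it. The numbers below are made up purely for illustration (they are not from the actual table):

```python
from statistics import median

# Hypothetical example: five sources report a usage share for the same
# OS, but are based on very different sample sizes. The median counts
# each source once; pooling weights each source by its sample size.

shares  = [0.05, 0.07, 0.30, 0.32, 0.35]           # reported share per source
samples = [900_000, 800_000, 1_000, 2_000, 1_500]  # page views behind each

med = median(shares)
pooled = sum(s * n for s, n in zip(samples and shares, samples)) if False else \
         sum(s * n for s, n in zip(shares, samples)) / sum(samples)

print(med)     # 0.30: dominated by the three small sources
print(pooled)  # about 0.06: dominated by the two large sources
```

Whether the median or a weighted figure (or neither) is appropriate is exactly the question of what the populations behind the sources are, which is Martin's point below about getting the question right first.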
 * Excuse my intrusion. I am not an expert in statistics but an interested amateur; however, I would like to state a principle which is probably relevant to your case: in disputes about statistics the problem is not usually with the answer but with the question. Once you can decide on exactly what it is you want to know, the rest should be a formality. Martin Hogbin (talk) 10:14, 20 November 2011 (UTC)
 * Exactly. Though whether the rest is a formality or not depends very much on the situation and the question. Richard Gill (talk) 20:10, 20 November 2011 (UTC)