User talk:Martin Hogbin/Two envelopes

Comments on solution
I do not believe that there are any simpler solutions than the one above. It is not good enough to say that A refers to two different things in the expectation formula. A refers to one thing, the sum in the original envelope.

The A's are fine; it is the 1/2's that are the problem. They are conditional probabilities, but we are using them as total probabilities.

Comments by Richard

 * Thanks for your comments Richard. I ask some questions below. Martin Hogbin (talk) 16:14, 22 October 2011 (UTC)

You are putting probability only into the random choice of which envelope becomes envelope A: the one with £2 or the one with £4. Falk and also many of the philosophers also take this point of view. It's certainly legitimate. I would recommend we use the symbol A to denote the random variable equal to 2 or 4 each with probability half. Possible values of this random variable, which I would denote a, are therefore 2 and 4. Switching to italic every time you have to type maths may be tiring, but it does help focus the mind.

With this background context there still seem to be two possible routes which the author of Step 1 to Step N might have in mind. Does he want to compute the expectation of B, or the conditional expectation of B given A=a? (with a=2, and a=4).

Let me denote these two values by as and al (s for smaller, and l for larger).


 * Why do we need to introduce this notation in my simple case. We surely know as=2 and al=4? Martin Hogbin (talk) 16:14, 22 October 2011 (UTC)


 * Sure. But there are various other uses of "2" and "4" in the derivations, so I find it useful to keep them distinguished! So *I* prefer to write with two arbitrary specific amounts. Sure, many readers might prefer to see things written out with particular specific amounts, not general specific amounts. As long as we remember that the guy making the argument is not supposed to know what those amounts are. Only the guy who put the money in the envelopes knows them.

Route 1 (unconditional). It's correct that E(B) = (2as) P(B=al) + (al/2) P(B=as) = as + al/4. On this reading, the writer is confusing as and al both with A. But I call this equivocation squared. The same symbol on his or her right hand side denotes not only two different things at the same time; the equality of left and right hand sides is identifying two different categories of things: the expectation value of a random variable on the left, and a random variable on the right! It's not just equivocation, it's stupidity.


 * Is that not the same as my R1 or maybe R2?


 * You should be able to tell me!


 * It looks to me that we agree completely here. If the writer is confusing A with 2 and 4 then this is an obvious stupidity which immediately resolves the paradox.  Unfortunately the writer is probably not being that stupid so this resolution is not that important.


 * The problem is that that seems to me to be exactly what is being proposed as a simple solution in the TEP article.

Route 2 (conditional). It's correct that E(B|A=as)=E(2A|A=as)=2 E(A|A=as). And that of course equals 2as. In the same way, E(B|A=al)=al/2. We may also (try to) compute the conditional expectation of B given A by splitting over the two events A < B and B < A. This gives E(B|A=as)=E(B|A < B, A=as)P(A < B | A=as) + E(B|B < A, A=as)P(B < A | A=as), and similarly with as replaced by al. From this we find E(B|A=as)= (2as) P(A < B | A=as) + (as/2) P(B < A | A=as) and E(B|A=al)= (2al) P(A < B | A=al) + (al/2) P(B < A | A=al). Probabilists are used to combining the two equations into one by use of standard shorthand E(B|A)=(2A) P(A < B|A)+ (A/2) P(B < A|A). The conditional expectation left and the two conditional probabilities right are all functions of A, or if you prefer, of the values which A can take. On this reading, the writer apparently thinks that P(A < B| A=as) = P(A < B| A=al) =1/2 but actually the first conditional probability is one, the second is zero. Most people trained in (stochastic) mathematics think that the writer was taking the second route. It's a natural route to take, though it's more difficult. Hence it is easier to make a mistake. And it ties in with the story: we are supposed to discover that the expectation of B given A is larger than A, whatever value A might take, and hence that it turns out that we don't have to look in the envelope to decide whether or not it's beneficial to switch. That was the whole point, wasn't it? Also, this interpretation leads naturally to the Anna Karenina possibility that we are simultaneously using probability to express our uncertainty about the amounts actually in the two envelopes. We don't know they are £2 and £4, right? You were standing on the wrong platform.


 * We do in my version of the problem. Martin Hogbin (talk) 16:46, 22 October 2011 (UTC)


 * You and I do know that, we're the guys who put the money in the envelopes! But our imaginary player - the guy who works through this argument - doesn't know. That's one of the assumptions of the problem, I think. (Sure, it might help him test his argument by pretending that the only possibilities were 2 and 4.) Once he has done this, he could try to think about the Bayesian version, by imagining that he knows for sure that the envelopes were filled as follows (one of the philosophers does it this way): the guy with the money first of all decides whether to start off with £2 or £4 (50/50 which it will be). Next he decides whether to halve this or double this (50/50, again). He ends up with £1 and £2, £2 and £4, or £4 and £8.

After that, the next Anna Karenina branching is whether we are using a proper prior or an improper prior. The originators of this problem were certainly using subjective probability and some of them were explicitly using improper priors. Two different platforms.

As iNic says: TEP is a paradox in subjective Bayesian decision theory. I agree that that is how it started and how much of the literature treats it. The philosophers and Falk are both in a minority and historically incorrect. Many of the philosophers have a lot of trouble with the mathematics. Falk is trying to defuse the paradox in an elementary educational context where one takes the absolutely least sophisticated platform. Fine, but that is not a solution for everyone. She makes no attempt to survey other solutions let alone arguing why hers should have primacy. She just wants to help out all those teachers out there who are happy with anything which is simple enough that they and their pupils both understand it. Richard Gill (talk) 14:02, 22 October 2011 (UTC)


 * My solution is not Falk's solution and I am not promoting Falk's solution.


 * Could you tell me please only for the specific question that I have asked here (that is with £2 and £4 in the envelopes) is my solution valid? If not, what are the loopholes? Martin Hogbin (talk) 16:27, 22 October 2011 (UTC)


 * Why can't you tell me if one of my solutions is identical to yours!? It's just a question of replacing al by 4 and as by 2 in everything which I wrote, and then comparing your and my analyses (I like to teach by having my students discover for themselves), sorry. And I do think that one of your solutions is Falk's solution. I have read just about all the papers and I don't know any other solutions than the ones I described for you! Yours and Falk's are the simplest in some respects and I think they're the same as far as the logic is concerned, the only difference is the language and the level of detail. Remember that on wikipedia we are not supposed to be putting forward our own novel solutions but merely reporting the solutions of authorities in the field. Preferably of authorities who overview the field, rather than those who claim to present something new. Richard Gill (talk) 17:52, 22 October 2011 (UTC)


 * By the way I already agreed with you that - standing on your platform - A in Step 1 is a random variable which can equal 2 or 4 with equal probability 1/2. And I told you that there are just two ways now to interpret what is going on in the key steps 6 and 7 (if I remember their numbering correctly). Falk and you both prefer to imagine that the writer is computing the unconditional expectation of B. But expressing it on the right hand side in terms of the random variable A rather than in terms of the particular values which it can take (2 and 4). Equivocation, yes, and actually muddling up conceptual levels as well. The two A's on the right hand side equivocate with one another, but they also equivocate with the earlier meaning given to A, and they equivocate with what is on the left hand side. The left hand side (in this interpretation) is a number but the right hand side is a random variable. The writer is screwed up, squared. I prefer to imagine that the writer is only screwed up a little but is attempting a more sophisticated argument, namely to calculate E(B|A=a) both for a=2 and for a=4, and to show that in either case it's larger than a. Hence switch anyway; and also switch anyway, even if the amounts were anything else. In fact we could without loss of generality define the unit of currency to be half the smaller amount. Call that 1 Doubloon. Then the smaller amount is 2 Doubloons and the larger amount is 4 Doubloons. In fact, without loss of generality, we can indeed take, just like you did, the two amounts to be 2 and 4! We just remember that we don't know the currency unit. Richard Gill (talk) 18:08, 22 October 2011 (UTC)


 * Bravo Richard! You just made more progress towards a real solution than ever before. (Even if you didn't realize it.) This observation is brilliant and yet very elementary. Still I haven't seen it in any discussion of TEP so far, which is quite strange. The currency can be an alien currency which means that seeing "2" or "4" or any other number of that currency doesn't give you any information at all of how 'much' it is. So where does this leave your Unified Solution argument? Do you realize that that solution doesn't survive this simple observation? iNic (talk) 14:13, 3 November 2011 (UTC)


 * My unified solution also applies to Martin's context. My solution is the mathematical fact that (unless we are working with improper priors): knowing whether we have the smaller or larger amount in our envelope, changes our beliefs as to what is in our envelope. And conversely, knowing what is in ours, changes the probability that it's the larger or the smaller. These are exactly the two facts which Martin exhibits to be true in his context (the two amounts are fixed, known in advance, 2 and 4 currency units respectively). He does it by explicit computation. I show that it is true in general. Translating his context into my language: X and Y are constant random variables, taking the values 2 and 4 respectively. A is equally likely to equal 2 or 4 and simultaneously B is 4 or 2. My little theorem says that the random variable A is not independent of the event {A < B}. In words, the probability that A is 2 or 4 would change if we were told that A < B (or vice versa). Equivalently, the event {A < B} is not independent of the random variable A. In words, the probability that we have the smaller or the larger amount in our envelope would change if we were informed how much was in there. Depending on whether the writer was aiming at a conditional or an unconditional expectation value, one can use the non-independence in one of the two directions, to show that step 6 and/or 7 is going wrong. So there is perfect concordance between Martin's simple solution, the observation that without loss of generality (in a non-Bayesian interpretation of the problem) we can take X=2 and Y=4, and my "unified solution". His explicit calculations illustrate, in his special context, my "unified solution". Richard Gill (talk) 09:56, 4 November 2011 (UTC)


 * OK I'll explain what I mean. If the envelopes contain money of a completely alien nature we still don't know how much we have in front of us when we open our selected envelope. We have a case of complete ignorance. Even if we open both envelopes we can't tell which contains the larger amount of money. Your Unified Solution argument is based on the assumption that the values can be mapped onto the real line, and then you use some topological properties of the real line to prove your thing. But TEP is not confined to the topology of the real line. This is why your solution as well as a majority of all other proposed solutions are invalid. iNic (talk) 02:55, 5 November 2011 (UTC)


 * Dear iNic, you clearly didn't look at the proof of my "unified solution" nor my explanation of how it applies to two neckties, two-sided cards (Schrodinger), two-sided cards (Littlewood). The mathematical heart of that derivation is a proof that if random variables A, B have a joint probability distribution which is symmetric under exchange of the two variables (this is called exchangeability) and if they have positive probability to be different, then the event {A < B} (more precisely: the three events {A < B}, {A = B} and {B < A}) cannot be independent of the random variable A. The only restriction on the outcome space of A and B is that the outcomes can be ordered and can be mapped one-to-one and order-preserving to the real line. Here's the proof once more. Without loss of generality we can map A and B one-to-one and order preserving to a bounded interval of the real line. Note that statistical independence of the event {A < B} and the random variable A is conserved under 1-1 monotone transformations. From now on I'll write A and B for these transformed variables. When I have obtained the desired result for the transformed variables, it also holds for the original, by the remark just made. Now E( A-B | A > B ) is well defined and positive by the assumptions we have made (bounded range and positive probability to differ). It follows that E( A | A > B) > E( B | A > B) (here again bounded range is used to exclude the possibility both terms are infinite or undefined). By symmetry the last expression equals E( A | B > A). Thus E( A | A > B) > E( A | B > A). This implies that A is not independent of the events {A > B}, {A = B}, {A < B}. QED. Richard Gill (talk) 15:15, 18 November 2011 (UTC)
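The inequality at the heart of the proof above, E( A | A > B ) > E( A | B > A ), can be checked by direct enumeration for small exchangeable distributions. The following is only a sketch: the helper name `conditional_means` and the two example distributions (Martin's 2 and 4, and the three pairs 1 and 2, 2 and 4, 4 and 8 with chances 1/4, 1/2, 1/4) are choices made here for illustration, and a finite check is of course no substitute for the general argument.

```python
from fractions import Fraction

def conditional_means(pairs):
    """pairs maps (smaller, larger) to its probability.

    Each pair is assigned to (A, B) in either order with chance 1/2, which
    makes the joint law of (A, B) exchangeable.
    Returns (E(A | A > B), E(A | B > A)).
    """
    joint = {}
    for (small, large), p in pairs.items():
        joint[(small, large)] = joint.get((small, large), 0) + p / 2
        joint[(large, small)] = joint.get((large, small), 0) + p / 2

    def mean_a(event):
        mass = sum(p for (a, b), p in joint.items() if event(a, b))
        return sum(p * a for (a, b), p in joint.items() if event(a, b)) / mass

    return mean_a(lambda a, b: a > b), mean_a(lambda a, b: b > a)

# Martin's context: the amounts are known to be 2 and 4.
martin = conditional_means({(2, 4): Fraction(1)})

# Illustrative prior: pairs (1,2), (2,4), (4,8) with chances 1/4, 1/2, 1/4.
three_pairs = conditional_means({
    (1, 2): Fraction(1, 4),
    (2, 4): Fraction(1, 2),
    (4, 8): Fraction(1, 4),
})

print(martin)        # E(A | A > B) and E(A | B > A) for the 2 and 4 case
print(three_pairs)   # the same two quantities for the three-pair prior
```

In both cases the first conditional mean is strictly larger than the second, so A cannot be independent of the event {A > B}.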


 * Dear Richard, this is exactly my point. You admit that your calculations are restricted to the outcome space where "the outcomes can be ordered and can be mapped one-to-one and order-preserving to the real line." However, it is easy to see that the TEP argument is not confined to those restrictions. Without realizing it you found maybe the simplest argument for this yourself: what if the money in the envelopes is alien money? In this case the TEP reasoning works exactly as before (and of course is still as invalid as before) but your calculations are invalid because the alien money can't be mapped one-to-one, order preserving, to the real line. Can you see that your argument fails now? Moreover, most of the calculations in the TEP literature rest on this very same assumption about the real line, and are thus invalid for the same reason as your solution. With just a little imagination it is easy to put the TEP argument into other contexts that show the same thing. Strangely enough this is OR as no one else has realized this very simple fact yet. So we can't state this in the article. iNic (talk) 17:45, 19 November 2011 (UTC)


 * Nobody has proposed a version or variant of TEP where outcomes cannot be ordered and valued. So it seems to me, iNic, that you are concerned with the fact that so far no one has yet provided a solution for a paradox which so far no one has formulated.


 * Well, you provided this version yourself above. Your brilliant observation was that the money given could be in a totally unknown currency. You even coined (!) the currency Doubloons. I don't know about you but to me that is an alien currency. So let's say you find 4 Doubloons in the first envelope you open. Now you know that the other envelope must contain either 2 or 8 Doubloons. Given that you want as many Doubloons as possible it seems reasonable to take the other envelope. And so on... iNic (talk) 16:21, 21 November 2011 (UTC)


 * If you do know a fun new paradox then you should publish it and I hope it gets popular too.


 * You have one new below without money. I will construct more versions as we go along. iNic (talk) 16:21, 21 November 2011 (UTC)


 * I have been thinking of the version of the TEP in which I meet an alien who shows me two closed envelopes containing amounts of alien money, one twice the other. The paradoxical argument can equally well be run through with "dollars" replaced everywhere by "alien currency units". And the solutions for the situation with a proper prior distribution do not depend on the name or the value of the currency unit. Or are you trying to say that if I really do meet an alien I will have totally no idea about the value of what is in either envelope, even if I did look in both?


 * It's easy to imagine a spectrum of cases where you have situations where you learn more and more about the alien currency. A realistic scenario is a child that is abroad with her parents and gradually gets more and more feeling for how much the alien currency is worth. Some things are much cheaper than at home while other things are much more expensive, so it's impossible to create an order preserving one-to-one mapping between the currencies based on experience alone. However, her knowledge increases over time even if it never becomes perfect. Every day we give her the opportunity to get money from one of two envelopes to spend. One containing twice the other. iNic (talk) 16:21, 21 November 2011 (UTC)


 * I would frame this problem in terms of expected utility. The actual utility of the actual amount in either envelope would not be known to me even if I look in both envelopes. Since it is not known I have to use my prior beliefs to come up with an expected utility. Since I have no idea what either envelope is worth, even if I look in both, by symmetry, my expected utility for getting either envelope is the same. Thus framed in terms of expected utility, the paradoxical argument cannot even be written down. Therefore there is no problem, hence no paradox, and hence no solution. We do not need solutions to non-existent problems.


 * In the case with the girl abroad you need to use the "zero knowledge" strategy to solve the problem in the beginning of her stay abroad, but at some point you have to switch over to your unified solution strategy. So you have another continuum problem around your neck here. When is her knowledge about the alien currency so good that you need to invoke the Unified Solution to avoid TEP? BTW, how do we measure "knowledge"? iNic (talk) 16:21, 21 November 2011 (UTC)


 * How do we measure "knowledge"? In decision theory it is measured through utility. We suppose that people behave self-consistently in their choices and decisions. Then it turns out that they behave *as if* they are making decisions according to computations of expected utility according to prior distributions and utility functions. And as they gain new knowledge their prior distributions are updated as if by Bayes' theorem. Utility is measured by comparison with some yardstick e.g. dollars. Would you exchange this opportunity to eat that bowl of fish soup for so many dollars? I did not invent this theory and I make no claims as to whether it is useful or not. I am also not an expert in it. But I do believe that your problems have been addressed before. Richard Gill (talk) 17:36, 25 November 2011 (UTC)


 * OK so you don't really believe 100% in utility theory? Well, you for sure believe in utility theory as soon as your ordinary mathematics breaks down, because then you rely 100% on utility theory. If it is true that it's so easy to model human behavior with utility theory as you say, how can utility theory model your current behavior and the fact that you still haven't decided if you want to keep the fish soup or go for the other hidden dish instead? The soup is starting to get quite cold now, so you better hurry up! Also, please explain how your prior distribution was updated (as if Bayes' theorem was used) when you learned that the dish you picked was fish soup. I really want to know this. And I also want to know the Bayesian updated utility for the other hidden dish, and how that utility is properly computed. It is OK if you assume that you are self-consistent in this case. iNic (talk) 01:35, 13 December 2011 (UTC)


 * PS there is no "continuum problem" here. As information grows expected utility changes. At each step in the "girl abroad" problem her expected utilities of the two alternatives are symmetrically distributed so my "unified solution" tells us what goes wrong in the TEP argument. But anyway we know that in a situation of complete symmetry there is no point in switching anyway. So there simply isn't a new problem in this "girl abroad" scenario.


 * OK let's say that she sees 512 monetary units in envelope A. Is this the case when she knows nothing and the symmetry argument needs to be applied directly to solve the paradox, or is this the case when she in fact has gotten some information and your unified solution needs to be invoked to solve the paradox? In short, is the number 512 a piece of information or not? iNic (talk) 01:35, 13 December 2011 (UTC)


 * I would definitely call the non-existence of the two alien envelopes problem a problem of philosophy, not of probability theory or decision theory. Richard Gill (talk) 00:42, 20 November 2011 (UTC)


 * In my book your solutions are firmly within philosophy, with substantial ingredients from both economic utility theory and psychology. iNic (talk) 16:21, 21 November 2011 (UTC)


 * PS, this also brings out a nice point concerning decision theory. *Absolute* utility does not need to exist, we only need to be able to talk about *relative* *expected* utility - relative to any arbitrary particular outcome. So to solve the unsolvable non-problem I do not need to hurt my head trying to figure out how much it would be worth to me to walk away with an envelope containing an unknown amount of alien currency of unknown value (or toxicity to humans, for that matter). I just need to know the difference between the expected utility of the two outcomes. By symmetry of my prior beliefs, the difference in expected utility between the two outcomes is zero. Richard Gill (talk) 00:52, 20 November 2011 (UTC)


 * OK here is another context for TEP. Let's say you are presented with two hidden dishes. You are being told that one dish is "twice" as good as the other. You point at one of them at random. The dish you picked turns out to be fish soup. Now you get the opportunity to take the other dish instead. All you know is that the other dish is either twice as good as the fish soup or half as good. What will you do and why? iNic (talk) 04:50, 20 November 2011 (UTC)


 * No one can know for me, that one dish is twice as good or half as good as another. This is a non-problem. The corresponding version of TEP: you have got 2 dollars and you can exchange it if you like, for equal chances of 1 and 4 dollars, is already a non-problem. Every person will solve this their own way depending on their personal utilities of 1, 2 and 4 dollars, and on their psychology (some people want to avoid the worst, some people are happy to maximize expectation values). Richard Gill (talk) 18:59, 25 November 2011 (UTC)


 * As usual, the "double or half" requirement can be dropped. In the case with envelopes and money any two different amounts of money will create the paradox. The same applies here. Any two dishes that you are not indifferent to is enough. We don't even need to know in advance which dish you prefer, only that you prefer one of them. So all the chef has to do is to make sure that you will like one of the dishes more than the other. This situation is for sure possible to arrange. If you really think your dollar problem is a good analogy you have already answered the question: you should switch and take the other dish. Decision theory says clearly that you should choose the equal chances for 1 or 4 dollars over the sure 2 dollars. This is very basic and if you don't accept this fundamental principle I don't understand why you find TEP interesting in the first place. iNic (talk) 15:28, 6 December 2011 (UTC)

Thanks again for your comments. What I am trying to do is to find a way to bridge the gap between the expert mathematicians and the general public. I want to try to find a simple and clear resolution of the paradox that is mathematically and philosophically sound. The closest we have to this at present is Falk, but I do not think she completely succeeds (and the version of her explanation proposed for the article is even worse), and I think we could do better.

This may be OR but it is in my user space, for the moment at least. If you are interested we could discuss the problem by private email but I quite like the wiki format for this kind of discussion.

There are some principles that I would like to employ to try to resolve the paradox simply. These are:

Start with the simplest possible version of the problem
This is not intended to be the usual version of the problem; it is just a starting point for discussion. Once the resolution of this problem is clear, I think it would be quite easy to extend it to other versions. As you point out, we can start with currency conversion.

Remove complicating philosophical discussions about the meaning of probability
I would, as far as possible, like to start with a version where the frequentist and Bayesian interpretations of the problem will be the same. Also I would like to remove utility/risk arguments from the discussion as well.

Make the resolution closely match the proposed line of reasoning
The objective is to find the error in the proposed line of reasoning. This objective is most obviously achieved if we stick closely to the original line of reasoning, analysing it step by step. Thus, for example, we do not consider exactly what expectation in the other envelope we are talking about until step 7.

Use numbers to represent possible values of A
This makes the distinction between A and a much clearer to most people. Everybody knows that 2 is never equal to 4. Martin Hogbin (talk) 11:51, 23 October 2011 (UTC)


 * Your principles are fine, I think, for an encyclopedic article which should be accessible to a big audience. And as far as I am concerned you are reproducing the solution of Falk, of Schwitzgebel and Dever, and of quite a few others, even if you use rather different words and don't recognise the fact that your solution is their solution. You are making exactly the same interpretation, you see, so there's no way your analysis of where the argument goes wrong can be different from theirs! (assuming that their analysis is not *wrong*). But no one has published a paper showing that these people were wrong.


 * A philosophical discussion about probability is going to have to come sooner or later, since many solutions imagine that the writer is using subjective Bayesian reasoning. And historically this was how the problem was born, quite explicitly. Also sooner or later you're going to have to work out the two possible choices for expectation or conditional expectation, since there are many solutions which are based on both of these options. Richard Gill (talk) 17:21, 25 October 2011 (UTC)


 * You say: "I want to try to find a simple and clear resolution of the paradox that is mathematically and philosophically sound". You'll only find a resolution which is conditional on making a certain interpretation of the intention of the writer. Lots of readers will automatically make different interpretations from you. So you have to make this point (the Anna Karenina point) clear early on, otherwise you will alienate readers who have a different background from you, and who automatically make different background assumptions. They'll say: "this solution is stupid because the writer is obviously computing a conditional expectation with respect to his prior subjective beliefs about the amounts ...". Historically they would be absolutely correct. Richard Gill (talk) 17:27, 25 October 2011 (UTC)
 * In my example the subjective beliefs are intentionally the same as the objective facts. The player and the reader know that there is £2 or £4 with equal probability in each envelope.


 * I am not suggesting that this is in any way a complete solution to all variations of the TEP but I do think it might be a logical place to start. We can then branch out into other versions from a common base.


 * In your paper you make much of the difference between E(B) which I take to be the expectation value of the unchosen envelope given no conditions, and E(B|A), which is not so clear to me. In my example, I assume E(B) must be £3.  What exactly does E(B|A) mean and what is its correct value in my version? Martin Hogbin (talk) 18:30, 26 October 2011 (UTC)


 * E(B|A) is the expectation value of what is in the second envelope given what is in the first, thought of as a random variable, in fact, a function of the random amount that is in the first envelope. In your situation, A=2 or 4 with equal probability 1/2, and B is 4 or 2 on the same events. Thus if A is known, B is known too. Therefore E(B|A)=B because B is a deterministic function of A. Going into details one could first compute E(B|A=2)=4, E(B|A=4)=2. Then by definition, E(B|A) is the random variable which takes on the value 4 if and only if A takes on the value 2 and takes on the value 2 if and only if A takes the value 4. It is for instance correct to write E(B|A)=6-A. It is also correct to write E(B|A)=4 I(A=2) + 2 I(A=4) where I(an event) denotes the random variable which equals 1 if the event is true, 0 if it is false. When you ask "what is the correct value of E(B|A)?" you are asking a question which shows that you don't understand the notation. E(B|A) is a random variable, not a fixed number. It is moreover a function of A. Just like A-squared or log(A), for instance. In this particular situation, B is a deterministic function of A. So the expectation of B given A is B itself, since B is fixed given A. So we can also write, entirely correctly, E(B|A)=B. Richard Gill (talk) 13:22, 29 October 2011 (UTC)


 * PS you don't assume E(B)=3, you derive or prove this result. Similarly we don't assume E(B|A)=B=6-A, we derive or prove these statements. We can do that in many ways. For instance, A+B=6 with probability one. Therefore E(A)+E(B)=E(A+B)=6. By symmetry E(A)=E(B). Therefore E(B)=6/2=3. Moreover, B=6-A therefore E(B|A)=E(6-A|A)=6-A=B. Richard Gill (talk) 13:26, 29 October 2011 (UTC)


 * PPS equality of two random variables means that they are always equal (or possibly: equal with probability one, unequal with probability zero). The equality of a random variable and a constant means that the random variable is constant with probability one. For instance A+B=6. When we write things like A=2 or A=4 we are referring to events and talk about the probability that A=2; or the random variable I(A=2). Why don't you read my essay on probability notation? It seems to me that collaborative editing on a topic from probability theory requires familiarity by all with elementary probability calculus and standard elementary probability notation. Richard Gill (talk) 13:31, 29 October 2011 (UTC)
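The statements E(B)=3 and E(B|A)=6-A derived above, together with the conditional probabilities P(A < B | A=a), can be verified by enumerating the two equally likely outcomes in the £2 and £4 setting. This is a minimal sketch; the function names `cond_exp_b` and `cond_prob_a_smaller` are made up for this illustration.

```python
from fractions import Fraction

# The £2 and £4 setting: (A, B) is (2, 4) or (4, 2), each with probability 1/2.
outcomes = [(2, 4), (4, 2)]
prob = Fraction(1, 2)

# Unconditional expectation E(B): a single number.
e_b = sum(prob * b for _, b in outcomes)

def cond_exp_b(a):
    """E(B | A = a), computed from the joint distribution of (A, B)."""
    mass = sum(prob for x, _ in outcomes if x == a)
    return sum(prob * b for x, b in outcomes if x == a) / mass

def cond_prob_a_smaller(a):
    """P(A < B | A = a)."""
    mass = sum(prob for x, _ in outcomes if x == a)
    return sum(prob for x, b in outcomes if x == a and x < b) / mass

print("E(B) =", e_b)
for a in (2, 4):
    # E(B | A = a) should agree with 6 - a; P(A < B | A = a) is never 1/2.
    print(a, cond_exp_b(a), 6 - a, cond_prob_a_smaller(a))
```

The loop shows E(B|A=a) agreeing with 6-a for both values of a, and the conditional probability that A is the smaller amount being 1 when a=2 and 0 when a=4, never 1/2.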

Rereading this whole page, I like your analysis very much. You create the most simple possible context for the problem, and show that within that context the meaning of A is unambiguous. And then you show that by keeping track of the meaning of A we can put our finger on the error. If the probabilities 1/2 and 1/2 are correct in step 7, then writing 2A and A/2 is an equivocation squared (it is a confusion of levels as well as of names). If the values 2A and A/2 are right then the probabilities 1/2 and 1/2 are wrong, they should be 1 and 0 or 0 and 1. These two cases correspond exactly to my unconditional versus conditional expectation split. In your context you think the second case is less plausible (more stupid) but I believe that when we move to the situation where the two amounts are unknown and probability also refers to our subjective beliefs about the two unknown amounts of money, the conditional interpretation becomes less stupid. And in the extreme case that we are so uncertain about x, the smaller of the two amounts, that we think it equally likely that it's ... 1/4, 1/2, 1, 2, 4, 8, ... pounds, then step 7 is actually correct (conditional interpretation): if we happened to see 8 pounds in envelope 1, envelope 2 would be equally likely to contain 4 or 16 pounds. And our expectation value of what's in envelope 2 would be (5/4)*8 which is more than 8. And since the same would be true whatever we saw in envelope 1, there is no point in looking in it: switching increases our expected wealth by a factor of 5/4. This appears paradoxical but isn't since our expected wealth is infinite anyway. Like it or not, this is where TEP came from - from the 40's and 50's of the last century when subjective Bayesian reasoning with improper priors (meant to correspond to colossal ignorance) was common, and uncontroversial, in various scientific fields. The prior in question is called the Jeffreys prior. Standard textbooks on Bayesian statistics will still recommend its use.
Richard Gill (talk) 16:29, 1 November 2011 (UTC)

Your solution would indeed be a very good way to start. You have taken the absolutely most simple *context* for the problem. Note that even in this most simple context, there are two ways to interpret the *intention* of the writer: is the expectation value supposed to be conditional or unconditional? It doesn't matter, both ways he is making silly mistakes. But already this context exhibits an Anna Karenina bifurcation: not only were the assumptions of the writer not made explicit, his intention was not made explicit either. The next most simple context would be to suppose a prior distribution over the smaller of the two amounts. You could first do this with some easy, concrete, example. For instance: suppose you know that the envelopes are filled as follows: two pounds or four pounds are put, with equal chance, in one envelope. Next, this amount is halved or doubled, with equal chance, and put in another envelope. Now the two envelopes are shuffled and you get to pick one. This specific example is used several times in the literature, hence it certainly does not constitute "own research" to be using it. In this example, the smaller amount can therefore be 1, 2 or 4 pounds and the larger correspondingly is 2, 4, or 8. The chances of these three possibilities are 1/4, 1/2 and 1/4. The possible values of A are 1, 2, 4, 8 and the chances of those values are 1/8, 3/8, 3/8, 1/8. I leave it to you to compute E(B) (that's just one number) and E(B|A) (that's a random variable and simultaneously a function of A). The calculations are simple and again we see that whether the writer intended to go for a conditional or an unconditional expectation value, he went wrong. After this you could show easily how any prior distribution with bounded support fails, on either interpretation of the intention. At this point you probably have done enough for the average reader, but for the academic readers you'll have to bring in the situation of an improper prior. 
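For readers who want to check the concrete example, here is a minimal sketch that enumerates it exactly. The function names are made up for illustration, and it assumes the filling procedure exactly as described (2 or 4 pounds with equal chance, then halved or doubled with equal chance, then a fair shuffle).

```python
from collections import defaultdict
from fractions import Fraction


def joint_distribution():
    """Joint distribution of (A, B) for the 2-or-4, halve-or-double example."""
    joint = defaultdict(Fraction)
    for first in (2, 4):                    # amount put in the first envelope
        for factor in (Fraction(1, 2), 2):  # second envelope: halved or doubled
            second = first * factor
            # Each (first, factor) combination has chance 1/4; the shuffle
            # splits it evenly over the two possible orders.
            joint[(Fraction(first), Fraction(second))] += Fraction(1, 8)
            joint[(Fraction(second), Fraction(first))] += Fraction(1, 8)
    return dict(joint)


def marginal_of_a(joint):
    """Probability of each possible value of A."""
    marginal = defaultdict(Fraction)
    for (a, _), p in joint.items():
        marginal[a] += p
    return dict(marginal)


def expectation_of_b(joint):
    """Unconditional expectation E(B): a single number."""
    return sum(b * p for (_, b), p in joint.items())


def conditional_expectation_of_b(joint):
    """E(B | A=a) for every possible a: a function of a."""
    marginal = marginal_of_a(joint)
    return {
        a: sum(b * p for (x, b), p in joint.items() if x == a) / marginal[a]
        for a in sorted(marginal)
    }


joint = joint_distribution()
print(marginal_of_a(joint))
print(expectation_of_b(joint))
print(conditional_expectation_of_b(joint))
```

Under these assumptions E(B) comes out as 27/8, the same as E(A), and E(B|A=a) is 2, 3, 4, 4 for a = 1, 2, 4, 8, so it is not equal to 5a/4 for any of the four values.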
I would do this as follows, using a "finite" approximation to the appropriate improper prior. Suppose the amounts of money are generated as follows. One of the squares of a chess board is chosen, all with equal probability. The squares are numbered from 1 to 64. The two amounts of money are 2 to the power of the chosen number, and twice that amount. So the smaller amount can be 2, 4, 8, ... 2 to the power 64 currency units, each with equal probability. Now do the calculations. Unless envelope A contains 2 currency units or 2 to the power 65 currency units, the other envelope *is* equally likely indeed to contain half or twice that amount. So E(B|A)=5A/4 unless A=2 in which case E(B|A)=4, or unless A= 2 to the power 65 in which case E(B|A)= 2 to the power 64. It's almost always to your advantage to switch. Would it be feasible for a casino to run this game? What should a player pay in order to participate in the game? Should they pay extra to have the opportunity to look in their envelope? This example too can be found in the literature. It's a variant of the famous Saint Petersburg paradox and many writers have noted that in this context the resolution of the paradox is the same as the resolution of Saint Petersburg: you are always disappointed when your expectation value is infinite. Richard Gill (talk) 08:03, 5 November 2011 (UTC)
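The chess board calculation can be checked with a short sketch. It assumes the squares are numbered 1 to n and that square k puts 2 to the power k in one envelope and twice that in the other, followed by a fair shuffle; the function name is invented for the illustration.

```python
from fractions import Fraction


def chessboard_conditional_expectation(n_squares=64):
    """E(B | A=a) for every possible a on an n-square board."""
    p = Fraction(1, 2 * n_squares)  # chance of each (square, order) combination
    joint = {}
    for k in range(1, n_squares + 1):
        small, large = 2**k, 2 ** (k + 1)
        joint[(small, large)] = p   # envelope A holds the smaller amount
        joint[(large, small)] = p   # envelope A holds the larger amount
    result = {}
    for a in sorted({a for a, _ in joint}):
        mass = sum(pr for (x, _), pr in joint.items() if x == a)
        result[a] = sum(b * pr for (x, b), pr in joint.items() if x == a) / mass
    return result


cond = chessboard_conditional_expectation()
exceptions = [a for a, e in cond.items() if e != Fraction(5 * a, 4)]
print(exceptions)  # the values of a where E(B | A=a) differs from 5a/4
```

With these assumptions the only exceptions are the two end values of A: at the smallest value switching is certain to double your money, and at the largest it is certain to halve it.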
 * Thanks for your comments Richard. What you have started was roughly what I had in mind when I talked about expanding this version to more complex ones.


 * I think that this approach also shows that there are no really simple or obvious solutions, even for this simplified version of the problem. This is important for the article for it shows that the claimed 'simplest' solutions are incomplete.  Falk may well have known what she meant in her 'simple' solution but she does not make it very clear.  I think the versions of her solution proposed for the article have been simplified to a degree where they are meaningless.  Martin Hogbin (talk) 16:00, 6 November 2011 (UTC)


 * I tend to agree with you Martin. I think that the two main solutions - are we trying to show E(B) > E(A), or are we trying to show E(B|A=a) > a for all a? - are pretty much equally complicated. Also I think that whether we are working with fixed amounts of money in the two envelopes, or a probability distribution over these amounts representing our prior beliefs about what they could be - the two main solutions remain pretty much the same and remain pretty much equally complicated. And as I've said they all revolve around the same two, complementary, facts: proper prior beliefs about A would necessarily change on being informed of the ordering of A and B; and prior beliefs about the ordering of A and B would necessarily change on being informed of the value of A, at least for some of those possible values. Richard Gill (talk) 00:20, 20 November 2011 (UTC)

The TEP and Moon landing conspiracy theories
The paradox resolution given above works for the simple case where there are two known fixed sums in the envelopes. It would seem that it is quite easy to extend it to two arbitrary finite sums by a simple scaling. How then do strange distributions attempt to circumvent the solution given above and thus re-create the paradox?

It is up to the resolver of the paradox to show exactly where the error (or errors) lies in the initially proposed line of reasoning. That is what I hope is done above.

Anyone who hopes to re-create the paradox now has to show how they would circumvent the error, just as those who claim the moon landings were a hoax have to explain exactly what they think was, in fact, done. I have never yet seen a cogent and complete explanation of how the claimed hoax was perpetrated. Those that try usually have obvious errors in their stories.

I think the situation will be the same with the TEP. We agree that the resolution above works for two fixed sums in the envelopes. How then does the use of the chess board distribution get round the resolution above? It could, for example, claim to circumvent step 7:

''7 Is a failed attempt to calculate an expectation value because it neglects to take into account the dependence of the probabilities on the value A turns out to have. We need to calculate 2 * P(A=2) + 4 * P(A=4) but we do not actually do this. In the simple version of the problem there is no way round this error thus the paradox is resolved here if it has not already been''

The chess board (or its bigger brother) distribution is such that nearly always the probability that the original envelope is the smaller will not depend on its value. But 'nearly always' is not 'always' and when it goes wrong it goes really wrong (just like a Martingale (betting system), not the probability theory Martingale). This swings things back in favour of not swapping and thus the paradox is again resolved. Martin Hogbin (talk) 23:03, 20 November 2011 (UTC)


 * The paradox resolution you have offered for the special case of a fixed smaller amount of money essentially works for all cases! Nothing is essentially changed by going to strange distributions! Only at the extremely extremely strange case of supposing that *all* integer powers of 2 pounds are *all* exactly equally likely does step 7 become correct. But this case is so strange that most writers discard it as ludicrous. Nobody would seriously believe that. So we may discard that case.


 * But do please note that there are two interpretations of Step 7. It may be a failed attempt to show E(B) > E(A); or it may be a failed attempt to show E(B|A=a) > a for all possible a hence there is no need to look in the envelope in order to decide whether or not to switch. Going back to the history of TEP the second interpretation was the intended one. Only at some stage people started talking about TEP who didn't understand what they were talking about and the first interpretation started being proposed. Please, Martin, try to get your head around interpretation 2. Interpretation 1 is silly. The writer is not only equivocating between things but also between categories. He surely is not quite so stupid, even though many philosophers seem to think so. In both interpretations the writer is calculating the left hand side of the desired inequality by splitting the expectation over two cases: A < B, and A > B. So in both interpretations, two probabilities are needed, and two values are needed. In the first interpretation we need the two probabilities of A < B and of A > B (1/2, 1/2) and the two values we need are the conditional expectation values of B in each of those two situations. In the second interpretation we need the two conditional probabilities of A < B and of A > B both given that A=a, and we need the two values of B given each of those situations *and* given A=a (and these values are 2a and a/2). In one interpretation the probabilities are obvious, the values are tricky; in the second interpretation the values are obvious, the probabilities are tricky. In the first interpretation the writer imagines that the conditional expectation of B in those two cases is the same as its unconditional expectation. But in the first situation intuitively speaking it should be larger, in the second situation intuitively speaking it should be smaller! 
In the second interpretation the writer imagines that the conditional probabilities that A < B and that A > B given that A=a are both equal to 1/2 whatever the value of a. Intuitively speaking, this can't be the case either. Intuitively, the larger a the more likely it must be that B < A given that A=a. Indeed, what I said intuitively must be true, is indeed true, except for some subtleties concerning infinities, which from a practical point of view can be ruled out, and from a theoretical point of view can be got around too. In other words these subtleties are only interesting for theoreticians and do not actually change the conclusions. In both interpretations, as long as we are using proper probability distributions for the amounts of money concerned, what the writer is imagining is wrong. NB: the special case that the two amounts of money are fixed at 2 and 4 pounds is a special case of a proper probability distribution: the probability distribution of the smaller amount of money is "degenerate", it is constant and known, equal to 2. That is still a probability distribution. For instance, with the chess-board distribution, the conditional probabilities that A < B and that A > B given A=a do depend on a. They are equal to 1/2 and 1/2 for many values of a but *not* for two values of a, the smallest and the largest. And again with this distribution, the expectation values of B given that A < B and given that A > B are easily seen to be different from one another and from the unconditional expectation value of B. Just as you would expect, E(B | A < B ) > E(B) > E(B | A > B). Exercise for the reader: compute these three numbers.

Exercise
I do not know how to do maths notation on WP but have calculated these three values for some different cases: For the distribution {1,2}  E(B | A < B ) = 2,  E(B) = 1.5,  E(B | A > B) = 1

As the distribution gets longer E(B | A < B) is always twice E(B | A > B). Since the two cases are equally likely, E(B) is the average of these two values, so E(B | A < B) stays at 4/3 of E(B) and does not converge on it.
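The three numbers in the exercise can be computed mechanically. The sketch below is only an illustration: it assumes the smaller amount is drawn uniformly from a given list, the other envelope holds twice that, and envelope A is picked by a fair coin. The function name is invented here.

```python
from fractions import Fraction


def three_expectations(smaller_amounts):
    """Return (E(B | A < B), E(B), E(B | A > B)) for a uniform prior
    over the listed smaller amounts."""
    p = Fraction(1, 2 * len(smaller_amounts))
    triples = []
    for x in smaller_amounts:
        triples.append((x, 2 * x, p))  # A smaller, B larger
        triples.append((2 * x, x, p))  # A larger, B smaller

    def mean_of_b(event):
        mass = sum(pr for a, b, pr in triples if event(a, b))
        return sum(b * pr for a, b, pr in triples if event(a, b)) / mass

    return (
        mean_of_b(lambda a, b: a < b),
        mean_of_b(lambda a, b: True),
        mean_of_b(lambda a, b: a > b),
    )


print(three_expectations([1]))           # the fixed pair {1, 2}
print(three_expectations([1, 2, 4, 8]))  # a longer distribution
```

For the fixed pair {1, 2} this gives 2, 3/2 and 1. For the longer list the three values are 15/2, 45/8 and 15/4, so under these assumptions they keep the same 4 : 3 : 2 proportions.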

Back to Richard

 * It's a simple mathematical theorem that this strict inequality is always true if E(B) is finite, while if E(B) is infinite, expectations are disappointing anyway, and no guide to decision. In any case, the conditional probability distribution of B given A < B, and the conditional distribution of B given A > B, and the unconditional probability distribution of B, are all different, and are all ordered in a natural probabilistic way: for some b we have the strict inequalities Prob(B < b | A < B) < Prob(B < b) < Prob(B < b | A > B) and for all b the inequalities hold though maybe not strictly.  Thus when A < B, B tends to be larger, when A > B, B tends to be smaller, than without any condition. Going back to your remarks on the chess board problem, nearly always is indeed not always, and when it goes wrong it goes really really wrong. And this is a general result. For the experts one can add some further discussion of the case of infinite expectation values, or, even more extreme, of improper priors (which so far I excluded, but which actually in the history of TEP is how it started). When you look at the history of TEP you will see that all the authorities are happy to say that infinite expectation value is not realistic and that improper distributions are ludicrous (I refer here in particular to Schrödinger and to Littlewood - both very very famous guys, and they were not wrong in this particular case). Everything I have said can be found in the literature, no Own Research is needed. All that is needed is a good overview so that one can write a coherent article. Remember: TEP was invented by highly skilled mathematicians in order to tease non-mathematicians! They succeeded very well, it seems, in fooling generations of philosophers and very many amateurs including many editors of wikipedia. Sad. This is why TEP is annoying. It is written in order to confuse, not in order to educate. Quite different from MHP which is a *wonderful* paradox. 
You can learn so much from it. From TEP you can learn nothing. Except that you ought to be careful to define what you are talking about and to specify what you are trying to do and what rules you are using, so as to avoid making beginners' mistakes. Since whether we are talking about interpretation 1 or interpretation 2, we are talking about a simple beginners' mistake. A mistake which would be avoided if the beginner would only take care to distinguish between random variables and values they can take. Which would be easy if only he'd follow the standard advice of using capitals versus small letters. It's all so silly. These are the truly stupid mistakes which we patiently have to get our students to stop making in the first couple of weeks of the first probability course. MHP is much more full of content, real interesting content, thought provoking content. Richard Gill (talk) 20:35, 21 November 2011 (UTC)


 * It's funny that we have totally different views here. In my view MHP is ridiculously simple and doesn't teach us anything we didn't already know, while TEP is really deep and thought provoking. Its true solution will have far reaching consequences. iNic (talk) 10:49, 28 November 2011 (UTC)
 * I agree with Richard here. The TEP is a conjuring trick, an attempt to befuddle the mind with non-sequiturs.  We have to try to guess what the other person means before we can tell them what is wrong with it. Martin Hogbin (talk) 19:06, 28 November 2011 (UTC)

Richard, in my resolution I currently say:

When we come up with the figure of 1/2, we can be talking about one of two things:

6a The conditional probability that if A is constrained to be £2 then B (a random variable representing the sum in the other envelope) will be £4 and if A is constrained to be £4 then B will be £2.

''6b The unconditional probability that, whatever the value of A, B may be twice or half its value. This statement is clearly untrue, thus, if the argument is understood this way, the paradox is immediately resolved. Statement 3 is wrong. (R4)''

This step is correct so long as we recognise that the two probabilities of 1/2 are conditional probabilities, and the condition is different in each case.

As far as I can see, 6b is pretty close to your second interpretation of what the question is supposed to mean. Could it be reworded to cover your interpretation 2 properly? For example:

''6b The probability that, for every possible value of A, B may be twice or half its value. This statement is clearly untrue, thus, if the argument is understood this way, the paradox is immediately resolved. Statement 3 is wrong.''? Martin Hogbin (talk) 18:54, 22 November 2011 (UTC)


 * Dear Martin, I think your wordings are confused; 6a and 6b, to my mind, mean the same.


 * On second thoughts, 6b is meaningless. Unconditional probability is irrelevant here, the word probability is inappropriate. Either it is true that whatever A could be, B could be half or twice that amount. Or it is not true. And whether or not it is true is not particularly relevant to any interpretation of TEP. Richard Gill (talk) 23:52, 23 November 2011 (UTC)
 * Yes, I understand and agree. I expressed it wrongly.  Remember Anna Karenina though.  We are trying to find the possible fault in someone else's muddled thinking.


 * So you do want 6b and 6a to mean the same thing? You wrote 6b The probability that, for every possible value of A, B may be twice or half its value. My reaction to that is that your sentence is possibly ambiguous because you are not distinguishing between random variables and possible values thereof. And from the point of view of conventional probability theory, I can only make sense of your sentence by adding a few words:  6b The conditional probability that, for every possible value a of A, B may be twice or half that value. Richard Gill (talk) 18:53, 24 November 2011 (UTC)

Interpretations of statement 6

 * I should have said that statement 6 could be taken to mean, 'For every possible value of A the probability that the other envelope contains 2A is 1/2'. Martin Hogbin (talk) 10:44, 24 November 2011 (UTC)


 * So you mean: for every possible value a of A, the conditional probability, given A=a, that the other envelope contains 2a, is 1/2. Note explicit distinction between random variables and values thereof, and explicit distinction between unconditional and conditional probabilities. Richard Gill (talk) 18:56, 24 November 2011 (UTC)


 * Yes, exactly that. As you say below, for all a, Prob(A < B | A=a ) =1/2.  This I think is one likely, but maybe unwitting, interpretation of statement 6 by the layman.  Would you agree that, in natural language, it could be expressed as, 'For every possible sum of money that might be in the first envelope, the probability that the other envelope will contain twice as much is 1/2'?  Martin Hogbin (talk) 20:49, 25 November 2011 (UTC)


 * Exactly: For every possible sum of money that might be in the first envelope, the probability that the other envelope will contain twice as much is 1/2 (the word "conditional" is implicitly attached to "probability"). You say that this is a likely but maybe unwitting interpretation by a layman. I am pretty sure it was the intended interpretation by the writer! Look at the history of the paradox! Richard Gill (talk) 08:27, 30 November 2011 (UTC)


 * But you want them to mean two different things. You must learn to distinguish the two statements


 * Prob(A < B) = Prob(B < A) =1/2


 * and


 * for all a, Prob(A < B | A=a ) = Prob(A > B | A=a) = 1/2


 * At the moment both 6a and 6b seem to me to be attempts to make the second of these two claims. (BTW, if you don't like my formulas, just read in words "the probability that" or "the conditional probability that ... given ..." and note the distinction between random variables (capital A, B), and possible values thereof (a, b).) Probabilities and conditional probabilities are properties of events, and events are described in terms of relationships between random variables and/or numbers. Richard Gill (talk) 17:12, 23 November 2011 (UTC)


 * Now the first of my two statements is, clearly, true. The second is, clearly, not true in your toy example where a can only be 2 or 4 and "Prob" refers only to the chance involved in the choice of envelope (not also to uncertainty as to possible contents of the pair of envelopes). In the reliable sources you will see many writers showing that the second is not true for particular choices of prior probabilities of possible (smaller) amounts. For instance: suppose all values between 1 and 100 are equally likely. My little theorem, "the unified solution", shows that for any proper prior distribution, the second statement is untrue. And my theorem also covers your example since the case when we know that the smaller amount is 2 is just one particular proper prior distribution. My "unified solution" also shows a second fact: for any proper prior distribution, the probability distribution of A given that B < A is different both from the unconditional distribution and from the distribution given B > A. In fact these distributions are strictly stochastically ordered (I expect there is a wikipedia article on stochastic ordering): there always exists an a such that Prob( A > a | A < B ) < Prob( A > a ) < Prob( A > a | A > B ). And for all a these three probabilities are ordered in the same way, except not necessarily strictly. It is possible for some a that we have equality. This says that when we know the ordering of A and B, it makes A, as you would expect, tend to be larger or smaller, in probabilistic terms, than if you don't know the ordering. This deduction by the way is completely new - I suddenly realised that it came out of the proof of my "unified solution" in discussions with iNic. We need your 6a and 6b = the second of my two statements to defuse the paradox when we are supposing that the author is trying to show that E(B | A=a) >a for all a. 
We need the other result I have told you about, the fact that the distribution of A differs in the obvious way (giving more weight to smaller or larger values) when we know that A is larger than or know that it is smaller than B, when we are trying to defuse the paradox in the case that we suppose that the author is trying to show that E(B) > E(A). My unified solution applies to your case (amounts in two envelopes known in advance) and to the case of a proper prior distribution of possible values. This just leaves the possibility that the author was actually an improper subjective Bayesian whose uncertainty about the smaller amount of money x is so colossal that his prior beliefs about x can be described by giving log(x) a uniform distribution over the whole real line. Schrödinger and Littlewood both said that this is obviously completely ludicrous. But anyway for those that do want to make this assumption, which takes one outside of conventional probability theory, the solution is now that this level of uncertainty makes your expectation value infinite (or zero or undefined) and hence you are disappointed by any finite value. Expectation values are no use for determining behaviour. PS. The special proper prior distributions which do make E(B | A=a) >a for all a necessarily make E(A) and E(B) infinite and can be dealt with in the same way as the improper uniform prior of log( x ). The alternative way to deal with prior distributions with infinite support (no finite upper bound) is to realise that the utility of money decreases as we get more and more of it, finally stabilizing at the finite utility of having an infinite amount of money. Instead of working with the amounts of money A and B we should work with the corresponding pair of utilities thereof, C and D. 
This pair has a symmetric joint distribution and finite expectation values, so my unified solution applies directly to it (the unified solution does not make any use of the restriction that A and B differ by a fixed factor 2. This is just a diversionary tactic of the inventor of TEP. Just like in MHP talking about "say door 1" and "say door 2" is a diversionary tactic to mislead your victim).  Richard Gill (talk) 17:38, 23 November 2011 (UTC)
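The two facts can be checked numerically for the prior mentioned above in which all values between 1 and 100 are equally likely. This is a rough sketch, not a proof: it assumes the smaller amount is a whole number from 1 to 100, and the helper names are invented for the illustration.

```python
from fractions import Fraction


def joint_uniform(max_small=100):
    """(a, b, probability) triples: smaller amount uniform on 1..max_small."""
    p = Fraction(1, 2 * max_small)
    triples = []
    for x in range(1, max_small + 1):
        triples.append((x, 2 * x, p))  # envelope A holds the smaller amount
        triples.append((2 * x, x, p))  # envelope A holds the larger amount
    return triples


def prob(triples, event, given=lambda a, b: True):
    """Conditional probability of `event` given `given`."""
    mass = sum(p for a, b, p in triples if given(a, b))
    hit = sum(p for a, b, p in triples if given(a, b) and event(a, b))
    return hit / mass


triples = joint_uniform()
values_of_a = sorted({a for a, _, _ in triples})

# Second statement: is Prob(A < B | A=a) equal to 1/2 for every a?
p_smaller = {
    a: prob(triples, lambda x, y: x < y, lambda x, y, a=a: x == a)
    for a in values_of_a
}
second_statement_holds = all(p == Fraction(1, 2) for p in p_smaller.values())


def ordering_at(t):
    """Prob(A > t | A < B), Prob(A > t), Prob(A > t | A > B)."""
    low = prob(triples, lambda x, y: x > t, lambda x, y: x < y)
    mid = prob(triples, lambda x, y: x > t)
    high = prob(triples, lambda x, y: x > t, lambda x, y: x > y)
    return low, mid, high


always_ordered = all(l <= m <= h for l, m, h in map(ordering_at, values_of_a))
strict_somewhere = any(l < m < h for l, m, h in map(ordering_at, values_of_a))
print(second_statement_holds, always_ordered, strict_somewhere)
```

For this prior the conditional probability is 1/2 only for even a up to 100; it is 1 for odd a and 0 for a above 100, so the second statement fails, while the three probabilities are ordered for every threshold and strictly ordered for some.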

Discussion with Gerhard
Martin, as Richard reiterates, I am no mathematician. Just let me say that reality shows that neither step 1, nor steps 1 to 6, nor step 7 ever are correct "in general". Steps 1 to 7 do not represent "reality". Proven millionfold. Even step 1 never is correct in reality, it is a pure "lie". As well as steps 2 and so on. But steps 1 to 6 and even step 7 are fully correct if you take envelope A to contain any determining amount that forces envelope B to contain either double or half the amount of envelope A with the same probability. Once more: only if you intentionally choose this envelope knowingly containing the determining amount, in every game. Then even step 7 is absolutely correct and by swapping to envelope B you get 1 1/4 the contents of envelope A. Proven millionfold, also. But not knowing about "determining" and "dependent" in choosing your envelope, i.e. with symmetry, the proportion scale of all  "A's" to "B's"  is exactly 1:1. So, if there is symmetry, e.g. if you should not know which envelope contains the determining amount, all steps from step 1 on are "pure lies". Since the reasoning  "E(B) = (2A + A/2) / 2 = 1 1/4 A"  carelessly does not regard the changing size of the initial value of A, and while that "mistake" is not resolved, I say that it can only be allowed if, at the same time, it is crosschecked by the test  "E(A) = (2B + B/2) / 2 = 1 1/4 B",  otherwise it cannot be admissible. (-:

Please keep in mind that your reasoning about the proportion scale of contents of envelopes A and B to be 1:2 or 2:1 can never change the actual total of (A+B). So my question: could the theorem be based on this total, serving as a corrective? Gerhardvalentin (talk) 17:05, 11 December 2011 (UTC)


 * Let us start at the beginning. Do you mind starting with my simplified example in which we know there is £2 in one envelope and £4 in the other?


 * You say step 1 is wrong. It says, 'Denote by A the amount in the selected envelope'.  My first question is,'What do you take 'A' to be?'. I give some suggestions with comments:


 * 1a It could be a random variable from the sample space (2,4).


 * 1b It could be the unconditional expectation value of the envelope that you hold, in that case, we know its value, it is 3.


 * 1c A constant, an ordinary fixed number. If that is the assumed meaning the paradox is already solved because an ordinary number cannot have two values. The problem is then one of obvious arithmetical equivocation (R1).


 * 1d A constant representing the sum in the original envelope, which might be £2 or £4. No such animal! This is a mangled attempt to define a random variable and again, if this is the assumed meaning, the paradox is resolved right here (R2). Martin Hogbin (talk) 21:33, 11 December 2011 (UTC)


 * Thank you Martin. Did you read there? I don't think I can understand all that you say, but I just KNOW that all steps from 1 to 8 are fully correct if you always choose the envelope with the "determining amount". But if there is symmetry and you do not know which envelope is the determining one, then - no matter what the content of A or B will be, whether there is "any" fixed scale X:Y or any difference (+X resp. -X) - if you choose an envelope randomly and think of swapping, then on average you will get your original amount 1:1. No way out, millionfold proven. If you are unaware which envelope is containing the determining amount resp. the dependent amount, then on average the proportion is 1:1. That's a fact. The term "the other envelope contains 2A or 1/2 A with equal probability" has no effect at all if A is not the determining amount and, knowing that it is the determining amount, you intentionally have chosen it for this reason. If you choose an envelope at random, then this term is never valid at all. Then you know from the outset that the other envelope is likely to hold on average the same amount that you have in envelope A. That's a fact, and the reason seems to be that you do not know whether you chose a determining amount (in one of three cases) or a dependent amount (in two of three cases, i.e. including an envelope with the maximal contents). You can show by experiment that then on average A=B. The term is only valid if you are talking of the determining amount. That's a fact, and that's all for the moment. Regards, Gerhardvalentin (talk) 22:16, 11 December 2011 (UTC)
 * Martin please tell me: What is the real "problem" of the paradox that says "E(B) = 5/4 A"?  What "has to be solved" regarding that obvious paradox? This "5/4 A" is quite correct under condition that the step 2 of the switching argument, saying "The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2" should really be true, i.e. that it is "given" that there really exists the "possibility" of a real "existence" of an envelope containing twice and one containing half. Only if this possibility really is known to be "given". But not if that never might be expected to be given. And we know this can only be true in asymmetry, if the contents of A is known to be the determining amount, forcing B to really "exist" with content of either 2A or A/2 with "probability 1/2" each, as only in this one case the contents of envelope A will be granted to be a fixed amount, with 2B and B/2 basing on that same value. Exactly as "E(B) = 4/5 A" can be true only in asymmetry, if A is known to contain any dependent amount (both taking step 9 ad absurdum). But if there is no "reason" to suppose that there really could exist a second envelope containing twice that amount, it is nonsense to consider it as a "possibility". Then you better say: If A contains the smaller amount, then B will contain twice as much and will contain 2/3 of the total of that pair of envelopes. And if A contains the larger amount, then B will contain half as much as A and will contain 1/3 of that total. So A as well as B in 1/2 of cases are expected to contain 2/3 (A+B), and in the other 1/2 of cases are expected to contain 1/3 (A+B). So A as well as B will have the same expected value. What is to be "solved" in the paradox? Regards, Gerhardvalentin (talk) 15:22, 12 December 2011 (UTC)
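The contrast between the "determining amount" setup and the symmetric setup can be illustrated with a small exact calculation. This is a sketch under stated assumptions: the determining amount d is fixed, the other envelope gets 2d or d/2 on a fair coin, and in the second setup the envelopes are then shuffled fairly. The function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
QUARTER = Fraction(1, 4)


def expected_values(outcomes):
    """Return (E(A), E(B)) for a list of (a, b, probability) outcomes."""
    e_a = sum(a * p for a, b, p in outcomes)
    e_b = sum(b * p for a, b, p in outcomes)
    return e_a, e_b


def determining_setup(d):
    """Envelope A always holds the determining amount d."""
    return [(d, 2 * d, HALF), (d, d * HALF, HALF)]


def shuffled_setup(d):
    """Same filling, but the envelopes are shuffled before you pick A."""
    return [
        (d, 2 * d, QUARTER),
        (2 * d, d, QUARTER),
        (d, d * HALF, QUARTER),
        (d * HALF, d, QUARTER),
    ]


print(expected_values(determining_setup(4)))
print(expected_values(shuffled_setup(4)))
```

With d = 4 the first setup gives E(A) = 4 and E(B) = 5, which is the 5/4 gain, while after shuffling both expectations are 9/2 and swapping gains nothing on average.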

The problem is to explain exactly where the error occurs in the proposed line of reasoning. To do this you need to go through it step by step until you can say that a particular step is wrong. So let me start at the beginning again. What is wrong with step 1? Martin Hogbin (talk) 22:15, 12 December 2011 (UTC)
 * Thank you for your response, Martin, and I really appreciate your effort. The switching argument says in step 1: "I denote by A the amount in my selected envelope", and this saying in step 1 does expressly imply "no matter whether the content in envelope A will be (A+B)/3 or 2(A+B)/3, from now on I am going to denote it as 'A'". So far, no "error". But I find it advisable to bring this to our attention.


 * But then, step 2 says "The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2.".  And here I firmly have to contradict this statement of "supposedly exactly 1/2 to 1/2", and I have to underline that this is a false, plainly wrong and therefore misleading statement, as this never can nor will be true in a symmetric world, but only in the extremely asymmetric case that you explicitly know that the contents of envelope A has actively been determining the actual contents of envelope B to be 2A or A/2 in 1/2 of cases each, as in Nalebuff's Ali Baba example the contents of Ali's envelope has been actively determining Baba's envelope to contain either 2A or A/2. Otherwise the statement of step 2 is a pure fallacy, as reality shows. Gerhardvalentin (talk) 01:09, 13 December 2011 (UTC)

Let us just stick with step 1 for the moment. What kind of quantity do you think the writer intends A to be?

1a A random variable from the sample space (2,4).

1b The unconditional expectation value of the envelope that you hold, in that case, we know its value, it is 3.

1c A constant, an ordinary fixed number. If that is the assumed meaning the paradox is already solved because an ordinary number cannot have two values. The problem is then one of obvious arithmetical equivocation (R1).

1d A constant representing the sum in the original envelope, which might be £2 or £4. No such animal! This is a mangled attempt to define a random variable and again, if this is the assumed meaning, the paradox is resolved right here (R2).

1e Something else? If so, what? Martin Hogbin (talk) 18:03, 14 December 2011 (UTC)
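A minimal sketch of readings 1a and 1b, assuming the simple case used on this page (the envelopes hold 2 and 4, and a fair coin decides which one becomes envelope A). The names `outcomes` and `expected_b_given_a` are illustrative only and not part of the original argument.

```python
from fractions import Fraction

# Assumed setup: the two envelopes hold 2 and 4, and a fair coin
# decides which of them becomes envelope A.
outcomes = [(2, 4), (4, 2)]          # (value of A, value of B)
prob = Fraction(1, 2)                # each outcome is equally likely

# Reading 1b: unconditional expectations of A and B.
expected_a = sum(prob * a for a, b in outcomes)
expected_b = sum(prob * b for a, b in outcomes)

def expected_b_given_a(a_value):
    """Conditional expectation E[B | A = a_value] by direct enumeration."""
    matching = [(a, b) for a, b in outcomes if a == a_value]
    total_prob = prob * len(matching)
    return sum(prob * b for a, b in matching) / total_prob

print(expected_a, expected_b)                        # 3 3
print(expected_b_given_a(2), expected_b_given_a(4))  # 4 2
```

Under reading 1a, A takes the values 2 and 4 with probability 1/2 each, and both envelopes have the same unconditional expectation of 3.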

Hitherto "4/5 : 1" resp. "1 : 5/4". Two indistinguishable envelopes, each of which contains a positive sum of money. One envelope contains exactly one thousand times as much as the other, so for example "1'000 : 1 Million", or "1'000 : 1". You know that by swapping you will either increase your amount a thousandfold or reduce it to one thousandth, and you are told that, on average, by swapping you can expect to get five hundred times as much (1 : 500.0005). Exactly 1:1, and never "1 : 500". The theorem's view is in one direction only. In this example, 25 times you will win small amounts and 25 times you will lose small amounts, 1:1. And 25 times you will win large amounts and 25 times you will lose large amounts, 1:1 also, everything completely symmetrical. Regards, Gerhardvalentin (talk) 21:57, 15 December 2011 (UTC)
 * The theorem's view is asymmetric in one viewing direction only.  Viewing directions might make a difference.
 * Here's a variant with a slightly changed basic setup:
 * In 100 "games" you will be going to win 50 times and will be losing 50 times, and after 100 swaps you are holding exactly  1:1  the amount that you already had before swapping.


 * Gerhard, I am trying to do this one step at a time. At the moment I am trying to discuss step 1, as I am doing with iNic. What kind of mathematical entity do you think the proposer of the paradox intends 'A' to be? Can we get that settled first? Martin Hogbin (talk) 10:32, 16 December 2011 (UTC)

And take into consideration that the amount of the difference of doubling or halving within this given pair of two envelopes will be exactly the same (amount).
 * Yes Martin, so what signifies "A"?  And what signifies "the larger amount", and what is "the smaller amount"?
 * The reasoning of the swapping argument is fully ignoring the given interdependence of A and B within an unchangeable and inseparably tied pair of two envelopes, comprising together a total of "3X".  The reasoning was better to say:
 * If A=(B/2) then swapping from A to B within this pair will double it to B, but if A=(2B) then swapping from A to B within this pair will halve it to B.
 * So yes, the paradox is a based on ignoring the significant prerequisite of "what is A in a pair of two interdependent envelopes"? That unsolved problem should be solved. Gerhardvalentin (talk) 10:30, 17 December 2011 (UTC)

Gerhard, you are still missing the point. The objective is not to show that you should not swap. That is easy to do and can be done in one word, 'symmetry'.

The objective is to show exactly where the error lies in the proposed line of reasoning. You therefore need to go through the proposed reasoning line by line and state that this or that line is wrong. Martin Hogbin (talk) 21:45, 18 December 2011 (UTC)


 * The whole switching argumentation is a flimsy error, from step 1 on, talking of A and forgetting about B. Step 9 is formulated as a ridiculous joke, but that insight should already be considered from step 1 on. In the one-sided viewing direction of the swapping argument, which forgets about the interdependency of the two paired envelopes, a picture is inevitably presented that, in this form, never exists in general. It pictures only a tiny "subset" of patterns that exactly suit the AliBaba-variant, where 5A/4 is fully correct. But the world does not consist of such a tiny "subset" of one-sidedly arranged "determining A" and "dependent B". This small subset is said to represent the world. In real life you don't have "determining" and "dependent", which the theorem makes us believe to be the only possible arrangement. So it is clear that it shows only one small subset of possible "arrangements", showing only one single aspect. And this aspect is not valid for one single envelope alone, forgetting about the other envelope; in every step it is simultaneously and equally valid for the second envelope also. Full stop. If you declare it to be valid for A, then at the same time it is also valid for B. You cannot declare that there is only one possible aspect and neglect the counterpart. As a tiny subset, if admitted to be a tiny subset of all possible arrangements, the argument is fully correct. But to be valid "in general" it fails to mention the fact that it is based on one small subset of patterns only, neglecting the equally valid counter-view. So: each and every step has to say "this is only one possible aspect that is equalized by the counter-view regarding the second envelope". Once more: the argument now shows only one small subset of possible arrangements, without admitting this. It should clearly say that it is only valid in the AliBaba asymmetry, but not "in general" for all possible arrangements.
For general validity it should be shown that every step must be compensated by the counter view from the second envelope, and it has to admit that it, in the laconic form, never can be "valid" in general, for all possible arrangements, in a symmetric world. That's the "mistake". Regards, Gerhardvalentin (talk) 22:56, 18 December 2011 (UTC)

Gerhard, it is not a theorem it is a mathematical trick, designed to confuse. Martin Hogbin (talk) 00:29, 19 December 2011 (UTC)
 * You are right Martin, I change it to "switching argumentation". Gerhardvalentin (talk) 05:34, 19 December 2011 (UTC)

The TEP is no paradox
. . . it's the predictable and logical consequence if we infringe the Bayes' rule

Martin, what is your opinion? The switching argument in the TEP never uses the word "asymmetry", but – from step 1 on – always looking into just one direction and until step 8, while completely missing and ignoring the counter-view into the other direction, creates extreme asymmetry. I say that the TEP, from step 1 on chewing on just one envelope only, while completely forgetting about the other is no "paradox" at all. It's just the predictable logical consequence of never strictly following the Bayes' rule. In not using all of our given knowledge, but in a rather sloppy manner purposely using just a small and one sided subset of this knowledge only, and carelessly neglecting the rest of our knowledge. Bayes would turn himself in his grave if he knew that. Concealing that you as well could lose double the amount that could be gained, only in just simple-minded sneaky saying that swapping from A (why A?) to B (why B?) will always gain double the amount that could be lost. In saying so, as a consequence, we can never expect to get an acceptable rational result. Then we will get an imprecise approximate result only, and we will have to face and accept divergency, as a consequence. This means for the TEP that the result of 5A/4 is only fully correct and valid exclusively in the small subset of cases that we exactly "know" for sure that A asymmetrically contains the determining amount while B effectively is to contain any asymmetrical dependent amount. For only in this asymmetric case it is "known" that B really is equally likely to contain 2A as well as the half of that same amount called "A" (A/2) and you will gain therefore twice of what you can lose. Otherwise this never will be "known" but is an unproven and mere "could be"- guesswork-assumption. And, as to be expected, this result is much too imprecise to be valid in general. 
For if A (as in the TEP) "is not positively known" to asymmetrically contain the determining amount, but also "could" contain a dependent amount, or if nothing is known about "determining or dependent" as in the TEP, then this result can never be accurate, for it is never given that if you gain, you will in any case gain twice what you can lose. Quite the contrary, in this case you can also lose twice the amount that you can win, so leading to symmetry. Quite contrary to the "5A/4"-asymmetry. We must admit that we already know much more from the outset. We already know that either A=B/2 or A=2B, so either B=2A or B=A/2 (i.e. vice versa). Why didn't we use that given knowledge yet? And, in using only one small subset of our knowledge and neglecting the rest, we got the obviously strange result of E(B)=5A/4, meaning that E(A)=4B/5. So we see at first glance that this, if valid for B, must of course be valid for A also, so E(A)=5B/4, meaning that E(B)=4A/5. Why forget about our given knowledge? Comparing these two aberrant unilateral results we see that neither of them can be a reliable result, but that both of them are only very approximate results, as both of them deviate symmetrically from the middle. And we can see the said "middle" of both also: the geometric mean of 5/4 and 4/5 is exactly "1A" resp. exactly "1B". This middle can be supposed to be the exact result. No problem at all to do that. And we can imagine that insufficient priors, in not having used all of our knowledge, could be the reason for those symmetrical deviations. And we could profit from our additional knowledge that either A is 1/3 of the total amount or A is 2/3 of the total amount, which means that B is either 2/3 or 1/3 of the total amount, i.e. that exactly the same applies to A and vice versa applies to B. So for both envelopes E(B)=E(A)=(2/3 + 1/3)/2 = 1/2 of the total amount.
As a result, both envelopes are to be expected to contain the same value of 1/2 of the total amount. So A:B = B:A = 1:1, exactly the same result as "the middle" before. Where is the paradox? If we do not infringe the Bayes' rule, there is no paradox. And if we do, we have to expect approximate results. That's a long known fact, but this never was a "paradox" at all, that's just normal. Gerhardvalentin (talk) 15:52, 22 December 2011 (UTC)
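The contrast drawn above between the symmetric setup and the AliBaba-variant can be checked numerically. The following is a rough simulation sketch, not part of the original argument: the function names, the unit amounts and the number of rounds are all assumptions chosen for illustration. It compares the total money ending up in B with the total in A over many independent games.

```python
import random

def symmetric_game(rng, x=1.0):
    """Two envelopes hold x and 2x; the player picks one at random as A."""
    pair = [x, 2 * x]
    rng.shuffle(pair)
    return pair[0], pair[1]          # (A, B)

def ali_baba_game(rng, a=1.0):
    """Envelope A is filled first; B then gets 2A or A/2 on a fair coin."""
    b = 2 * a if rng.random() < 0.5 else a / 2
    return a, b

def mean_ratio(game, rounds=100_000, seed=0):
    """Return (total in B) / (total in A) over many independent rounds."""
    rng = random.Random(seed)
    total_a = total_b = 0.0
    for _ in range(rounds):
        a, b = game(rng)
        total_a += a
        total_b += b
    return total_b / total_a

print(mean_ratio(symmetric_game))   # close to 1.0
print(mean_ratio(ali_baba_game))    # close to 1.25
```

In this sketch the factor 5/4 shows up only when A is filled first and B is derived from it; with a random pick from a fixed pair the two totals agree up to sampling noise.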


 * So in which step is the error made? Martin Hogbin (talk) 17:27, 22 December 2011 (UTC)
 * The "error" obviously is the one-sided view starting in step 1 and in (from step 1 to step 8) forgetting that there are TWO interdependent envelopes. Gerhardvalentin (talk) 18:57, 22 December 2011 (UTC)
 * But what is wrong with saying, 'Denote by A the amount in the selected envelope.'? Martin Hogbin (talk) 23:27, 22 December 2011 (UTC)


 * Read what I said above. Step 1 is starting a desperate asymmetric meander in fixing the amount contained in envelope A, as one first step to – as an unfounded conclusion – determine the amount contained in "envelope B" to either be 2"fixed"A (in case that the fixed amount of A should be the smaller amount and pretending herewith that any possible gain will be the double of any possible loss) or to be "fixed"A/2 (in case that the fixed amount of A should be the larger amount, and pretending herewith that any possible loss will only be the half of any possible gain), and in that way unfounded also to determine the magnitude of the total contents of both envelopes "(A+B)", without admitting this, and so forcing the total of "(A+B)" to depend on the "fixed amount of A". Without admitting that asymmetry. That's the real sin.
 * For although A as well as B in effect both depend on the magnitude of (A+B), step 1 pretends the contents of envelope A to be a fixed amount, and as a consequence in the following steps B has to be either 2A or A/2, pretending that any possible gain is guaranteed to be double any possible loss, and any possible loss only half of any possible gain, as in the AliBaba-variant, where the contents of envelope A determine the contents of envelope B. Forgetting that, according to our "knowledge", what applies to A must also apply to B. This one-sided view creates asymmetry, from step 1 on, until step 8. For B could also be said to contain any "fixed amount", say 4, which in a one-sided view would then be the determining amount that is "forcing" envelope A to contain either a dependent 8 (possible gain=4) or a dependent 2 (possible loss=only 2), and at the same time determining the pretended "dependent total of (A+B)". Quite contrary to reality, where the possible gain and the possible loss can in any case only be exactly the same amount, just the difference between A and B in any "game" with two envelopes 1:2 or 2:1. In symmetry, the gain and the loss from swapping are the same amount: just the difference between A and B. The swapping argument in the TEP is a one-sided view only, delivering asymmetry in one direction or the other. But we do not know anything about "determining or dependent". Plainly nothing. So it is not useful to create "a fixed determining amount and dependent amounts". All we know is that there is a strict interdependency between those two envelopes. Anyone can see that interdependency, and that both amounts depend on the magnitude of their total amount (A+B), for either A is (A+B)/3, then B=2(A+B)/3, or A is 2(A+B)/3, then B=(A+B)/3. Both envelope A and envelope B depend on that total of (A+B).
 * In creating asymmetry, the result will only be valid in the case that the other envelope B has been filled afterwards, depending on the value of the determining envelope A, i.e. if there was real asymmetry just from the start. Gerhardvalentin (talk) 06:08, 23 December 2011 (UTC)

Again we start from step 1
You say, 'Step 1 is starting a desperate asymmetric meander in fixing the amount contained in envelope A'. If this is the case, in other words we assume the proposer intends 'A' to be a constant, then you are quite right, the paradox is immediately resolved because A cannot have two values. However, maybe the proposer is not as stupid as that. Maybe he intends 'A' to be a random variable, which can take on different values. Martin Hogbin (talk) 11:33, 23 December 2011 (UTC)


 * No Martin, the swapping argument in the TEP clearly says that by swapping you can expect to win the double amount that you could lose, and the swapping argument is adding "A plus A/4 to 5A/4." The swapping argument in step 9 even says that envelope A is to contain 5B/4, please read it. The swapping argument never says that you only can win just the difference of A and B, and it never says that you only can lose just the difference of A and B. The swapping argument, up to step 8, clearly presents a determining amount in envelope A and some dependent amount in envelope B, meaning that on average you will be gaining A/4. You can present another interpretation, and you should correct the mistake of the swapping argument. But until now, there is no "random variable", until now there clearly is the adding of A + A/4 = 5/4A. Gerhardvalentin (talk) 12:12, 23 December 2011 (UTC)
 * At the moment I am talking about just step 1. There is nothing wrong with that step alone if we take 'A' to be a random variable. Martin Hogbin (talk) 17:24, 23 December 2011 (UTC)
 * Okay and yes Martin, you can see it that way. But be careful: in the TEP, step 1 already is "indicating the direction" into the maze. Gerhardvalentin (talk) 17:39, 23 December 2011 (UTC)
 * But that is the whole point of the TEP, the errors in the proposed line of reasoning are quite hard to spot. However I think the first step is fine if you take 'A' to be a random variable. Can you identify exactly which step contains an error?


 * It is no good just saying that the whole argument seems wrong, we all know that, because it comes to an absurd conclusion. Martin Hogbin (talk) 14:16, 24 December 2011 (UTC)


 * I'm no mathematician: The variable "A" of step 1 can have, according to step 1, any possible value. But later A is used as a "given and known and fixed and invariable amount", in adding 2A/2 (if A is the smaller amount) + (A/2)/2 (for the case that A is the larger amount) to give 5A/4, as in the AliBaba variant. All I can say now is that step 2 is not wrong if it means to say "the smaller of the two amounts actually given" and "the larger of the two amounts actually given". Okay? Gerhardvalentin (talk) 15:25, 24 December 2011 (UTC)
 * Yes, I agree. In steps 6 and 7 the variable A is used improperly.  Exactly where the error lies depends on exactly what you think the proposer is trying to assert. Martin Hogbin (talk) 16:36, 24 December 2011 (UTC)
 * Yes Martin, used really improperly as for being valid "in general", but just valid only for the AliBaba-version. First of all, I'd like to say that it looks to be intended as a formidable and tremendous joke, like "is it weather under water, if you're there when it rains". Gerhardvalentin (talk) 18:07, 24 December 2011 (UTC)
 * Yes, I think you, Richard, and I all agree that the TEP is not really a mathematical paradox but more like a joke or a conjuring trick. Martin Hogbin (talk) 22:10, 24 December 2011 (UTC)
 * Exactly, but a really perfidious conjuring trick, always meaning quite the opposite of what we are seduced into thinking it says. I am going to compare, step by step, the statements and their permissible meaning, to show that misleading divergence. Will you please help me to do this in a correct way? Gerhardvalentin (talk) 19:23, 25 December 2011 (UTC)

Analyzing the TEP swapping argument from step 1 to 9
Martin, the swapping argument in the TEP is a very fascinating and tempting joke. And obviously that's quite what the TEP is intended to be. A formidable joke, provoking indignant disapproval, but stranded disapproval. Leaving philosophers as well as mathematicians behind, in just using a very small, unnoticed trick. There is no philosophical or mathematical "problem" at all, there is just a little trick, seducing to inattention.

The TEP is a marvelous joke. From step 1 to 5 it is valid in general, but from step 6 to 8 it only pretends to be valid in general while in fact being valid for the tricky AliBaba-variant only, and in step 9 it exploits anyone who has been taken in by steps 6 to 8, leading them by the nose even further in the last step.

The TEP is "valid in general" from step 1 to 5 only. Whereas from step 6 to 8, without saying so,  no more "valid in general", but valid for the very asymmetric AliBaba-variant only. From step 6 on, never ever "valid in general" any more, never valid any more in a symmetric world. Without saying so, suddenly valid only just for the extreme asymmetric AliBaba-variant, suddenly comprising some "given and fixed and determining amount" contained in envelope A, and some "dependent" amount contained in envelope B.

From step 6 on, no more "large A means small B", or "large B means small A", but suddenly: Irrevocably "newly fixed A determines and causes B" to either be twice the amount of that "newly fixed A" or to be 1/2 of that "newly fixed and determining A". So, from step 6 on, by swapping you no longer gain the same amount that could be lost or lose the same amount that could be gained, but suddenly gain double the amount that could be lost and lose only half the amount that could be gained, which just maps the AliBaba-variant, without saying so. Step 9 is added as a humorous, winking piece of nonsense.

Here is an attempt at clarification. For clarity, it is based on the assumption that the total amount of both envelopes [A+B] is just "3" (it could just as well be 12 or 300 or 333).

1.) TEP says: "I denote by A the amount in my selected envelope."
 * meaning: Involving also envelope B, in 1/2 of cases A will be "A (small) =1" and B will be "B (large) =2", and in the other 1/2 of cases A will be "A (large) =2" and B will be "B (small) =1".

2.) TEP says: "The probability that A is the smaller amount is 1/2, and that it is the larger amount is also 1/2."
 * meaning: The probability that A is the smaller amount of "A (small) =1" is 1/2, and that A is the larger amount of "A (large) =2" is also 1/2.

3.) TEP says: "The other envelope may contain either 2A or A/2"
 * meaning: The other envelope B will either be "B (large) =2" containing "2A (small) " only in the case that A is "A (small) =1" or, imagine the contents of both envelopes as swapped, in that latter case B will be "B (small) =1" containing "A (large) /2 = 1", but only in that latter case that A is "A (large) =2".

4.) TEP says: "If A is the smaller amount the other envelope contains 2A."
 * meaning: If A is the smaller amount of "A (small) =1" then B is "B (large) =2" containing 2"A (small) " (= 2). Please note that everything said about B is, vice versa, also valid for A.

5.) TEP says: "If A is the larger amount the other envelope contains A/2."
 * meaning: If A is the larger amount of "A (large) =2", then B is "B (small) =1" and contains "A (large) /2 = 1". Please note that everything said about B is, vice versa, also valid for A.

6.) TEP says: "Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2."
 * (From now on no more symmetry and no more gaining the same amount that could be lost and losing the same amount that could be gained, but suddenly gaining the double amount that could be lost and losing only half the amount that could be gained. Please consider that this is fully correct exclusively in the AliBaba-variant only, where a determining fixed A "forces" a dependent B to contain either 2A or A/2 of that "fixed A". Otherwise B is equally likely to contain twice the amount of a "small A" and half the amount of a "large A", as it is likely to contain twice the amount of a "large A" and half the amount of a "small A".)
 * meaning correct: Thus the other envelope either is "B (large) =2", containing 2"A (small) " (= 2) with probability 1/2, or, imagine the contents of both envelopes as swapped, only then B is "B (small) =1", containing "A (large) /2" (= 1) with probability 1/2. And everything said about "the one" envelope is at the same time, vice versa, also valid for "the other" envelope.


 * Note: Obviously here, in step 6, in no more comparing  "known to be given  1+2"  or  "2+1"  or  "x+y"  or  "y+x",  but in, without evidence and unfounded and "totally unseen" suddenly inventing the scale of "fixed determining 2 : dependent 4"  (where 4 only admissible if a "determining 2" forces some "dependent 4" to possibly exist)  and of "fixed determining 1 : dependent 1/2"  (where 1/2 only admissible if a "determining 1" forces some "dependent 1/2" to possibly exist) creates the pure "AliBaba-constellation" of some fixed "determining A : dependent B = 1 : 5/4",  that never will be valid in symmetry. After leaving the correct route that either one envelope is large what requires the other one to be small, or vice versa, but quite unfounded and unnoticed suddenly inventing the strange and false variant that envelope B "is known to" having been filled later, regulated by the "already given" fixed contents of envelope A, with either 2 fixed A or with A fixed /2 now the TEP is giving birth even to "2A large ", eventually provoking infinity, as well as to "A small /2",  that never might have been existing. Leading from the correct scale of  "(1  1/2) : (1  1/2)"  say  "1 : 1"  to the luring determining/dependent scale of  "1 : 5/4".  Very alluring, indeed, but valid only in the "explicitly to be known" extremely asymmetric AliBaba-variant,  but nowhere else in symmetry.
 * Note: only an "already fixed determining 2" can force any "dependent 4" to ever possibly exist, and only an "already fixed determining 1" can force any "dependent 1/2" to ever possibly exist. Otherwise no evidence whatsoever that "4" or "1/2" ever could be existing indeed. Excluding "infinity" as a reasonable expected content.

7.) TEP says: "So the expected value of the money in the other envelope is 1/2 x (2A) + 1/2 x (A/2) = 5A/4". (Please notice that this is fully correct only in the AliBaba-variant.)
 * meaning: As long as no additional information is available, the expected value of envelope B is 1/2 x [2"A (small) " (= 2)]  +  1/2 x ["A (large) /2" (= 1)] = (1/2 x 2) + (1/2 x 1) = 3/2 A. Please note that what is said about "the one" envelope is, vice versa, also valid at the same time for "the other" envelope, so vice versa the expected value of envelope A is 3/2 B also. Scale A : B = unchangeable 1 : 1.

8.) TEP says: "5A/4 is greater than A, so I gain on average by swapping.". (Please notice that this is fully correct only in the AliBaba-variant.)
 * This obviously is a wrong conclusion, as the amounts of A (small) =1 and A (large) =2 have been "added together" improperly to give "fixed 5A/4", leading to the AliBaba-variant. This conclusion is only valid in a very special "known" extreme asymmetry as in the AliBaba-version, but nowhere else in symmetry, and valid in one direction only, here "from A to B".

9.) TEP says: "After the switch, I can denote that content by B and reason in exactly the same manner as above."
 * This obviously is a plain joke, as the 5A/4 of some AliBaba-version works in one direction only, here "from A to B", whereas the view from B to A in that case gives E(A) = 4B/5.

Martin, please could you check this attempt of comparison. Regards, Gerhardvalentin (talk) 01:06, 26 December 2011 (UTC)
 * Gerhard, I am not qualified to check anything of yours but if you look at my solution, on this user page, you will find that it seems similar to yours. Steps 1-5 can all be understood to be correct in some way, but steps 6 and 7 are where it has to fail. Martin Hogbin (talk) 11:12, 26 December 2011 (UTC)
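For reference, the step 7 calculation can be written out side by side with the conditioned reading discussed above. This is only a worked sketch under the assumption used in this section that the total is 3, so the smaller amount is a_s = 1 and the larger is a_l = 2 (the subscripts follow the "s for smaller, l for larger" notation suggested earlier on this page).

```latex
\begin{align*}
\text{Step 7 as written (one symbol $A$ in both terms):}\quad
  & \tfrac12\,(2A) + \tfrac12\,\bigl(\tfrac{A}{2}\bigr) = \tfrac54 A \\
\text{Conditioned reading ($a_s = 1$, $a_l = 2$):}\quad
  & \tfrac12\,(2a_s) + \tfrac12\,\bigl(\tfrac{a_l}{2}\bigr)
    = \tfrac12\cdot 2 + \tfrac12\cdot 1 = \tfrac32 \\
\text{Expected content of envelope A:}\quad
  & \tfrac12\,a_s + \tfrac12\,a_l
    = \tfrac12\cdot 1 + \tfrac12\cdot 2 = \tfrac32
\end{align*}
```

In the second line the "A" that is doubled and the "A" that is halved are different amounts, which is why the two expectations come out equal and the scale A : B stays 1 : 1.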

The switching argument corrected, misleading ambiguity apart
Martin, please can you check this? Does this corrected version clearly spot the "misty mistake"?

1. I denote by "A" the amount in my selected envelope.
 * Why just only "A"?  –  Already preparing the "asymmetric joke"?  Better:  I denote by "A" the amount in my selected envelope, by "B" the amount in the unselected envelope, and by "(A+B)" the yet unknown, but from now on given and unchangeable total of these two strictly interdependent and interconvertible amounts.

2. The probability that "A" will be the smaller amount is 1/2, and that it is the larger amount is also 1/2.
 * Under the condition that "B" should be the larger amount of "2A", only then "A" can be the smaller amount. Otherwise not. But given the case that "B" should be the smaller amount, only then "A" can be the larger amount of "2B". But never otherwise. We may assume that both cases are equally likely  1/2 : 1/2.

3. The other envelope may contain either "2A" or "A/2".
 * Envelope "B" can and may contain "2A" solely under the strict condition that "A" is the smaller amount of "B/2", never otherwise. But given the case that "A" should be the larger amount, only then "B" can and may contain "A/2",  otherwise not.

4. If "A" is the smaller amount the other envelope contains "2A".
 * Under the condition that "A" should be the smaller amount, only then the other envelope can and will contain "2A", but never otherwise.

5. If "A" is the larger amount the other envelope contains "A/2".
 * Under the condition that "A" should be the larger amount, only then the other envelope can and will contain "A/2", but never otherwise.

6. Thus the other envelope contains "2A" with probability 1/2 and "A/2" with probability 1/2.
 * Ouch, that really hurts and sounds a little weird because it implies asymmetry. And saying so inevitably leads to appalling misreading. Better: Thus the other envelope can contain "2A" under the strict condition that "A" is the smaller amount of "B/2", but otherwise not. And the other envelope may contain "A/2" only in the case that "A" should be the larger amount, but never otherwise. We may assume that both cases are equally likely.

7. So the expected value of the money in the other envelope is "5A/4".
 * Ouch again, because saying so implies that this could possibly be valid "in general", but such a claim is a cheeky lie as it is contrary to the facts. This expected value of "5A/4" is only valid in the extremely asymmetric "AliBaba-variant" of Nalebuff, for example, where a strictly "determining A" is known to force a completely "dependent B" to contain, on average, exactly "5/4A", and only if just "that" envelope "A", known to be the determining envelope, is taken (not "selected"). Not selected at random from two indistinguishable envelopes as the TEP says; rather, the "known to be determining" envelope is taken and called "A". The same applies if you always put "double" the amount of A into the other envelope and exclusively take "that known to be determining envelope A": then by swapping to B you will surely get 2A in any case, of course. But in the TEP, in symmetry, you have to choose your envelope at random. And if you strictly put only 1/10 into the other envelope, and "take" (not "select") the envelope known to be determining, then by swapping to the other envelope you will surely get A/10 only. In any case, and you know that for sure. You can even put a hundred times the amount of some "determining" envelope into the other, dependent envelope. If you know exactly which envelope is the determining one and take it with intent, then by swapping to the envelope known to be dependent you will get 100A for sure. No doubt. But this is never valid in symmetry, when the envelopes are indistinguishable and your envelope is selected randomly, as in the TEP, where such extreme asymmetry can never be supposed nor ever be "well known", and the two envelopes are said to be "indistinguishable". In symmetry the scale of A:B on average will always be 1:1, that's a proven fact.
The TEP should have underlined that the asymmetric result of 5A/4 is correct in the "AliBaba-variant" only, where it is "known" which envelope is the determining one, but valid nowhere else. Never if both envelopes are indistinguishable.  Better: So the expected value of the money in the other envelope is "(2A (small) + A (large) ) /2)" and, in symmetry, the expected scale "A:B" on average will always be exactly "1:1"

8. This is greater than "A", so I gain on average by swapping.
 * Correct: This ("1A") is not greater than A.

9. After the switch, I can denote that content by B and reason in exactly the same manner as above.
 * Extreme nonsense-joke, because the scale "A:B"  of correctly  "1:5/4" in the AliBaba-variant means in return "B:A" to be  "1:4/5".

10. I will conclude that the most rational thing to do is to swap back again.
 * no more comment to this joke

11. To be rational, I will thus end up swapping envelopes indefinitely.
 * no more comment to that joke

12. As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.
 * no more comment at all.

Please Martin tell me what you think of this "corrected" version. Thank you, and regards, Gerhardvalentin (talk) 01:00, 30 December 2011 (UTC)

Discussion with iNic
Sorry to intervene, but A is the actual amount in the chosen envelope. Exactly as stated in the article. Why would that suddenly be more than one value? Each envelope has one value only. I would be very surprised if I saw two values in one envelope. iNic (talk) 03:55, 15 December 2011 (UTC)


 * iNic, you are welcome to join the discussion. 'A' is a symbol, a letter from the Latin alphabet.  Before we can decide what values it can or cannot take and what mathematical operations can be performed upon it we must decide on what kind of mathematical entity we think the proposer of the paradox intends 'A' to represent.  The remaining steps in the proposed argument cannot be critically examined if we are not clear on exactly what kind of mathematical entity 'A' is intended to be.


 * Are you proposing that 'A' is intended to be a constant as in 1c above? Martin Hogbin (talk) 10:32, 16 December 2011 (UTC)

Thank you! Well, if the use of Latin symbols such as A and B confuses you, please restate the whole argument without any symbols. That is easy. This shows that symbols are not essential for TEP at all. It follows that their status as being of one kind or another is totally irrelevant. iNic (talk) 16:20, 16 December 2011 (UTC)


 * It is not the symbols that are the problem; it is their meaning. The objective is to find the error in the given line of reasoning. This starts with 'Denote by A the amount in the selected envelope'. What exactly does the proposer mean by this? Martin Hogbin (talk) 16:36, 16 December 2011 (UTC)

What I meant was that to properly solve a problem you have to focus on what is relevant to the problem and ignore everything else. The use of symbols like Latin letters is obviously not central to the problem, as it is easy to state all the steps in the TEP reasoning without A's and B's:


 * 1) I take one of the two envelopes at random.
 * 2) The probability that I selected the smaller amount is 50% and that I selected the larger amount is also 50%.
 * 3) Due to the rules of the game the other envelope must contain either half or twice of what I have.
 * 4) If I have selected the smaller amount the other envelope must contain twice as much.
 * 5) If I have selected the larger amount the other envelope must contain half as much.
 * 6) It is thus a 50% chance that the other envelope contains twice of what I have, and 50% chance that it contains half of what I have.
 * 7) So if I switch, in half of the cases I will double my profit and in the other half of the cases I will get half of what I have.
 * 8) On average this gives me a net profit of 25%, so I gain on average by swapping.
 * 9) After the switch, I'm free to repeat the same reasoning again.
 * 10) I will conclude that it is profitable to swap back to my first envelope again.
 * 11) If this is rational, I will end up swapping envelopes indefinitely.
 * 12) But as it seems more rational to open just any envelope than to swap back and forth indefinitely, we have a contradiction.

To repeat: The Latin letters were introduced in an attempt to make the reasoning clearer. If they confuse more than they help they should be removed. iNic (talk) 01:57, 17 December 2011 (UTC)
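As a minimal sketch of the arithmetic in steps 6 to 8, the following compares the 25% figure with an exact enumeration for one fixed pair of envelopes. The pair (2, 4) and the function names are assumptions chosen for illustration; nothing here is claimed to be the resolution of the paradox, it only shows that the two calculations disagree.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def naive_relative_gain():
    """Step 8 as worded: half the time the amount doubles, half the time it halves."""
    return HALF * 2 + HALF * Fraction(1, 2) - 1

def exact_expectations(small):
    """Enumerate both equally likely picks for a fixed pair (small, 2 * small)."""
    large = 2 * small
    keep = HALF * small + HALF * large   # value of the envelope I picked
    swap = HALF * large + HALF * small   # value of the other envelope
    return keep, swap

print(naive_relative_gain())      # 1/4
keep, swap = exact_expectations(2)
print(keep == swap)               # True
```

For any fixed pair the expected value of keeping and of swapping coincide, so the 25% gain cannot be an unconditional average over the two picks.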


 * Firstly, you have elected not to resolve the paradox as proposed in the article, which starts, 'Denote by A the amount in the selected envelope' but to propose a paradox of your own. Even if you can resolve your paradox you will be faced with the task of showing the two to be equivalent.

I hope you have noticed that there are several different versions of TEP, both in the article and in the published literature. There is nothing holy about the wordings in the article. This is not a paradox of my own. We don't get a totally new paradox if some words are replaced with others that mean the same. You really think that this version is something different from the one with A's and B's? iNic (talk) 11:25, 19 December 2011 (UTC)
 * Yes, but we are writing an article that starts with a specific problem statement.


 * Secondly you still seem to be trying to use some form of mathematics but mixed in with a natural language description of the paradox.

Mathematics can only be done using a natural language. Didn't you know that? The use of mixed-in symbols, however, is optional. iNic (talk) 11:25, 19 December 2011 (UTC)
 * Yes, I agree but mathematics can save a lot of words.


 * The biggest problem, however, is that natural language is very poor at describing complex situations precisely. That is why we have courts and lawyers who can make a living by arguing about what words mean.  By choosing to frame the question in natural language, every word is open to question and examination as to its precise meaning.   The advantage of mathematical symbols and notation is that they encapsulate precise and agreed notions.

Wow, don't you know that the early probabilists tried to put probability theory in use for designing the perfect judicial system? It failed big time. No one has tried to do anything similar ever since. And that for good reasons. iNic (talk) 11:25, 19 December 2011 (UTC)


 * So, to start, what exactly do you mean by, '50% chance that the other envelope contains twice of what I have'? Under exactly what circumstances is this true?  Martin Hogbin (talk) 09:43, 17 December 2011 (UTC)

What this says is this. Let's say that I have 512 monetary units in my selected envelope. The other envelope then has either 256 monetary units or 1024 monetary units. As I picked my envelope at random and have no reason to think that any number is more likely than any other, it is equally likely that the other envelope contains 1024 MU as it is that it contains 256 MU. And equally likely is the same as a chance of 50% each. What do you think it means? iNic (talk) 11:25, 19 December 2011 (UTC)


 * It is still not clear what you mean. Do you mean that for every possible sum that you have in your envelope the probability that the other envelope holds 1/2 that sum is 1/2 and probability that the other envelope holds twice that sum is also 1/2, or are you saying that if you have 512 in your envelope the probability that the other envelope holds 256 is 1/2 and the probability the other envelope holds 1024 is 1/2? Martin Hogbin (talk) 19:26, 20 December 2011 (UTC)

Both. Your two interpretations are equivalent. iNic (talk) 00:25, 21 December 2011 (UTC)


 * If, for every possible sum that you have in your envelope the probability that the other envelope holds half that sum is 1/2 and probability that the other envelope holds twice that sum is also 1/2 then you should swap, there is no paradox.

There sure is a paradox if this is true whatever of the two envelopes you take, which is the case here. iNic (talk) 23:55, 21 December 2011 (UTC)


 * This situation can easily be arranged. Take an arbitrary (finite) sum of money, put it in an envelope and give it to the player. Then toss a fair coin and, according to the fall of the coin, put either half the sum or twice the sum in another envelope. The player then has a chance of 1/2 of getting half his original sum and a chance of 1/2 of getting twice his original sum by swapping. All the steps in the proposed argument for swapping hold good and the player should swap, once. However, this is not the TEP.
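The coin-toss arrangement just described can be sketched as follows, assuming the player's sum is some known value a (the function names are hypothetical). It shows why one swap is favourable in this set-up and why swapping back is not.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def expected_after_swap(a):
    """Second envelope is filled AFTER the first: a/2 or 2a on a fair coin toss."""
    a = Fraction(a)
    return HALF * (a / 2) + HALF * (2 * a)

def expected_gain_of_swapping_back(a):
    """Swapping back returns the original sum a with certainty."""
    return Fraction(a) - expected_after_swap(a)

print(expected_after_swap(8))             # 10
print(expected_gain_of_swapping_back(8))  # -2
```

Here the two envelopes are not symmetric: the first one fixes the scale and the second depends on it, so the argument for swapping does not repeat after the swap.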

Correct, this is not TEP. Why mention it at all? iNic (talk) 23:55, 21 December 2011 (UTC)


 * Unfortunately this situation cannot be arrived at by allowing the player to initially choose between two already filled envelopes (without the expectation in both envelopes being infinite, in which case whether you swap or not makes no difference). Martin Hogbin (talk) 19:12, 21 December 2011 (UTC)

Sure it can. You don't seem to have understood what Bayesian probability is all about. In Bayesian probability theory all that matters is your own ignorance, or degree of ignorance. As you don't know what the other envelope contains in the TEP case, there are only two possibilities, and you have no reason to believe in one amount more than the other, you typically assign both alternatives probability 1/2. This does not mean that in 50% of the cases one of the possible amounts will show up. These probabilities describe my credence, what I believe will happen given the information I have. Remember, the TEP offer happens only once so nothing is repeated anyway. The Bayesian principles do indeed lead to a paradox, as it doesn't matter which envelope you start with (of the two): the principles say that it's rational to trade for the other one no matter which one you picked from the beginning. You (as well as a great many authors of published articles) think that a Bayesian probability assessment can be valid only if we can imagine some mechanism that creates the Bayesian probabilities but now as frequentist probabilities; limits of relative frequencies over time. Well, that is of course nonsense. If that were the case there would be no need to invent Bayesian probability in the first place. According to Bayesian philosophy, different persons, even having exactly the same amount of information about a situation, will typically assign different Bayesian probabilities to the same event or statement. This is completely natural and nothing strange at all in the Bayesian setting, but it is very strange or odd if we talk about frequentist probability. There are only two things that are really forbidden in Bayesian probability, and that is to not use Bayes' rule when updating your set of beliefs or to be caught being irrational. (Many Bayesians think that these two sins are really the same sin.)
So the strange part in the TEP line of reasoning for the Bayesian is not step 6 or 7 but the last four steps when the Bayesian reasoning leads to seemingly irrational behavior. This is the contradiction the Bayesian has to solve. This is what TEP is all about. iNic (talk) 23:55, 21 December 2011 (UTC)


 * I am well aware of the Bayesian and frequentist models of probability, in fact I would describe myself as a strong Bayesian. When the two approaches give different answers to the same problem this is usually (always?) because all the aspects of the problem have not been properly expressed in one or both of the formulations; both methods should give the same answer to any given problem.


 * Speaking then as one Bayesian to another, can we start by considering a frequentist approach to the problem. We need to set up a situation in which there are two envelopes containing money, one containing twice the sum in the other.  This is easily done but there is no way to do this such that the probability of picking the larger sum is independent of the sum in the envelope that you pick, unless we start with a distribution that has an infinite expectation.  Do you agree? Martin Hogbin (talk) 10:29, 22 December 2011 (UTC)

Did I say I'm a Bayesian? I don't think so. Anyway, that is totally beside the point. As TEP is a unique situation there is no frequentist approach. Frequentists can't apply probabilities to single events as Bayesians can. It is thus not the case that the two methods always give the same answer to any given problem. That can happen only in rare cases. Often Bayesians can give an answer to a problem when frequentists can't. The differences are fundamental. It astonishes me that you don't know this when you say that you are well aware of the differences between the philosophies.


 * The question does not actually make clear whether this is a one-off event or whether it is to be repeated but we can certainly envisage the situation being repeated many times so that a frequentist approach can be used. When set up with care a frequentist approach should give exactly the same answer as a Bayesian one, but if you do not want to take a frequentist approach to this problem that is fine with me. Martin Hogbin (talk) 12:55, 22 December 2011 (UTC)
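A repeated version of the situation can be sketched as a simulation, but only by adding assumptions that the problem statement does not give. In the sketch below the smaller amount is drawn uniformly from 1 to 100 (a purely hypothetical choice), the player picks one of the two filled envelopes at random, and the average holding is compared for always keeping and always swapping.

```python
import random

def simulate(n_games, max_small=100, seed=0):
    """Average amount held when always keeping vs always swapping.

    Assumption (not in the problem statement): the smaller amount is
    uniform on 1..max_small and the envelope pick is a fair shuffle.
    """
    rng = random.Random(seed)
    keep_total = 0
    swap_total = 0
    for _ in range(n_games):
        small = rng.randint(1, max_small)
        envelopes = [small, 2 * small]
        rng.shuffle(envelopes)            # random pick between two filled envelopes
        mine, other = envelopes
        keep_total += mine
        swap_total += other
    return keep_total / n_games, swap_total / n_games

keep_avg, swap_avg = simulate(100_000)
print(keep_avg, swap_avg)
```

Under these added assumptions the two averages come out close to each other, with no sign of a 25% advantage for swapping. Whether such a repeated set-up is still the same problem is exactly the point under dispute in this thread.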

This is definitely a once in a lifetime situation to get money for free. No information about how to repeat the situation is given. All probabilities are inferred by symmetry arguments, not frequency arguments. If you invent a way to repeat it you for sure need to add more assumptions on how to do that. But as you do that you are starting to solve some other problem than the original TEP. This is very common in the literature. Authors don't know how to solve TEP so they add some additional set of assumptions S. Then they solve TEP+S by finding strange things in S. But TEP without S is still unsolved. iNic (talk) 13:33, 22 December 2011 (UTC)

Bayesian approach
So what now about a Bayesian approach to the problem? The WP version of the problem starts with, ''Let us say you are given two indistinguishable envelopes, each of which contains a positive sum of money. One envelope contains twice as much as the other. '' What are we to make of this? What information do we take it that this statement gives us? Interpretation of this statement is the first step in any Bayesian solution. We can say that the probability that we will pick the larger sum is 1/2 but that is about all without further interpretation.

If we put our minds to it we can think of all sorts of bizarre interpretations. Is the sum of money a real number? A finite number? Are we assumed to always desire to have more money? Are there limits on what the sums might be? Until you answer these questions there is not even a paradox to resolve. How do you propose that we answer these questions? Martin Hogbin (talk) 10:29, 22 December 2011 (UTC)


 * Sure go ahead and imagine bizarre interpretations if it helps you to solve the problem. If it doesn't help you I would recommend you to stick to as natural assumptions as possible. If you don't believe it's possible to play this game for real I would love to offer it to you. I offered it to Gill quite some time ago but he hasn't accepted the offer yet. So I guess we have here an example of the bizarre case you mentioned, that we can't always assume that otherwise rational people desire to have more money. But how about you? Are you always rational? iNic (talk) 11:32, 22 December 2011 (UTC)

My point was simply that until you have decided exactly how to interpret the problem there is no paradox to resolve. I am perfectly happy to stick to natural assumptions. So here are my natural assumptions:
 * 1) The sums that might be in the envelopes are real, finite, and bounded.
 * 2) We assume that, regardless of the amount, the player wishes to maximise the money that they will receive.
 * 3) We assume that the player is not risk-averse and will act upon their estimation of the expectation in the other envelope.

Do you agree so far? Martin Hogbin (talk) 12:55, 22 December 2011 (UTC)


 * Sure, we can play such a game if you want. Want to play? iNic (talk) 13:13, 22 December 2011 (UTC)

Are you saying that, with the above assumptions, for every possible sum that you have in your envelope the probability that the other envelope holds half that sum is 1/2 and the probability that the other envelope holds twice that sum is also 1/2? Martin Hogbin (talk) 14:14, 22 December 2011 (UTC)


 * Yes. But please keep in mind that the probabilities are of the bayesian type. As you are a hard core Bayesian this should not be a problem, right? iNic (talk)

Your claim cannot be true. The player assumes the sums in the envelopes to be bounded (as agreed above) so he knows that if the sum in his chosen envelope is over half the maximum possible value the probability of doubling his money by switching is 0. Martin Hogbin (talk) 17:24, 22 December 2011 (UTC)


 * You are "the player" right? Please talk about yourself in terms of "I" and "me" instead. Nice strategy you have there, but which amount is the maximum possible value according to you? All the money on earth? How much is that? Or perhaps all gold in the universe? How much is that? Even if you work out your own philosophy regarding "the maximum amount" I'm sure you won't be able to put it into use when we play for real. So do you want to play for real or not? I'm literally giving away money. If you still don't want to play you are violating your second natural assumption above. How can you do that? iNic (talk) 01:59, 23 December 2011 (UTC)

OK, I am the player. I believe that there must be an upper bound on the sums that can be in the envelopes but I do not know what that upper bound is. I cannot therefore make the assertion that, 'for every possible sum that I have in my envelope the probability that the other envelope holds half that sum is 1/2 and probability that the other envelope holds twice that sum is also 1/2' because I know that there will be some sums for which the probability that I will double my money by switching will be 0. I do not know what these sums are but I know that they must exist and that I could possibly be holding such a sum. Martin Hogbin (talk) 11:20, 23 December 2011 (UTC)
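The point about bounded sums can be illustrated with a small exact calculation. The prior used here, with the smaller amount uniform on {1, 2, 4, 8}, is an assumption made only for the example; any bounded prior produces some largest possible holding for which doubling is impossible.

```python
from collections import defaultdict
from fractions import Fraction

def conditional_double_probs(smaller_amounts):
    """P(other envelope holds twice mine | mine = a) for each possible a.

    Assumption: the smaller amount is uniform on smaller_amounts and the
    player picks either envelope with probability 1/2.
    """
    # a -> [P(mine = a and other = 2a), P(mine = a and other = a/2)]
    joint = defaultdict(lambda: [Fraction(0), Fraction(0)])
    p_pair = Fraction(1, len(smaller_amounts))
    for s in smaller_amounts:
        joint[s][0] += p_pair / 2        # I hold the smaller amount s
        joint[2 * s][1] += p_pair / 2    # I hold the larger amount 2s
    return {a: up / (up + down) for a, (up, down) in sorted(joint.items())}

for amount, p in conditional_double_probs([1, 2, 4, 8]).items():
    print(amount, p)
```

With this prior the probability of doubling is 1 when holding 1, 1/2 when holding 2, 4 or 8, and 0 when holding 16, so the statement "1/2 for every possible sum" fails at the ends of the range.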


 * Don't worry about that. I will make sure that won't happen. So do you want to play or not? If you don't want to play you are bizarre according to your own definitions, as you violate your own natural assumption number two above. iNic (talk) 12:14, 23 December 2011 (UTC)

I have no idea what you mean by 'do I want to play or not?'. We are talking about resolving a paradox proposed by several people in which a seemingly logical line of argument leads to an absurd conclusion. The objective is not to show that you should not swap but to find the flaw in the proposed line of reasoning, thus resolving the paradox. Martin Hogbin (talk) 17:22, 23 December 2011 (UTC)


 * I mean if you want to play TEP or not. Basically you will get money for free. You will get whatever amount there is in the envelope you open. You really find this offer hard to understand? Unbelievable if you ask me. Small kids would understand this offer. How old are you? iNic (talk) 18:29, 23 December 2011 (UTC)

No, of course I do not want to 'play the TEP'! Why would I want to do that? How do you play a paradox? Martin Hogbin (talk) 20:37, 23 December 2011 (UTC)


 * This is what I wanted to show you. But if you don't want to you will never know. Please remove assumption number 2 from your list of natural assumptions. You can't believe a general principle to be natural and valid for everyone if it doesn't apply to yourself. iNic (talk) 20:59, 23 December 2011 (UTC)

I still have no idea what you are talking about. What does 'play the TEP' mean? What exactly are you asking me to do? Martin Hogbin (talk) 22:28, 23 December 2011 (UTC)


 * You only have to say 'yes' and then follow my instructions. iNic (talk)

I would be happy to play your game if it will serve some useful purpose or elucidate some point of disagreement between us. Perhaps you could give a clue as to the point that you are trying to make. Martin Hogbin (talk) 13:31, 24 December 2011 (UTC)


 * OK so maximizing monetary return apparently doesn't give you motivation enough for playing the game. That's interesting in itself. Your natural assumptions maybe aren't that natural after all... You said above that my claim can't be true, that it's possible to play the game. So I'm now trying to show you that it is indeed possible to play it. Is this interesting enough for you to engage in a play, now when money can't make you interested? iNic (talk)

I said your claim cannot be true but I did not say that it is not possible to play the 'game' because I have no idea what 'the game' is or what you are trying to demonstrate with it. Please just tell me, it would be much easier. Martin Hogbin (talk) 10:21, 26 December 2011 (UTC)


 * It can only be shown by playing it, which is very easy by the way. But if you don't want you don't want. iNic (talk) 22:24, 26 December 2011 (UTC)

Just tell me what it is you are trying to prove. Martin Hogbin (talk) 00:42, 27 December 2011 (UTC)


 * That you are wrong. iNic (talk) 12:26, 27 December 2011 (UTC)

About what? Martin Hogbin (talk) 15:34, 27 December 2011 (UTC)
 * Ah never mind. This is getting silly. We have entered an infinite loop. Obviously you are not interested in playing the game so let's just forget it. iNic (talk) 03:17, 28 December 2011 (UTC)

The TEP in a nutshell
Three young chaps go into a restaurant. They buy a meal for £30, so they pay £10 each and then leave.

The manager then realises they have been over-charged by £5, so sends the waiter after them with the money. The waiter keeps £2 as a tip, and gives each chap £1 change.

That means they each paid £9. 3 x £9 = £27. The waiter's tip is £2, making £29, so where has the other £1 gone? Martin Hogbin (talk) 11:18, 26 December 2011 (UTC)


 * Perfect Martin, this old joke looks like exactly the same method of reasoning as the swapping argument in TEP. Btw, the correct total price was £25, wasn't it? But the three guys together in effect paid more than £25, for each of them paid £9, so together £27, and from this total of £27 the waiter kept £2 as a tip, while the manager of the restaurant got (or rather kept) the remaining £25, exactly the amount he was owed. If I remember right, that was the answer to that joke, okay? As said, looks just like steps 6–9 of the TEP. Regards, Gerhardvalentin (talk) 14:16, 26 December 2011 (UTC)
 * Yes, the whole puzzle is essentially a non-sequitur dressed up to look like a logical argument; just like the TEP. Martin Hogbin (talk) 10:03, 27 December 2011 (UTC)
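The bookkeeping in the restaurant puzzle can be written out as a short check, using only the figures given above (the function and key names are made up for the example). The tip is part of the £27 the diners paid, so adding it to £27 counts it twice.

```python
def restaurant_ledger():
    """Follow the money in the restaurant puzzle, all amounts in pounds."""
    handed_over = 3 * 10          # what the three diners first paid
    correct_bill = 30 - 5         # the manager keeps this
    tip = 2                       # the waiter keeps this
    refund = 3 * 1                # returned to the diners
    net_paid = handed_over - refund
    return {
        "net_paid": net_paid,
        "bill_plus_tip": correct_bill + tip,
        "all_money": correct_bill + tip + refund,
        "puzzle_sum": net_paid + tip,   # the misleading 27 + 2
    }

ledger = restaurant_ledger()
print(ledger["net_paid"] == ledger["bill_plus_tip"])  # True
print(ledger["all_money"])                            # 30
print(ledger["puzzle_sum"])                           # 29
```

The £29 figure corresponds to no real quantity: the meaningful identities are 27 = 25 + 2 and 30 = 25 + 2 + 3.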