Talk:Two envelopes problem/Arguments/Archive 5

Richard's change to the problem in the lead
Richard you have changed the problem statement in the lead to:

Of two indistinguishable envelopes, each containing money, one contains twice as much as the other. The subject may pick one envelope and keep the money it contains. Having chosen an envelope at will, but before inspecting it, the subject gets the chance to take the other envelope instead.

Problem: what is the optimal rational strategy for maximising the amount of money to be gained?

As stated there is no problem, that is what I meant when we talked about the TEP being a self inflicted injury.

The optimum strategy for your problem is:

1) Keep your envelope.

2) Thank the envelope provider for the money.

I do not think anyone seriously challenges this. Martin Hogbin (talk) 13:41, 8 November 2014 (UTC)


 * Indeed. The TEP paradox is not to solve the problem formulated in the lead but to show what is wrong with the argument for switching, which only came later. I did not touch the lead, I did not even look at it. I tackled the introduction, which came later. Weird. Whoever set the article up like that had absolutely no clue what TEP is about. I made necessary changes to the intro but I missed the lead. Richard Gill (talk) 14:11, 8 November 2014 (UTC)


 * I fixed the lead. Better now? Richard Gill (talk) 14:17, 8 November 2014 (UTC)


 * Sorry Richard, I noticed that you had done a lot of editing recently and assumed that you had made the change, which surprised me somewhat; I thought you had gone a bit crazy. I should have checked. Once again my apologies.


 * It is better now but I think it was better how it was before. I think we should sort out the body of the article first and then make the lead a summary of the article as it should be. Martin Hogbin (talk) 01:59, 9 November 2014 (UTC)
 * PS. I think I also saw this comment by you, 'The Introduction does not introduce the Problem. (did not! - I have changed it)'. Which made me think you had made the above change. Martin Hogbin (talk) 12:29, 9 November 2014 (UTC)
 * I think one does this work cyclically. Now it is time to go back to the body of the article and improve that. Then it will be time to go back to the lead. And so on ad infinitum. But it is important to establish that the article *starts* with some kind of canonical TEP.
 * Canonical TEP: There is no looking in envelopes. There is no information about possible amounts of money in the envelopes. The envelopes are filled with two different, positive amounts, one twice the other. You are allowed to pick an envelope completely at random. You do. You are *not* allowed to look in it. You *are* asked if you would prefer to have the other.


 * I agree. It would be good to describe that version as the canonical version if we have a source saying that. I would use a different word though 'formal/official/standard'. Martin Hogbin (talk) 12:22, 9 November 2014 (UTC)
 * The present article refers to a paper by Ruma Falk for the statement of the core / basic / standard TEP. You could read her paper and find out why she takes it as her starting point. Richard Gill (talk) 15:09, 9 November 2014 (UTC)
 * Richard, the description in the 2008 Falk paper is taken from Wikipedia! Do we have a source that does not refer back to WP? Martin Hogbin (talk) 19:35, 9 November 2014 (UTC)
 * Quite a few people (including Nalebuff) refer to Sandy Zabell, a Bayesian statistician, who has the Bayesian solution (of course) and says that he finds it hard to explain to non-mathematicians. Of course, Nalebuff did introduce his own new Ali-Baba game but he does refer to Zabell. Zabell thinks of both players looking in their envelopes and both desiring to switch. Nalebuff mentions that they do not even have to look in their envelopes to follow the (false) conditional probability reasoning and decide to switch, because they would wish to switch whatever they saw there. Falk (2008) says explicitly that she discusses Wikipedia's problem and Wikipedia's first solution. Nickerson and Falk (2006) studies a whole range of variations including our "canonical" TEP. They mention that the switching argument does not require one to look in the envelope, and they do know the Bayesian resolutions. Richard Gill (talk) 11:56, 14 November 2014 (UTC)


 * Just like Monty Hall problem, this is a briefly sketched "real world" situation. We have to make a decision. There is some built-in randomness in the problem - picking an envelope; there is also uncertainty or lack of knowledge or ignorance in the problem (what amounts were in the two envelopes to start with?). Informally, we use "probability" both for physical randomness and for uncertainty (we mix ontological and epistemological concepts of probability). As the solution to the necktie paradox makes clear, we have to distinguish between (a) the actual, unknown, amounts in the two envelopes; a and b; which are either equal to x and y = 2x or to y and x respectively, and which never change throughout the whole game, but which are not known; and (b) our knowledge about these amounts. Our knowledge about these amounts changes whenever we get some information. We can also *imagine* being given information (e.g. "there is $20 in Envelope A") and we can imagine how this changes our knowledge.
 * Whatever else is going on in the paradoxical (false) argument to switch envelopes, it is clear that (a) and (b) are being mixed up. *If* we want to "solve" the paradox by casting it in the language of the probability calculus (mathematics), which works just as well for ontological as for epistemological randomness as for a combo, the first thing we should do is to introduce mathematical notation which makes the distinction between (a) and (b) clear. Conventionally, this is done by using capital letters A, B, X, Y to stand for "random variables" and lower case letters a, b, x, y to stand for possible values which they can take.
 * The problem is a conjuring trick: we should not ask what was in the writer's mind when he gives the bogus line of argument for swapping, but what form of misdirection he is trying to create in the mind of the audience. I still find it hard to believe that the presenter is trying to put in our minds a mangled calculation of E(B). We are specifically told about a (speculated) sum in the original envelope, so I think it is much more likely that he is trying to confuse us with a muddled calculation of E(B|A=a). Are there any sources, apart from you, that state that the given line of reasoning for swapping is intended to be a calculation of E(B)? Martin Hogbin (talk) 12:22, 9 November 2014 (UTC)
 * It wasn't my idea to suppose that the presenter is trying to put in our minds a mangled calculation of E(B). In my paper I call it the philosophers' solution because I came across it in Schwitzgebel and Dever (2007?) and a whole lot more articles in philosophy journals. Of course these people use pages and pages of words, and tend to mangle all their attempts at mathematics, in particular through inadequate notations. And they don't survey all the literature but just focus on a little corner of it. So they don't even realise that they are interpreting the problem differently from how mathematicians and statisticians had, earlier, interpreted it. But not all mathematicians take the same approach, either. For instance Ruma Falk, who works on the pedagogics of teaching elementary probability, also has a paper in which she uses the philosophers' interpretation (and another one where she takes the "mathematician's interpretation"). Richard Gill (talk) 15:06, 9 November 2014 (UTC)


 * I think it is wrong to call the (presently) second solution a "Bayesian solution". It does not depend on using subjective probability. It works just as well for frequentist probability. It is nothing to do with Bayesianism. It does have to do with conditional probability and conditional expectation value, but that was already the case in the (presently) first presented solution.


 * I agree, in fact I think this should just be presented as the solution. The other one should be presented as popular/philosophical solution.


 * Right now the second solution is presented in a much too complicated way because the writer is thinking of a whole lot of variations at the same time (e.g. variations like Nalebuff's in which one does look in the envelope). The "Bayesian" solution of the canonical problem ought to be presented first, and after that, one can start moving on to variations of the problem and to new variations on the solution. Richard Gill (talk) 09:08, 9 November 2014 (UTC)


 * I agree again. One remaining problem is how to present a simple mathematically sound but non-mathematical description of that resolution.  Martin Hogbin (talk) 12:22, 9 November 2014 (UTC)


 * An interesting quote from Gardner is, 'If we construct a payoff matrix as Kraitchik does in his book we see that the game is symmetrical and does not favor either player. Unfortunately, this does not tell us what is wrong with the reasoning of the two players.  We have been unable to find a way to make this clear in any simple manner'.


 * There is also a hint that Gardner had the E(B|A=a) calculation in mind as he says, 'Does this paradox arise because each player wrongly assumes his chances of winning or losing are equal?'.

Martin Hogbin (talk) 13:19, 9 November 2014 (UTC)


 * It is easy to explain if you can use standard probability calculus notation and if the reader understands conditional expectation (and the relation between conditional and unconditional). The reader who doesn't have this background should not be reading deeper into this article. They will have to be contented with a verbal summary, as in the two neckties. It is a verbal summary which covers both of the first two interpretations. The same mixup is being made, either way. It is not wikipedia's job to explain basic elementary probability theory to its readers ... at least, not in an article which is not devoted explicitly to that topic. Richard Gill (talk) 15:14, 9 November 2014 (UTC)
 * Richard, I think a verbal summary which covers both interpretations is no summary at all but just confused verbiage. I think we should try to explain the E(B|A=a) resolution in words that most readers can understand.
 * Assuming the possible sum in the envelopes is bounded seems a good place to start to me and is exactly the situation described by Gardner. He says, 'We may assume that each player has a random amount of money from zero to any specified amount, say $100'.  Martin Hogbin (talk) 17:02, 9 November 2014 (UTC)

Or A can be the "large amount" of 2(A+B)/3, then the "small B" will be (A+B)/3. In any "ONE" pair of envelopes, even if the value of A and the value of B both are unknown, their total is unknown but UNCHANGEABLE. The same applies to the actual total amount of (A+B) in case that A is known (no difference). But the problematic formula (valid only for AliBaba) mixes two different "games" of TWO possible pairs of envelopes and treats both as if they were one single pair of envelopes. Term 1 addresses pair X and Term 2 addresses pair Y. Although any single pair of envelopes has got a given total amount of "A"+B, the formula uses two different total amounts of "A"+B large (of game X) and another of "A"+B small (of game Y), i.e. two different values of B, resulting in a given total amount representing "3A" and, at the same time, in a given total amount representing "3A/2". Saying it is only ONE pair of envelopes. This is correct for the AliBaba variant, but weird for the standard version of the TEP. – Btw, the actual amount of A is unimportant. (I hope I made no typos.) See my "picture". Gerhardvalentin (talk) 18:41, 9 November 2014 (UTC)
 * The article should say what the sources say. The formula $${1 \over 2} (2A) + {1 \over 2} \left({A \over 2}\right) = {5 \over 4}A$$  (improper for the standard problem) is only an incomplete fragment that perforce leads to false conclusions. As per the sources, this mistake is adding apples and oranges, because the first term $${1 \over 2} (2A)$$ is based on the claim that actually (A+B)=3A, whereas the second term $${1 \over 2} \left({A \over 2}\right)$$ is based on the claim that actually (A+B) = 3A/2, a total only half that of the first term. So this formula is addressing two quite different "pairs of envelopes" with quite different total amounts of (A+B). Why not clearly say what the sources say? Gerhardvalentin (talk) 16:30, 9 November 2014 (UTC)
 * Gerhard, suppose you open your envelope and see £100. Surely A is then 100.  The other envelope must then contain either £50 or £200.   What is wrong with the proposed calculation now? Martin Hogbin (talk) 17:02, 9 November 2014 (UTC)
 * Thank you. Please consider that, in the standard problem, A can either be the "small amount" of (A+B)/3, then "large B" is 2(A+B)/3.
 * In my example A is £100. It is not unknown, it is fixed at £100. The other envelope may therefore contain £50 or £200. Do you agree? Martin Hogbin (talk) 19:21, 9 November 2014 (UTC)
 * Gerhard, you are wrong. The problem is not the amounts but the probabilities, if one interprets the aim of the argument as being to calculate E(B | A = a). Richard Gill (talk) 15:40, 10 November 2014 (UTC)
 * Richard, you are still wrong. See what I told you on your talk page 3 years ago, in 2011. It is chastely on probability, and in this case we have to be very careful. As certified official comptroller of Swiss Compensation Fund I know what I am talking about. We may not be perfunctory in this case. Consider a large amount of investigation. Yes, in 50% the other envelope contains twice, and in another 50% it contains half. But if you investigate only a set of "1/2 of ALL cases" without constriction, then within this set, again in only 50% of this set the other envelope will contain twice, but within this set the other 50% will contain half, and the result of your investigation will be faulty. This is the flaw of the 5/4-formula. Please be aware that it is not effectual to investigate in an undiscerning way only "1/2 of ALL cases", but that it is absolutely necessary to investigate on the one hand explicitly only THAT subset of cases where the other envelope INDEED contains twice (50%, yes), and to investigate on the other hand explicitly only THAT subset of cases where the other envelope INDEED contains half, again 50%. Otherwise you stick with the 5/4-formula that is flawed for the standard version of the TEP. So please be aware of this subtle difference, of this subtle distinction. This is what we have to take into account. Gerhardvalentin (talk) 03:02, 11 November 2014 (UTC)
 * Gerhard you are wrong. Read the literature. You are reporting OR (original research) by you, and it is flawed. It is perfectly possible that the 5/4 formula is correct for almost all values "a" of the amount in Envelope A. It is a theorem that the formula cannot be correct for *all* values. If the amount in envelope A is "a", then the 5/4 formula is correct (for the conditional expectation of B given A = a) if and only if it was a priori equally likely that the two amounts are (a/2, a) and (a, 2a). Richard Gill (talk) 16:16, 11 November 2014 (UTC)
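The "if and only if" in the comment above can be checked numerically. The following is a minimal sketch, not taken from any of the sources: the function name `expected_other` and its two prior-weight parameters are made up for illustration, and it assumes the only candidate pairs are (a/2, a) and (a, 2a) with the envelope picked at random.

```python
from fractions import Fraction

def expected_other(a, prior_a_is_smaller, prior_a_is_larger):
    """E(B | A = a) when the pair is either (a, 2a) or (a/2, a).

    prior_a_is_smaller: prior weight of the pair (a, 2a)   (illustrative parameter)
    prior_a_is_larger:  prior weight of the pair (a/2, a)  (illustrative parameter)
    The envelope is picked at random, so the factor 1/2 for "which
    envelope we hold" cancels when forming the posterior.
    """
    a = Fraction(a)
    w_small = Fraction(prior_a_is_smaller)
    w_large = Fraction(prior_a_is_larger)
    # P(A is the smaller amount | A = a)
    p_small = w_small / (w_small + w_large)
    return p_small * 2 * a + (1 - p_small) * a / 2

print(expected_other(100, 1, 1))  # equal priors: 125, i.e. 5a/4
print(expected_other(100, 1, 2))  # (50, 100) twice as likely a priori: 100
print(expected_other(100, 0, 1))  # (100, 200) ruled out: 50
```

Since E(B | A = a) = a/2 + (3a/2) P(A is smaller | A = a), the value 5a/4 appears exactly when that posterior probability is 1/2, which under these assumptions means the two prior weights are equal.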

Interdependency
I'm reminded to what I read above, Richard answered me:
 * And BTW you say: "if A actually is the larger amount (2/3 of the total amount), it cannot be doubled but can only be halved, and if A actually is the smaller amount (1/3), it cannot be halved but only be doubled". You're saying that if A is the larger amount, it's the larger amount. So what?

Two unknown envelopes, of only ONE pair of envelopes. If A is fixed at £100, then please consider that the total amount of both actual envelopes is fixed, also. Either the other envelope contains £50, then both envelopes contain £150, with A being 2(A+B)/3 in this case. This is valid for ONE pair of envelopes - as said with a total (A+B) of £150. Or it is quite another pair of envelopes, where A also is fixed at £100, where the other envelope contains £200, then both envelopes contain £300, in this case A being only (A+B)/3. In the AliBaba version, both these scenarios are equally likely. But as to the standard problem, B can only be 2A iff A actually IS the smaller amount of (A+B)/3. Otherwise not. And as to the standard problem, B can only be A/2 iff A actually IS the larger amount of 2(A+B)/3. Otherwise not. This is mutually exclusive. Any "as well as" is only possible for two different pairs of envelopes.

Just to illustrate: Let's call the smaller of both amounts S, and the larger amount 2S, so the total amount T=3S. Suppose A=S (and B subsequently=2S) :).  Then the expected amount of envelope B is:

$${1 \over 2} ( 2S ) + {1 \over 2} \left({2S \over 2}\right) = {3 \over 2}S$$  = half of the total amount. So the expected value of envelope "B" will be half of the unchangeable total amount: $$ = {1 \over 2} (3S)$$. This looks rather simple, so I repeat my version with "2a=A", because it looks more similar to the 5/4-formula:

$${1 \over 2} ( 2a ) + {1 \over 2} \left({A \over 2}\right) = {3 \over 4}A$$ resp. $$ = {3 \over 2}a$$  = again half of the total amount.

So the expected value of envelope "B" will be again half of the total amount.

Conclusion: A may not reach two different values at the same time within the same single formula (sin!) and, at the same time, the total amount of (A+B) may not reach two different values within the same single formula. A and B are interchangeable, but their total has to be constant; otherwise it is weird, and only correct for AliBaba. Regards, Gerhardvalentin (talk) 20:57, 9 November 2014 (UTC)
 * Gerhard, you have not answered my simple question. If your envelope contains £100 then do you agree that the other envelope will contain either £50 or £200? Martin Hogbin (talk) 21:03, 9 November 2014 (UTC)
 * Martin, the answer is YES (but !!!). In a given pair of envelopes, if your envelope contains £100, then the other envelope in exactly 50% may contain £50, yes, BUT ONLY in those 50% where –at the same time – the actual total amount is £150. Repeat: only if the total amount actually is £150, only in THOSE 50%. Otherwise not.
 * And in a given pair of envelopes, if your envelope contains £100, then the other envelope in the other exactly 50% may contain £200, yes, BUT ONLY in those other 50% where the actual total amount is £300. Repeat: only if the total amount actually is £300, only in THOSE 50%. Otherwise not, please regard the (this) given inevitable inter-dependency, this strict coerciveness that applies to any single pair of envelopes. Any stubborn featherbrained "ignoring" of this evident interdependency can lead to post-hoc fallacies, just have a look to the literature.
 * A and B are only interchangeable.   In a given pair of envelopes, anything else is impossible. The total value of any actually given pair of envelopes is fixed and CANNOT be assimilated. It cannot be "adapted", this is utterly impossible, a complete impossibility, an oxymoron. In a given pair of envelopes, A and B are merely "interchangeable", fatto compiuto. Gerhardvalentin (talk) 21:24, 9 November 2014 (UTC)
 * Now you are getting at the heart of the problem. Gerhardvalentin, can you make a calculation of the expected return (profit or loss) without variables (a, A, S, etc.) and without prior beliefs, using the fact that one envelope contains £100? The same question goes to Richard Gill. Caramella1 (talk) 06:33, 10 November 2014 (UTC)
 * Caramella1, thank you for your question. But there is no need for calculations; the expected return (profit or loss) clearly and evidently is zero, in the long run. Gerhardvalentin (talk) 11:57, 10 November 2014 (UTC)

$${1 \over 2} ( 2a ) + {1 \over 2} \left({A \over 2}\right) = {3 \over 4}A$$ resp. $$ = {3 \over 2} a$$  = again half of the total amount.
 * Caramella1, of course I cannot make a calculation of expected return without specifying my expectations concerning what amounts could be in the two envelopes. But if you like I can give you the formula which you can use. You plug in your prior beliefs and then calculate. It's a very simple formula. If envelope A happens to contain £100 then we only need to know our prior beliefs that the pair of envelopes contain (£50, £100) vs (£100, £200). If our prior belief concerning those two possible cases is equal, then our posterior belief will be equal too. And the expected value of switching is £125. Richard Gill (talk) 15:54, 10 November 2014 (UTC)
 * Gerhardvalentin, can you prove what you say with a formula? Remember, no variables, no prior beliefs, only £100 in one envelope.
 * I'm just back, and yes, Caramella1, anybody can. With the correct formula, be it with positive amounts only, or be it with variables. See the correct formula above, with variables "A" and "a" (please note that a=A/2 resp. A=2a, though Richard Gill controverts, but anyway, even M treats it that way). This "correct" formula says that the expected amount in envelope B is equal to the assumed amount in envelope A, being "half" of the total amount contained in both envelopes. Agreed? Though it isn't "necessary" that we know the positive amount contained in envelope A, let's agree that the amount in envelope A is known to be 100 (of any currency, be it £ or $ or €):

So the expected value of envelope "B" will be again half of the total amount. And with positive amounts only: $${1 \over 2} ( 2 \times 100 ) + {1 \over 2} \left({2 \times 100 \over 2}\right) = {3 \over 4} (2 \times 100)$$ resp. $$ = {3 \over 2} \times 100$$  = again half of the total amount. In other words, in exactly 50%, envelope A will actually hold the smaller amount (let's call it "S") of a=1/3 of the total amount contained in both envelopes (let's call the total amount "T"). ONLY IN THIS CASE, that envelope A actually holds T/3 (i.e. if envelope A actually holds the smaller amount of a=T/3 – and NOT in "ANY" 50%, i.e. NOT in the subset of those cases where envelope A actually holds A=2T/3 !), then envelope B will hold 2a (B=2T/3). And envelope B will hold A/2 not in ANY 50%, but only in that 50%, where envelope A actually holds the large amount of A=2T/3, and NOT in the subset of those cases where envelope A actually holds only a=T/3. – We have to pay regard to this immanent interdependency, the AliBaba 5/4-formula does not pay regard to this given immanent interdependence of one single pair of envelopes. Regards, Gerhardvalentin (talk) 23:07, 10 November 2014 (UTC)
 * Gerhardvalentin, if I understood you correctly, your formula suggests that the expected value of the other envelope is £150. So your envelope (suppose that you play the game) contains with certainty £100 and you expect that the other envelope contains £150. According to this reasoning you should switch because by switching you expect to have a profit of £50. Then how is the expected return zero? Caramella1 (talk) 05:02, 11 November 2014 (UTC)
 * Caramella1, it's because you just made evident that knowing the contents of the envelope picked, or not, makes no difference. You just made it evident that knowing the coincidental contents of the envelope picked can never be any indication on what the other envelope may actually contain. Gerhardvalentin (talk) 08:15, 11 November 2014 (UTC)
 * So, you say that it is impossible to include the £100 we know that is contained in one envelope to a formula of expected return and the result to be zero? Caramella1 (talk) 12:51, 11 November 2014 (UTC)


 * Richard, by "no prior beliefs" I meant that the player who sees £100 in his/her envelope has no reason to believe that the (£50,£100) scenario is more or less probable than the (£100,£200) so (s)he assigns equal probabilities 1/2 to both. But you covered that case and you say that the expected value of switching is £125? This is more than the £100 contained in the player's envelope so according to you (s)he should switch? So the "switching argument" presented in the main article which assumes no prior beliefs has no flaw? Please explain. Caramella1 (talk) 20:11, 10 November 2014 (UTC)
 * I explain, in my paper http://www.math.leidenuniv.nl/~gill/tep.pdf (and this is nothing new - there are many many reliable sources explaining why this goes wrong). Here is one try. Yes, *if* in advance the player thinks that (£50,£100) is equally likely to (£100,£200), then *if* they would find £100 in Envelope A they would switch. But now imagine some more amounts in Envelope A: going up by powers of 2: £200, £400, £800, ... and going down by powers of 2 £50, £25, £12.5, ... if the player thinks that ... (12.5, 25) and (25, 50) and (50, 100) and (100, 200) ... are all equally likely, then they would switch in any case, hence no need to look in the envelopes. But now the player apparently puts equal probability on (100, 200) x 2^n for all integers n (positive and negative). This is an improper prior distribution. Ordinary probability calculus breaks down.
 * For any proper prior distribution, the chance that Envelope A contains the smaller amount has to depend on the amount a in the envelope; it cannot be always equal to 50%. Richard Gill (talk) 15:22, 11 November 2014 (UTC)
 * The problem as stated in the main article doesn't mention any prior distribution that the player had in mind before the envelope is opened (if it is ever opened, for that matter). We could talk about this variant also, but let's focus on the simplest case first where the player thinks of NOTHING in advance. The moment the envelope is opened, there are only 3 amounts that could be in the envelopes: £100, which is with certainty in the player's envelope, and £50 or £200, which are supposedly in the other envelope. With no prior distribution and consequently no prior beliefs we agreed that the player could assign probability 1/2 to each outcome concerning the other envelope. You say that in this case the expected value of switching is £125 and this result indicates that the player should switch. Now, if the player forgets for a moment the £100 in his/her envelope and denotes by X the smaller amount and by 2X the larger one then the expected value becomes 3X/2 and this result indicates that the player has no interest in switching. Two different approaches that I think you consider both to be correct lead to two different results. How do you explain that? Caramella1 (talk) 17:54, 11 November 2014 (UTC)
 * The player does not look in the envelope and does not think in advance.
 * If you want to know how I explain things, read my paper http://www.math.leidenuniv.nl/~gill/tep.pdf
 * Of course the problem does not mention a prior distribution. That is the whole point. The argument for switching, however, does implicitly use a prior distribution. Implicitly, if we interpret the intention of the argumenter as being to compute E(B | A = a), he or she is assuming that a priori any particular amount is equally likely as the same amount multiplied by any power of 2 (any whole number, positive or negative). This leads into contradictions. Richard Gill (talk) 10:54, 12 November 2014 (UTC)
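The clash between "£125 when A = £100" and "3X/2 overall" can be made concrete by replacing the improper prior with a proper, truncated one. This is a minimal sketch under an assumption that appears nowhere in the problem statement: the smaller amount is 100 x 2^n with n = -3..3, all equally likely. The helper names `pairs` and `conditional` are made up for illustration.

```python
from fractions import Fraction

def pairs(base=100, n_max=3):
    """All equally likely (A, B) contents under the assumed truncated prior."""
    out = []
    for n in range(-n_max, n_max + 1):
        x = Fraction(base) * Fraction(2) ** n  # smaller amount of this pair
        out += [(x, 2 * x), (2 * x, x)]        # A smaller, or A larger
    return out

def conditional(a, all_pairs):
    """Return (P(A is smaller | A = a), E(B | A = a))."""
    others = [b for (a2, b) in all_pairs if a2 == a]
    p_small = Fraction(sum(1 for b in others if b == 2 * a), len(others))
    return p_small, sum(others) / len(others)

all_pairs = pairs()
for a in sorted({a for a, _ in all_pairs}):
    p_small, e_b = conditional(a, all_pairs)
    # columns: amount a, P(A smaller | a), ratio E(B | a) / a
    print(float(a), p_small, float(e_b / a))

# Unconditional expected gain from always switching.
gain = sum(b - a for a, b in all_pairs) / len(all_pairs)
print(gain)
```

For every interior amount the posterior is 1/2 and E(B | A = a) = 5a/4, but at the smallest possible amount the posterior is 1 and at the largest it is 0, and the large loss at the top exactly cancels the gains elsewhere, so the unconditional gain is zero. Making the posterior 1/2 for every a would require extending the range without end, which is the improper prior described above.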

An example for you
Let us say the envelopes must contain more than £2 and less than £1000, and you know that. One envelope is picked randomly and the next one above is added as the second. You pick an envelope and open it to see £128. Should you swap? Martin Hogbin (talk) 21:29, 9 November 2014 (UTC)
 * Sorry, I don't get it. What does "the envelopes" mean? Single envelopes or pairs of envelopes with "1/3:2/3" resp. "2/3:1/3"? One "single" envelope is picked out of how many envelopes? Out of a pile, or out of a crowd? What allocation? What scenario are you talking about, please help. In case of an accidental crowd of 995 envelopes (£ 2, 3, 4, 5 etc.) with equal allocation, it could be well to swap. Was this your question? But this is quite another issue, I suppose? Gerhardvalentin (talk) 22:12, 9 November 2014 (UTC)
 * I am saying that the two envelopes must contain more than £2 and less than £1000. I gave an example of the way in which the sums in the two envelopes may have been chosen.


 * If you prefer we could do it this way. We pick a sum between £2 and £500 and put it in an envelope. Then we put double that sum in another envelope. The player knows how the envelopes were set up but is given one of the two envelopes randomly.


 * He then opens his envelope and sees £100. He reasons that the other envelope is equally likely to contain £50 as £200, which means that on average he will get £125 by swapping. Is he correct in his reasoning? Should he swap? Martin Hogbin (talk) 22:38, 9 November 2014 (UTC)
 * Thanks. Disregarding the allocation of "£2 to £500" this is similar to the TEP. His reasoning to get on average £125 is nonsense, as you know, because in trillions he will neither win nor lose. But regarding the said (actual) allocation, and his £100 being closer to the lower end, he could conclude to actually be holding "the lower amount of both envelopes, also", so he actually should swap. But this is not the standard TEP. Btw, if this game is repeated millions of times, then on average he will neither win nor lose. Regards, Gerhardvalentin (talk) 23:21, 9 November 2014 (UTC)
 * In what way is my example not the standard TEP? Martin Hogbin (talk) 13:13, 10 November 2014 (UTC)
 * Disregarding the allocation of "£2 to £500", it is similar to the standard TEP that does not announce such "known allocation". Gerhardvalentin (talk) 13:21, 10 November 2014 (UTC)
 * Gerhard, the reasoning is not wrong. He reasons that on those occasions when his envelope contains £100, half of the time the other envelope would contain £50 and half of the time the other would contain £200. If he would switch every time, he would on average receive £125. All of this reasoning is perfectly correct, as long as the premise that £50 and £200 would occur equally often (among those occasions when Envelope A contained £100) is correct. And this is not unreasonable ... if the smaller envelope is equally likely to contain ... £25, £50, £100, £200 ... then, on those occasions when Envelope A actually contains £100, Envelope B is equally likely to contain £50 as £200. Richard Gill (talk) 15:46, 10 November 2014 (UTC)
 * Richard, thank you for your comments, but I know that you know better than what you just said. Since 12:24, 29 July 2011 I said on your talk page that it is necessary to detach us from the AliBaba variant, where this kind of reasoning is correct indeed, but wrong for the standard TEP. For the standard TEP, such kind of reasoning is utterly wrong indeed. The arguments (incorrect for the standard TEP) that lead to the 5/4-formula are all correct only for AliBaba. The formula    $${1 \over 2} (2A) + {1 \over 2} \left({A \over 2}\right) = {5 \over 4}A$$    (for  any A, regardless of A being – in the end – 1/3 or 2/3 of the total amount),   addresses only AliBaba, but does not address the standard TEP. Because "any A" means "any A", be A actually 1/3 of the total amount, or be A actually 2/3 of the total amount. For the standard TEP, this is wrong. This lapidary formula (not lapidary for AliBaba, but lapidary for the standard TEP) wrongly says that in "ANY 50%", B will be 2A, and it wrongly says that in "ANY 50%" B will be A/2. Please consider that, for the standard TEP, this is utterly WRONG. The result E(B)=5/4A ignores that in one half of "ANY 50%", where it says that B will be 2A, this is not "given", as in exactly one half of those cases B cannot be 2A, as in this half of "ANY 50%", B actually is A/2 . So you have to investigate the said predication a little bit closer: only if actually indeed B=2A, meaning that B=2T(Total amounts)/3, then B=2A. This is valid only in this case. Otherwise not. And so on. In 1/4 of cases, the 5/4-formula assumes that if B=T/3, then A=T/3, and in 1/4 of cases, the 5/4-formula assumes that if B=2T/3, then A=2T/3. This is the flaw of the 5/4-formula, regarding the standard TEP. As in the standard TEP, B=2A only if B=2T/3, and in the standard TEP, B=A/2 only if B=T/3. Not in "ANY CASE". Never ever. 
Comprising two quite different "1/4 of cases" into "1/2 of cases", and comprising two quite other different "1/4 of cases" into another "1/2 of cases" is correct for AliBaba only, but a flaw as to the standard TEP. Gerhardvalentin (talk) 17:18, 10 November 2014 (UTC)
 * Gerhard, I was trying to give an example where your argument applies but you should still switch. I have departed from the standard version in that the player looks in his envelope before deciding what to do, and I have considered the possible sums in the envelopes to have a maximum, as Gardner does.  In all other respects, this is the standard TEP. Martin Hogbin (talk) 17:25, 10 November 2014 (UTC)
 * We have to detach from AliBaba. Looking in his envelope, or not, is irrelevant. In the standard TEP, there is no reason for him to swap envelopes. Gerhardvalentin (talk) 17:45, 10 November 2014 (UTC)
 * My example is not AliBaba. In that case the player picks an envelope then a coin is tossed to decide what to put into the other envelope.  The player knows that he has the original envelope.


 * In my example there are two envelopes, one containing twice the sum that is in the other, but the player is given one randomly. That is the standard TEP. Martin Hogbin (talk) 18:18, 10 November 2014 (UTC)
 * Martin pardon, please help me: Is there any unclarity? Gerhardvalentin (talk) 18:35, 10 November 2014 (UTC)
 * In what? Martin Hogbin (talk) 18:50, 10 November 2014 (UTC)
 * Can I also suggest that we move to the arguments page. Martin Hogbin (talk) 18:51, 10 November 2014 (UTC)
 * In the standard TEP (you pick one of the two envelopes at random and you do *not* get to look what is in there) there is evidently no reason to switch. Yet there is an argument which seems to compel you to switch. The problem is to show what is wrong with the argument. There are different "solutions" depending on how you interpret what the designer of the argument had in mind. There is not a unique solution, because of the Anna Karenina principle. Richard Gill (talk) 20:24, 10 November 2014 (UTC)
 * Yes, I know that. I said just above, 'I have departed from the standard version in that the player looks in his envelope before deciding what to do'. Gerhard has replied that he does not consider looking in the envelope a significant issue in this discussion, his words were, 'Looking in his envelope, or not, is irrelevant'. Do you agree that, apart from looking in the envelope, my setup refers to the standard TEP? Martin Hogbin (talk) 20:46, 10 November 2014 (UTC)


 * Gerhard, you are WRONG. Completely wrong. It seems you don't read the literature, and you don't understand Bayesian probability. In particular you didn't read my own paper. Richard Gill (talk) 09:34, 11 November 2014 (UTC)

Martin, again to the flawed expectation of B=5/4A, flawed for the standard TEP. With probability of exactly 50% B will be 2A (only if A actually is T/3, i.e. 1/3 of the total amount; bear in mind that you have to account for your ignorance!), and with probability of exactly 50% B will be A/2 (only if A actually is 2T/3, the larger amount of both; bear in mind that you have to account for your ignorance!). If A is T/3, then B=2T/3 (50%), and if A is 2T/3, then B=T/3 (50%). For A, this adds to T/2, and for B this adds to T/2, also. So, for the standard TEP, you have to fix A (the contents of envelope A) to be T/2, and the expected value of B=A (and not 5A/4). E(B)=A applies to whatever amount you see (or don't see) in envelope A, be it 50 or 100 or 200. Just pay attention to the fact that you do not know whether A actually is the smaller amount or the larger one. You cannot lapidary say "with probability of 1/2, B will be 2A", knowing that this is valid only in case that A actually is T/3 indeed, and vice versa. In short: For the standard TEP, the 5/4-formula (A≠A) does not include your ignorance. Because you never can know whether actually A=T/3 or A=2T/3, indeed. And in her 2009 paper Falk pays regard to the fact that it is the flaw of the 5/4-formula to ignore our ignorance. This has to be shown just in the beginning, to help understand the arising of the so-called paradox. Regards, Gerhardvalentin (talk) 12:16, 13 November 2014 (UTC)


 * Gerhard, can you please improve the text in the article section now called Logical resolutions? iNic (talk) 13:20, 13 November 2014 (UTC)


 * Thanks, I will try to do so, but I'm away for some time now. Gerhardvalentin (talk) 13:55, 13 November 2014 (UTC)

Imagine that the player has no idea what are the amounts in the two envelopes. He has picked one completely at random (standard TEP). We call it "Envelope A". The player now *imagines* looking in Envelope A and seeing an amount there, let's for the sake of argument suppose that he *imagines* seeing 100 Euro there. He knows the other envelope contains 50 or 200 Euro. A priori, he believes it is equally likely that the two envelopes contained 50 and 100, as that they contained 100 and 200. So now, for him the other envelope is equally likely to contain 50 or 200. The expected quantity in the other envelope is 125 so he figures that he would switch. He gets the same conclusion, *whatever* amount he imagines being in Envelope A. So he doesn't have to look in the envelope: he knows that if he would look, he would decide to switch, whatever he saw there.

There is absolutely nothing wrong with this argument!!! But we know that switching (without looking in the envelope) is senseless. So we had better think about the assumptions which were used in the argument I have just given, a bit more carefully.

Is it reasonable to suppose that *whatever is in envelope A, the other is equally likely to contain half as to contain double that amount*?

There is no problem with the reasoning. There must therefore be a problem with the assumptions. Take the assumptions seriously and see where they lead. You will see that they lead to the nonsense assumption that the smaller amount of the two is equally likely equal to 50, to 100, to 200, to 400, ... and at the same time to 25, to 12.5, ... They lead to the assumption that the logarithm of the smaller of the two amounts of money is uniformly distributed between −∞ and +∞. A so-called improper prior distribution. With such a prior distribution, expectation values make no sense. Probability calculus breaks down. The assumption was unreasonable and unrealistic, and leads to paradoxes.

Please, Gerhard, stop coming up with your own arguments, but instead, study the literature and *understand it*. Richard Gill (talk) 09:38, 11 November 2014 (UTC)


 * This discussion showcases perfectly the dichotomy found in the literature regarding how one ought to solve this problem. We as editors should not waste our time discussing the problem itself. If you want to continue to do that please do that at the Arguments page. Instead we should be happy that we have representatives of both views here as editors. I think that Gerhard can write the text for the more logical solutions of the problem, which include the "common resolution" and Smullyan's version, while Richard and Martin can write about the probabilistic/Bayesian approach. Which section is presented first in the article isn't that important. It's more important that the text is kept short with not too many mathematical formulas. The main ideas are easy to grasp and can be explained almost without any math at all. This page was much more fun to read some years ago I think. It was short and the main ideas were explained in a short and succinct way. I think it would be great if we could evolve the page to become less complex and easier to read again. iNic (talk) 12:54, 11 November 2014 (UTC)


 * I agree! Richard Gill (talk) 15:11, 11 November 2014 (UTC)

Core resolution
This is what I call the probabilist's resolution. I think it ought to go first. It corresponds to the historical main line of the development of the "exchange paradox" (TEP is just one instance). If it is hard to explain then we have to work hard trying to explain it to a wide public. Lots of people have tried.

A stands for the content of envelope A, and B for the content of envelope B. X stands for the smaller amount, Y for the larger. I treat all four A, B, X, Y as having some joint probability distribution - this might represent a consistent set of prior beliefs, or it might reflect relative frequencies in infinitely many imagined repetitions. In other words, the story which follows is quite neutral as to what meaning you give to "probability".

We know that Y = 2X > 0 and that (A, B) = (X, Y) or (Y, X) with equal probability 1/2.

Since the answers are expressed in terms of the amount in envelope A, it also seems that in step 7, the writer is trying to compute E(B | A). Contrary to what many authors on TEP imagine, this in no way implies that our player is actually looking in his envelope. The point is that he can *imagine* what his expectation value would be of the contents of Envelope B, for any particular amount "a" he might *imagine* seeing in his own Envelope A, if he were to take a peek. If it would then appear favourable to switch whatever that imaginary amount might be, then he has no need to peek in his envelope at all: he can decide to switch anyway.

Notice that E(B | A) is shorthand for: compute E( B | A = a) for any possible value a which A might take. The result is a function of a. We write E(B | A) for that same function of A. It is helpful to distinguish between random variables, and possible values of random variables. The writer of "the switching argument" doesn't do this, and that's one of the reasons he gets screwed up.

The conditional expectation E(B | A = a) can be computed just as the ordinary expectation, by averaging over two situations, since when A = a, B can only equal 2a or a/2. The mathematical rule which has to be used is:

E(B | A = a) = P(B > A | A = a) E(B | B > A, A = a) + P(B < A | A =a) E(B | B < A, A = a).

In step 7 the writer correctly substitutes E(B | B > A, A = a) = 2a and similarly E(B | B < A, A = a) = a/2. But he also takes P(B > A | A = a) = 1/2 and P(B < A | A = a) =1/2, that is to say, the writer assumes that the probability that the first envelope is the smaller or the larger doesn't depend on how much is in it. But it obviously could do! For instance if the amount of money is bounded then sometimes one can tell for sure whether A contains the larger or smaller amount from knowing how much is in it.

In fact it is easy (for a competent mathematician: half a dozen have published elementary proofs of this) to show that for any proper joint probability distribution of the two amounts of money X < Y, it is *impossible* that the amount in one envelope A is independent of which envelope has the larger amount.

Well that might be hard for non-mathematicians to understand. But fortunately, many writers on educational and recreational mathematics have written pages and pages expressing the main ideas in words. Richard Gill (talk) 15:50, 29 October 2014 (UTC)
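
To make the dependence on the amount concrete, here is a minimal sketch that enumerates one bounded prior exactly. The prior itself (smaller amount uniform on 25, 50, 100, 200) and the function names are assumptions chosen purely for illustration; they do not come from the article or the cited sources.

```python
from fractions import Fraction

# Hypothetical prior for illustration: the smaller amount X is uniform on these values.
SMALLER_AMOUNTS = [25, 50, 100, 200]


def joint_distribution(smaller_amounts):
    """Return {(a, b): probability} for the contents (A, B) of the two envelopes."""
    joint = {}
    p_x = Fraction(1, len(smaller_amounts))
    for x in smaller_amounts:
        # Envelope A holds the smaller or the larger amount with probability 1/2 each.
        for a, b in ((x, 2 * x), (2 * x, x)):
            joint[(a, b)] = joint.get((a, b), Fraction(0)) + p_x * Fraction(1, 2)
    return joint


def conditional_summary(joint):
    """Return {a: (P(B > A | A = a), E(B | A = a))} using exact fractions."""
    summary = {}
    for a in sorted({a for a, _ in joint}):
        p_a = sum(p for (a2, _), p in joint.items() if a2 == a)
        p_b_larger = sum(p for (a2, b), p in joint.items() if a2 == a and b > a) / p_a
        e_b = sum(b * p for (a2, b), p in joint.items() if a2 == a) / p_a
        summary[a] = (p_b_larger, e_b)
    return summary


if __name__ == "__main__":
    for a, (p, e) in conditional_summary(joint_distribution(SMALLER_AMOUNTS)).items():
        print(f"A = {a:>3}: P(B > A | A = a) = {p}, E(B | A = a) = {float(e)}")
```

Under this particular prior the conditional probability is 1/2 only for the middle amounts; at A = 25 the other envelope is certainly larger and at A = 400 it is certainly smaller, so the 5/4 rule cannot hold for every amount.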

What is presently called the common resolution of the problem is to interpret step 7 not as a computation of a conditional expectation but as computation of an unconditional expectation: E(B) = P(B > A) E(B | B > A) + P(B < A) E(B | B < A). Now he gets the probabilities right but the expectations wrong. He computes E(B | A, B < A) instead of E(B | B < A). If he had done it right, he would have found of course E(B) = E(A) = 3/2 E(X). I think this is a bad resolution of the problem, since it assumes even more stupidity on the part of the writer.

So the two main branches, two main types of solutions, depend on how sophisticated we assume the writer of the argument steps 1 to whatever, was. Given we know where the problem comes from, I think we can assume that the writer is sophisticated: he is trying to do something clever, and he makes a quite subtle mistake. The other resolution assumes that the writer is unsophisticated and is making an incredibly stupid mistake.

Maybe that works for unsophisticated people. I would still like to say, sorry, they have missed the point, possibly because they are not equipped to appreciate some subtle distinctions which are rather important when doing probability calculations. Richard Gill (talk) 18:05, 29 October 2014 (UTC)


 * Yes, it is impossible that the amount in one envelope A is independent of which envelope has the larger amount. However, it is possible that E(B|A) > A (always! that is, for all values of A) and nevertheless E(A|B) > B (always). This leads to a paradox of preference: one (valid!) argument shows that B is definitely better than A, and another (also valid!) argument shows that A is definitely better than B.
 * However, this may happen if conditional expectation is defined as expectation w.r.t. the conditional distribution (not requiring that A and B are integrable w.r.t. the unconditional distribution). Indeed, if they are integrable then we get E(A)>E(B)>E(A), - a contradiction.
 * Preference becomes meaningless if A and B are both of infinite expectation. Unlike the set theory, here one infinity cannot be more, or less, than another. Boris Tsirelson (talk) 18:10, 29 October 2014 (UTC)
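
A small exact computation may help illustrate the situation described above. It is only a sketch under an assumed prior, P(X = 2^n) = (1/3)(2/3)^n for n = 0, 1, 2, ..., which is a proper distribution with infinite mean; this prior and the function names are illustrative choices, not something stated in this discussion.

```python
from fractions import Fraction


def prior(n):
    """P(X = 2**n) for n = 0, 1, 2, ...; an illustrative proper prior with infinite mean."""
    return Fraction(1, 3) * Fraction(2, 3) ** n


def expected_other(n):
    """Exact E(B | A = 2**n) when A is the smaller or larger amount with probability 1/2."""
    a = 2 ** n
    weight_small = prior(n)                                 # A = X = 2**n, so B = 2a
    weight_large = prior(n - 1) if n >= 1 else Fraction(0)  # A = 2X with X = 2**(n-1), so B = a/2
    total = weight_small + weight_large                     # the common factor 1/2 cancels
    return (weight_small * 2 * a + weight_large * Fraction(a, 2)) / total


def partial_mean(terms):
    """Partial sum of E(X) over the first `terms` values; it grows without bound."""
    return sum(2 ** n * prior(n) for n in range(terms))


if __name__ == "__main__":
    for n in range(6):
        print(f"A = {2 ** n:>2}: E(B | A) = {expected_other(n)}")
    print("partial sums of E(X):", [float(partial_mean(k)) for k in (10, 20, 40)])
```

With this prior E(B | A = a) equals 11a/10 for every a ≥ 2 and equals 2 for a = 1, so it always exceeds a, and since the setup is symmetric the same holds with A and B exchanged. The partial sums of E(X) equal (4/3)^N − 1 and diverge, which is consistent with the remark that no contradiction arises only because the expectations are infinite.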
 * Indeed. I write a lot about this in my paper http://www.math.leidenuniv.nl/~gill/tep.pdf when I consider later versions of TEP. Richard Gill (talk) 18:42, 29 October 2014 (UTC)
 * Ah, yes, now I see. You wrote a lot, and very apt. Boris Tsirelson (talk) 18:58, 29 October 2014 (UTC)
 * And moreover, I am sending your text to an economist; some paragraphs are very much related to our (him and me) repeated discussions. Boris Tsirelson (talk) 19:10, 29 October 2014 (UTC)
 * Boris, and please can you show this visualization to your friend, too? I tried to visualize the determination of Ruma Falk in Teaching Statistics 30 (2008). She directly addresses the Wikipedia article (A≠A.) and she contradicts the claim that the other envelope can "per se" contain 2A or A/2. As to the standard variant, this means forgetting that only 1/3 of the total amount can be doubled, and only 2/3 of the total amount can be halved. That (inter-)dependency may never be lost out of sight. Thank you, would be great. Gerhardvalentin (talk) 20:09, 29 October 2014 (UTC)
 * Thank you, but no, sorry; my remark was misleading to you, since I did not say which point of Richard's text was meant. We did not discuss "two envelopes" with the economist. The point is rather, the first paragraph of page 13. Even if in reality all distributions have compact supports, still, models with no compact support may give better predictions! This appears to be hard for the economist. I gave him a NUMERICAL example, where WRONG assumptions lead to RIGHT predictions, while RIGHT assumptions lead to WRONG predictions. He agrees that this is a fact, but feels in trouble, how to accommodate. Boris Tsirelson (talk) 20:23, 29 October 2014 (UTC)
 * Economists are like physicists: they use a lot of mathematics as a language, and are even often better at calculating things than mathematicians, but they do not distinguish between mathematical models, and the reality they are supposed to ... model (approximate, describe, in some simplified and manageable way). Richard Gill (talk) 10:35, 30 October 2014 (UTC)
 * Wow! My experience with physicists is opposite. They always corrected me, that what I consider is not a reality but a model. Gradually I got somewhat better in this aspect, but initially I was extremely naive. Also that economist always emphasizes that economic models are very far from reality (as compared to physical models). But he did not expect that an "evidently wrong" assumption can make a model better. Boris Tsirelson (talk) 11:53, 30 October 2014 (UTC)
 * Richard, I cannot let you get away with that either. This is not the right place to discuss it but it looks as though you have a serious misapprehension about physics. Martin Hogbin (talk) 14:58, 30 October 2014 (UTC)
 * Martin I speak from experience of working intensively in quantum foundations for the last 15 years or so. So maybe I got to meet a different kind of physicists from you. Richard Gill (talk) 08:19, 31 October 2014 (UTC)
 * I do not think so. I have also spent a lot of time discussing relativity (the other branch of physics that causes the same kind of issue) with experts and interested amateurs.  Many non-physicist have strange misconceptions about reality.   If you want to talk more I suggest my talk page. Martin Hogbin (talk) 09:18, 31 October 2014 (UTC)
 * Ah, the resolution of the disagreement! Martin thinks like a physicist and I think like a mathematician. Richard Gill (talk) 12:08, 1 November 2014 (UTC)
 * I do not think that there is a difference in opinion between mathematicians and physicists, except, as I think Boris is saying, many mathematicians have not thought much about what they mean by '(physical) reality'. This is really a topic for elsewhere though.  We seem to agree about the TEP. Martin Hogbin (talk) 13:07, 1 November 2014 (UTC)
 * There are markedly different cultures though of course there is (thank heavens!) some communication (overlap) between the two. Mathematicians probably are on the whole pretty ignorant of real reality. Physicists are pretty ignorant of mathematical reality. What physical reality is, is a big open problem in the foundations of physics. Most physicists have not thought about it and are not going to think about it either. Richard Gill (talk) 15:23, 1 November 2014 (UTC)
 * Richard, I agree that that resolution should go first but could not gain consensus for the change. I might have a go at writing a layman's version.
 * That would be a good idea. Richard Gill (talk) 18:42, 29 October 2014 (UTC)


 * There are two problems that I can see with the current first resolution. Firstly the non-mathematical description is so simple as to be meaningless. What kind of quantity is A?  In what way is it being two things at once?


 * Secondly its mathematical description is even more complicated than the one that you give above (perhaps you could explain it to me some time). I am also not sure how much of that description is actually contained in the cited sources and how much is your, rather generous, interpretation of what they meant to say. Martin Hogbin (talk) 18:19, 29 October 2014 (UTC)


 * The current first resolution (as presented in the philosophy literature) is problematic because it is even more un-mathematical than the original problem - and by doing so, it makes things even worse, not better, IMHO. I think it is better to give the "good" mathematical resolution first, and after that, discuss the way what I call "the philosophers" have attempted to solve the problem. Just as I have done in the talk page here. *After* having discussed a first decent resolution of the problem, one can try to report some other parts of the literature. On the one hand there is the philosopher's direction. On the other hand, there are the later economists and decision theorists variations. Also interesting for specialists. I tried to cover them all in my (not quite finished) paper, explaining how they are connected. Richard Gill (talk) 18:42, 29 October 2014 (UTC)

Martin, you say "I am also not sure how much of that description is actually contained in the cited sources and how much is your, rather generous, interpretation of what they meant to say". I was not trying to be generous. I was simply trying to make sense of a lot of words and to represent them in mathematics. If somebody else can make *different* sense of the words written in those philosophy papers, great! I did my best. Maybe I was generous. I tried to find a synthesis. Tried to make sense of all solutions. I don't want to reject any. BTW I exchanged some friendly emails with several of the philosophers. Nobody objected to what I wrote. Richard Gill (talk) 18:45, 29 October 2014 (UTC)

Thoughts on a layman's version of the above
These are just thoughts, anyone who knows that I am wrong please say so.

The problem lies in the expectation calculation. To do a proper calculation you need to add up every possible value multiplied by the probability of finding that value (then normalise). The calculation given is a short cut that only works if every possible value has the same probability of being found in the second envelope; there is no distribution (with a finite expectation) that permits this. Martin Hogbin (talk) 19:09, 29 October 2014 (UTC)


 * Close. I see you are assuming my favoured interpretation of what is going on in steps 1 through whatever. Namely, in step 7 we are after a conditional expectation. The calculation is correct if given what is in envelope A, whatever it is, it is still equally likely to be the larger or the smaller of the two amounts. If that would be true then every conceivable value would be equally likely. But there is no distribution which allows this. You cannot put equal probability on each of infinitely many possibilities. Each possibility would get probability zero. Richard Gill (talk) 19:14, 29 October 2014 (UTC)
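
The step from "equally likely to be larger or smaller, whatever is in A" to "every value equally likely" can be spelled out as follows. This is a sketch assuming, for simplicity, a discrete prior p(x) = P(X = x) for the smaller amount X, with the choice of envelope independent of X; the symbol p is introduced here only for illustration.

```latex
\begin{align*}
P(A < B \mid A = a)
  &= \frac{\tfrac12\,p(a)}{\tfrac12\,p(a) + \tfrac12\,p(a/2)}
   = \frac{p(a)}{p(a) + p(a/2)} . \\
\intertext{Requiring this to equal $\tfrac12$ for every possible value $a$ of $A$ gives}
p(a) &= p(a/2) \quad\text{for every possible } a, \\
\intertext{so, repeating the step, a whole chain of amounts receives the same positive probability:}
p(a) &= p(a/2) = p(a/4) = \cdots = c > 0,
\qquad\text{hence}\qquad
\sum_{k \ge 0} p\!\left(2^{-k} a\right) = \sum_{k \ge 0} c = \infty \neq 1 .
\end{align*}
```

So no proper prior can satisfy the assumption, which is the contradiction described in words above.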


 * I was just trying to put into words what you said above. Is the conditional expectation interpretation common in the literature?


 * Why do we need to state about whether it is the larger or smaller, surely the expectation calculation is at least suspect if the same probability does not apply to every possible sum in envelope B. Martin Hogbin (talk) 19:41, 29 October 2014 (UTC)


 * I don't know what you mean, Martin. I am doing what step 7 does, namely calculating an expectation by splitting it over two possible cases. You can do it for the unconditional expectation of B, and you can do it for the conditional expectation for B, in exactly the same way. We don't know what is in the writer's mind but he is clearly calculating *an* expectation by splitting it over the two cases: A < B and A > B. He doesn't say if he is after the (ordinary, unconditional) expectation of B, or the (conditional) expectation of B given what is (or what might be) in envelope A. The probability calculus rule looks just the same; one can even say: it *is* just the same ... a conditional expectation is "just" an ordinary expectation, but taken with respect to a conditional probability distribution.


 * The two possible rules he might be following are:


 * E(B) = E(B | A < B) P(A < B) + E(B | A > B) P(A > B)


 * E(B | A = a) = E(B | A < B, A = a) P(A < B | A = a) + E(B | A > B, A = a) P(A > B | A = a)


 * It makes much more sense, to me, that his intention was to compute the conditional expectation, i.e. to follow the second rule. And in that case he is taking those two conditional probabilities both to be equal to 1/2, whatever the value of a. Which as you know leads to a contradiction. And almost all the literature agrees with me. It is only the literature of those who have no math fluency at all, who choose the first rule. But either way, he gets it wrong - he does not apply either rule consistently. Richard Gill (talk) 05:30, 30 October 2014 (UTC)


 * I understand. Thanks. Martin Hogbin (talk) 13:38, 30 October 2014 (UTC)


 * As to "whatever the value of a", I guess we should be more accurate. I'm not talking about 10^10 or infinity, nor about zero. As to the standard version, one envelope "A" having randomly been picked out of two unknown envelopes, then for any (!) value A in envelope "A", the other envelope of course is equally likely to hold the double amount or half the amount, and as said this applies to all values of A, with the only restriction of interdependence: that if A actually is the larger amount (2/3 of the total amount), it cannot be doubled but can only be halved, and if A actually is the smaller amount (1/3), it cannot be halved but only be doubled. This obviously is valid for whatever value of A. Gerhardvalentin (talk) 19:21, 30 October 2014 (UTC)


 * When I say "whatever value of a" I mean "whatever value of a". It may be restricted to "any possible value of a". If you know in advance that there is some lower bound or upper bound to the amount of money in an envelope,  then you can take this into account. And now you immediately see that the argument is wrong, because if the amount in envelope A is the largest possible amount of money which can be in an envelope, then you know the other envelope for sure has half that.


 * I already said that if the amount of money in an envelope is bounded, then the argument obviously breaks down. Richard Gill (talk) 20:48, 30 October 2014 (UTC)


 * And BTW you say: "if A actually is the larger amount (2/3 of the total amount), it cannot be doubled but can only be halved, and if A actually is the smaller amount (1/3), it cannot be halved but only be doubled". You're saying that if A is the larger amount, it's the larger amount. So what? Richard Gill (talk) 20:52, 30 October 2014 (UTC)


 * Thank you. The amount could have been written on a cheque, and I said I'm not talking of 10^10. And of course we don't know whether A is the larger amount (let's call it A) nor whether A is the smaller amount (let's call it "a", e.g.), we just know the interdependency of the only two possible states:
 * either A is the smaller amount "a", then B is the large one (let's call it "B" e.g.),
 * or A is the larger amount "A", then B is the small one (let's call it "b", e.g.)

$${1 \over 2} (2A) + {1 \over 2} \left({A \over 2}\right) = {5 \over 4}A$$ The WP-article shows this formula, based on the argument "Thus the other envelope contains 2A with probability 1/2 and A/2 with probability 1/2", while this argument is just only an incomplete fragment that perforce leads to driveling false conclusions, and sources say that this mistake is adding apples and oranges. Yes, I think we agree that this formula isn't apt to the standard problem. Imo the article should contrast it with something like $${1 \over 2} ( 2a ) + {1 \over 2} \left({A \over 2}\right)$$. I guess that for the reader this could be clearer than Ruma Falk's version (of "S" being the smaller one of both actual amounts), where she says, regarding the interdependence of both actual amounts: $${1 \over 2} ( 2S ) + {1 \over 2} \left({2S \over 2}\right)$$, and where she adds "So the expected amount in the other envelope is ½ 2S + ½ S = 3/2 S." IMHO at least Falk's interpretation should be shown in the article, in order to contrast the false 5/4-formula. Gerhardvalentin (talk) 10:04, 31 October 2014 (UTC)
 * Knowing that, we can pay regard to this fact and make provisions. We could say:
 * $${1 \over 2} ( 2a ) + {1 \over 2} \left({A \over 2}\right) = {3 \over 4}A$$ resp. $$ = {3 \over 2}a$$  =   half of the total amount. So the expected value of envelope "B" will be half of the total amount. This standard scenario must clearly be distinguished from the scenario of any "initial A" and thus "dependent B", being the only scenario where the 5/4-formula applies. Thank you, and regards Gerhardvalentin (talk) 22:04, 30 October 2014 (UTC)
 * Gerhard you are saying something easy in a complicated way. And you are confusing matters. The expected value of the contents of envelope A is 1/2 x + 1/2 y, where x and y are the actual amounts in the two envelopes, about which we know x > 0 and y = 2x. So E(A) = 3/2 x = E(B). This is true whatever x might be. So we don't have to say anything at all about what x might or might not be.
 * If we wish to add into the story a probability distribution of values of x representing our uncertainty as to what x might be, then we make things more complicated. We think of x as being the actually realized value of a random variable X. I just showed you that E(A | X = x) = 3/2 x = E(B | X = x). Which tells us E(A | X) = 3/2 X = E(B | X). Averaging over the values x of X, we get E(A) = 3/2 E(X) = E(B) . Here "E" stands for taking expectations (averaging) with respect to *both* uncertainties: what is x; which envelope has x in it.
 * I am talking about the two envelopes problem. The other scenario you talk about is not the two envelopes problem. Why talk about it here? Yes, it is different.
 * Why I just "mentioned" it?  Because the article says so:
 * Gerhard, I have told you that there are *two* correct formulas, depending on whether we are after a conditional or an unconditional expectation. Ruma Falk thought in that old paper that we are after the unconditional expectation. Most mathematicians however think we are after the conditional expectation. The problem was invented by mathematicians. But anyway, you cannot say what is right or wrong till you say what it is you want to calculate: E(B), or E(B | A = a)? You are *adding* to the confusion by not making clear what you think the writer of step 7 is trying to do. Since we do not know what he was trying to do, we can only guess, there can be different resolutions to the paradox. There are a lot of reasons why the conditional interpretation is the intended one. So you and Ruma Falk are focussing on a degenerate side-line solution, not the hard-core main-stream solution. Moreover you are adding to confusion by not using a standard unambiguous notation and not using standard probability notations. Richard Gill (talk) 11:06, 31 October 2014 (UTC)
 * So, what is wrong with $${1 \over 2} ( 2S ) + {1 \over 2} \left({2S \over 2}\right)$$, where the expected amount in the other envelope is  1/2 2S + 1/2 S = 3/2 S." Before we address all zillions of papers, we should show both formulas, just in the beginning, where it should be mentioned that the 5/4-formula is no paradox for the variant of an "initial A" and a "dependent B". Gerhardvalentin (talk) 11:39, 31 October 2014 (UTC)
 * You are talking about a different two formulas from me. I am talking about the two general formulas:


 * E(B) = E(B | A < B) P(A < B) + E(B | A > B) P(A > B)


 * E(B | A = a) = E(B | A < B, A = a) P(A < B | A = a) + E(B | A > B, A = a) P(A > B | A = a)


 * Both of my two formulas are valid for any pair of random variables A and B such that A and B are always different. Do you understand them? Do you agree with them? They *both* apply to the TEP scenario, *and* they also *both* apply to what you call the variant "initial A and a dependent B", which is not the TEP scenario. I don't see what is the point of introducing other scenarios to people, *before* you have resolved the paradox. You can do that *afterwards*, if it seems helpful. The problem is to understand what is going wrong in step 7 of the famous derivation of the obviously false conclusion that you should switch, in the standard TEP scenario. Well, either the writer of step 7 was trying to compute E(B) or he was trying to compute E(B | A = a). I happen to know he was trying to do the latter, but many people got the wrong idea, so we have a bifurcation in the literature. But it doesn't matter, because if he was going for E(B), then he got the two probabilities on the right hand side right, and the two expectations on the right hand side wrong. If he was going for E(B | A = a) he got the two expectations right, and the two probabilities wrong. So all wikipedia has to do, is to explain these two formulas, and do the calculations properly for the two alternatives, and show that step 7 is wrong, because it is mixing up the two together. The author of step 7 is mixing conditional and unconditional probabilities and expectations, and one of the reasons he does that, is because he is using inadequate notation. So it is important to start by introducing *adequate* notation. You can't resolve the paradox if you, too, use an inadequate notation, namely a notation which does not distinguish clearly between things which need to be distinguished. Richard Gill (talk) 20:07, 31 October 2014 (UTC)
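
For reference, the two calculations done properly might be written out as below. This is a sketch in the notation used above (X the smaller amount, Y = 2X), assuming that which envelope is called A is decided independently of X; the shorthand π(a) for P(A < B | A = a) is introduced here only for illustration.

```latex
\begin{align*}
\text{Unconditional:}\quad
E(B) &= E(B \mid A < B)\,P(A < B) + E(B \mid A > B)\,P(A > B) \\
     &= 2E(X)\cdot\tfrac12 + E(X)\cdot\tfrac12 = \tfrac32\,E(X) = E(A). \\[1ex]
\text{Conditional:}\quad
E(B \mid A = a) &= 2a\,\pi(a) + \tfrac{a}{2}\,\bigl(1 - \pi(a)\bigr),
\qquad \pi(a) = P(A < B \mid A = a), \\
&= \tfrac54\,a \quad\text{only for those } a \text{ with } \pi(a) = \tfrac12 .
\end{align*}
```

Step 7 combines the expectations 2a and a/2 from the conditional rule with the probabilities 1/2 from the unconditional rule, which is the mixing-up described above.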


 * More simply still: A + B is the total amount and E(A + B) = E(A) + E(B) is the expectation value of the total amount. By symmetry, which is the key property of TEP and other exchange paradoxes, E(A) = E(B). So E(A) is half the expectation of the total amount. We know in advance this symmetry, that is why we know in advance that there is no point in switching. Richard Gill (talk) 08:23, 31 October 2014 (UTC)
 * You say we know this symmetry in advance, and "that is why we know in advance that there is no point in switching." So why not show this symmetry right at the beginning, in contrast to the 5/4-formula that is shown there (without mentioning that it clearly depicts the variant of an "initial A" and a "dependent B")? Gerhardvalentin (talk) 11:55, 31 October 2014 (UTC)
 * I was going to start a new section on that topic below because it is, in my opinion, a key fact in reducing the chaos in the literature and in the article. The TEP is a self-inflicted injury in that there is no problem until we make one by presenting a bogus line of argument for switching. Martin Hogbin (talk) 09:28, 31 October 2014 (UTC)
 * Well - it was an injury deliberately inflicted by professional mathematicians on amateur mathematicians. Deliberately. You could also call it "an in joke". The argument for switching must be wrong. The problem is to figure out where the argument fails. There is not a unique solution, as I explain in my paper. Anna Karenina principle. However some solutions are more persuasive than others. We have to remember that the people who invented the problem were not stupid at all. They also expected some sophistication (mathematical ability?) on the part of their readers. Richard Gill (talk) 10:27, 31 October 2014 (UTC)
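
The symmetry E(A) = E(B) and the two formulas discussed in this thread can be checked exactly on a small example. The sketch below is only an illustration: the prior on the smaller amount (uniform on 1, 2, 4) and all helper names are assumptions made for the example and do not come from TEP itself or from any source.

```python
from fractions import Fraction

# Hypothetical prior on the smaller amount x (an assumption for illustration only).
PRIOR = {1: Fraction(1, 3), 2: Fraction(1, 3), 4: Fraction(1, 3)}

def joint_distribution(prior):
    """Return {(a, b): probability} when envelope A is equally likely to hold x or 2x."""
    joint = {}
    for x, p in prior.items():
        joint[(x, 2 * x)] = joint.get((x, 2 * x), 0) + p / 2  # A holds the smaller amount
        joint[(2 * x, x)] = joint.get((2 * x, x), 0) + p / 2  # A holds the larger amount
    return joint

def probability(joint, event):
    """Probability of the set of outcomes (a, b) for which event(a, b) is true."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

def expectation(joint, value, event=lambda a, b: True):
    """Conditional expectation of value(a, b) given event(a, b)."""
    total = sum(value(a, b) * p for (a, b), p in joint.items() if event(a, b))
    return total / probability(joint, event)

joint = joint_distribution(PRIOR)
e_a = expectation(joint, lambda a, b: a)
e_b = expectation(joint, lambda a, b: b)
e_total = expectation(joint, lambda a, b: a + b)

# Unconditional formula: E(B) = E(B | A < B) P(A < B) + E(B | A > B) P(A > B)
e_b_split = (
    expectation(joint, lambda a, b: b, lambda a, b: a < b) * probability(joint, lambda a, b: a < b)
    + expectation(joint, lambda a, b: b, lambda a, b: a > b) * probability(joint, lambda a, b: a > b)
)

# Conditional version: E(B | A = a) for every amount a that envelope A can hold
e_b_given_a = {
    a0: expectation(joint, lambda a, b: b, lambda a, b, a0=a0: a == a0)
    for a0 in sorted({a for a, _ in joint})
}

print(e_a, e_b, e_total, e_b_split, e_b_given_a)
```

With this particular prior, E(A) and E(B) are each half of the expected total, while E(B | A = a) differs from one value of a to the next, which is why the two computations must be kept apart.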

The 'common' resolution
Richard, I can see how E(B) = E(B | A < B) P(A < B) + E(B | A > B) P(A > B) gives the correct answer but I cannot see how it relates to what is called the 'common resolution' in the article, or to A being two things at the same time. The Necktie paradox article has not helped me here. Martin Hogbin (talk) 09:28, 31 October 2014 (UTC)
 * Well, have you studied my paper yet? Have you tried to read the philosophy papers? Richard Gill (talk) 10:28, 31 October 2014 (UTC)
 * Anyway, it doesn't matter. I don't think that this is the main line of solution. It should not even be called "common resolution" because it isn't. Richard Gill (talk) 10:29, 31 October 2014 (UTC)
 * One other problem that I see with this is that E(B) = E(A), which pretty much destroys the paradox before it starts. Martin Hogbin (talk) 01:05, 3 November 2014 (UTC)
 * Of course E(B) = E(A), that's the symmetry which we know in advance! BTW, the editors of Hellenic Mathematical Society have given me the email address of Tsikogiannopoulos Panayiotis who I have asked to send me a copy of his paper. Richard Gill (talk) 06:58, 3 November 2014 (UTC)
 * All the maths here is totally elementary. As soon as you start doing the math properly the paradox (at least: the initial version: to explain what is wrong with the reasoning) evaporates. The author has bad notation which mixes up random variables and values of random variables and gets in a mixup doing an elementary calculation. This is what happens to every student of probability 101 (for non-mathematicians) who came from high school without being able to distinguish small letters and capital letters. Unfortunately we see them quite often nowadays. Richard Gill (talk) 07:02, 3 November 2014 (UTC)
 * That was my point! Like you, I find it very hard to believe that the proposed line of reasoning for swapping was intended to be a calculation of E(B).  Martin Hogbin (talk) 12:40, 3 November 2014 (UTC)
 * Good. Still, it is how a lot of the philosophy literature seems to interpret the "thinking" behind the written argument. So this interpretation needs to be covered in WP too. But I would suggest not to make it the primary solution. I would suggest that if you want to start the article with a clear snappy *verbal* solution then simply transpose the Two Necktie verbal explanation to Two Envelopes. After that say that for a more detailed analysis some mathematical notation and concepts are required. Then do the E(B | A = a) calculation and explain what might be going wrong here. Namely: the author perhaps is confusing the probability that envelope A contains the smaller amount, with the conditional probability that it contains the smaller amount, given that it actually contains "a" dollars (whatever "a" might be). On the other hand, perhaps the author seriously intends us to believe that whatever "a" might be, that latter conditional probability is always 1/2. Well that implies that an infinite number of amounts can possibly be in the envelope (double as many times as you like, or halve as many times as you like, any amount which you do consider plausible) and all of them equally likely. This is bringing you up against paradoxes of infinity. It becomes infinitely more likely that envelope A contains any amount *outside* of any finite interval, than any amount *inside* that interval. Well the writer can retort, I don't mean *exactly* equal to a half and I don't mean *all* values "a". Well then the analyst responds, still if you believe this chance is close to 1/2 for a whole lot of possible values of "a" then your distribution of possible values spreads out over a very wide range spanning many orders of magnitude, the logarithm of the amount is close to uniformly distributed over a large range. This means that it has a "close to infinite" expectation value and in that case, expectation values are not a good guide to decision making. 
As Keynes said, in the long run we are all dead. The expectation value is the long run average. If an expectation value is infinite then the short run average is never anywhere close to the long run average! Richard Gill (talk) 14:21, 3 November 2014 (UTC)
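
The remark about a "close to infinite" expectation can be illustrated numerically. In the sketch below the amount is assumed, purely for illustration, to be equally likely to be any of the powers of two 2^0, ..., 2^(n-1), so that its logarithm is uniformly distributed; neither this prior nor the function name comes from a source.

```python
from fractions import Fraction

def mean_and_median_of_doubling_prior(n):
    """Amounts 2**0, ..., 2**(n-1), all equally likely (an illustrative assumption)."""
    amounts = [2 ** k for k in range(n)]
    mean = Fraction(sum(amounts), n)
    median = amounts[(n - 1) // 2]  # lower median of the sorted amounts
    return mean, median

for n in (5, 20, 40):
    mean, median = mean_and_median_of_doubling_prior(n)
    # The ratio mean / median shows how far the expectation sits above a typical amount.
    print(n, float(mean), median, float(mean / median))
```

As the range grows, the mean is dominated by the few largest amounts and moves far above the median, so a typical draw says little about the expectation.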

How about this line of attack for the E(B | A = a) case?
1) Start with the case where there are two envelopes containing £10 and £20 and the player knows this.

The player chooses an envelope at random and, as in the original versions, is allowed to look in his envelope before deciding. Before he looks he concludes that probability of there being double the money in the other envelope is 1/2 and that the probability of there being half the amount is also 1/2

Then: he looks and sees £10. Swap

Or: he looks and sees £20. Don't swap.

This is obvious to everyone

2) The player chooses an envelope at random but is not allowed to look inside before deciding.

The player now knows that although the probability of there being double the money in the other envelope is 1/2 and the probability of there being half the amount is also 1/2, that probability is not independent of the sum in his envelope. He therefore knows that the proposed calculation is wrong as the expected value in the other envelope depends on what he has in his envelope.

3) Extend the case to many envelopes in which the sum is bounded.

Is that OR? Martin Hogbin (talk) 19:40, 3 November 2014 (UTC)
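
The £10/£20 case above is small enough to enumerate. The following is a minimal sketch of that enumeration; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

# The two equally likely ways the known amounts 10 and 20 can be dealt: (mine, other).
OUTCOMES = [(10, 20), (20, 10)]

def prob_other_is_double(seen=None):
    """P(other = 2 * mine), optionally conditioned on the amount seen in my envelope."""
    cases = [(a, b) for a, b in OUTCOMES if seen is None or a == seen]
    return Fraction(sum(1 for a, b in cases if b == 2 * a), len(cases))

def expected_other(seen=None):
    """Expected amount in the other envelope, optionally given the amount seen in mine."""
    cases = [(a, b) for a, b in OUTCOMES if seen is None or a == seen]
    return Fraction(sum(b for _, b in cases), len(cases))

print(prob_other_is_double())    # before looking
print(prob_other_is_double(10))  # after seeing 10
print(prob_other_is_double(20))  # after seeing 20
print(expected_other(10), expected_other(20), expected_other())
```

Before looking, the probability that the other envelope holds double is 1/2; after looking it is 1 or 0, so the 1/2 cannot be carried over into a calculation that conditions on the amount seen.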


 * Your new scenario departs from original TEP in two extremely important respects.
 * Firstly, in the original / canonical version of TEP the player does *not* look in the envelope before choosing. He imagines looking in the envelope, and his little argument leads him to want to swap, whatever amount he imagines there. I know that some later authors made the same wrong assumption. So the literature is a mess. But let's try to reduce mess levels at the beginning of the article.
 * Which version is the original? In Nalebuff and Zabell the player looks in his own envelope.
 * Martin Gardner told us the problem long before Nalebuff and Zabell. Richard Gill (talk) 19:16, 5 November 2014 (UTC)
 * Thanks. Martin Hogbin (talk) 10:48, 8 November 2014 (UTC)
 * Nalebuff says explicitly that the switching argument does not require one to look in the envelope. You only need to imagine looking in it. Richard Gill (talk) 17:26, 16 November 2014 (UTC)


 * Secondly, the point of TEP is that we are *not* given any information at all as to what amounts can be in the envelope. The reasoning given, does not make any such assumptions. It is an attempt to argue without that information. TEP is about reasoning under uncertainty, and one of the two uncertainties in the problem, is what the amounts in the envelopes might be. (The other uncertainty is which envelope has the smaller amount). By starting off giving such hard information, you are departing very far from TEP. I see that there could be pedagogical reasons for this. But you will have to make it very very clear that your new little problem is *not* TEP.
 * I do agree that my starting point is not the TEP but the idea is to show how E(B | A = a) is not always 1/2. This example makes that very obvious.  It is then possible to move on to the case where the player does not know what might be in the two envelopes but may have a belief about what might be in them.


 * I would say that what you write is such basic common sense that it would be weird to call it "own research" but on the other hand it seems that you want to do more than report what is in the literature. You have pedagogical aims on top of that, and in order to further them, you start by describing a scenario which departs from the canonical TEP scenario in two essential ways. I would not object but perhaps other editors would. Richard Gill (talk) 07:06, 4 November 2014 (UTC)
 * You may well be right. I always make the point that WP is not a literature review but should explain things. Unfortunately the literature only explains things properly in mathematical language that most people are not familiar with.  The simpler explanations are mainly wrong or do not answer the question asked.
 * In my understanding, Wikipedia is an encyclopedia which summarizes "known" knowledge. Some parts of known knowledge only make sense to people who are already in the possession of other known knowledge. If something technical has never been explained to a broad amateur audience before, then it is not Wikipedia's job to be the first. You can try, but people who disagree with you will shout "OR". But you can still try ... Richard Gill (talk) 19:23, 5 November 2014 (UTC)
 * I am not going to stick my neck out in this.

 * My real complaint about this article is the simple explanation of what is called here the common resolution. This simple explanation makes no sense at all in ordinary language. Intuitively A stands for whatever is in the original envelope, this is not two different things.
 * But yes, this IS two different things, because A is not "independent". Not independent, because the actual total amount T contained in both envelopes (A+B), as well as the actual amount of A are interdependent. As to the article, we must use sources to clarify that the actual amount A cannot be T/3 (meaning that B_large = 2 A_small and T_large = 3 A_small) and, at the same time, be 2T/3 (meaning B_small = A_large / 2 and T_small = 3 A_large / 2). These are mutually exclusive. We have to show that these are two quite different scenarios. Either scenario 1) of T_large = 3 A_small [e.g. A=20, B=40, T=60], but not the other one of T_small = 3 A_large / 2 [e.g. A=20, B=10, T=30]. Or the other scenario, but in that case never scenario 1). Both scenarios are equally likely, but those two different scenarios are mutually exclusive. Either/Or, but never "as well as", please have a look at this diagram. Gerhardvalentin (talk) 15:42, 8 November 2014 (UTC)
 * Gerhard can I start by asking you what kind of quantity you are taking A to be? Intuitively I think most people would take it to be either an algebraic variable or a constant.  It is intended by the proposer to represent the sum of money that you have in your hand.


 * If we consider the variant where you can look at the sum that you have then A can be a constant; if you look and see £100 then A is clearly fixed at 100 and cannot be two things, so the argument fails. Martin Hogbin (talk) 14:52, 9 November 2014 (UTC)


 * If we want to mathematically challenge the proposed line of reasoning for swapping we must first, as you have done, say what kind of quantity A is. Once that is done the relevant mathematical rules tell us exactly what we can and cannot do with A, and how to do it.


 * The current explanation is neither fish nor fowl, it is neither natural and intuitive nor mathematical. Martin Hogbin (talk) 10:47, 8 November 2014 (UTC)


 * Why not start at the beginning: with the two necktie resolution? (transcribed to two envelopes, of course). Richard Gill (talk) 07:13, 4 November 2014 (UTC)


 * That is what I did do. There is a table at the start of the Necktie paradox in which the ties can only have two possible values. In that table there are two cases where the two ties have equal value, which is not allowed in the TEP, so I have removed them.  That leaves pretty well what I suggested.  Although the necktie article does not specifically say that each owner knows the value of his own tie it starts the explanation by using numerical values. Martin Hogbin (talk) 09:51, 4 November 2014 (UTC)

PS Whether amounts are bounded or not is not important at all in order to resolve the paradox (pinpoint a fault in the reasoning). It comes up later in mathematical variations of the problem. In the sequels (cf. my movie analogy: Aliens, Aliens 2, Aliens 3 ...; Aliens 0). Richard Gill (talk) 07:09, 4 November 2014 (UTC)


 * Is it not true that, for the original problem in which one envelope has double the amount in the other, there are only two possibilities: that E(B | A = a) is not always equal to 1/2 or both envelopes have an infinite expectation? Martin Hogbin (talk) 09:56, 4 November 2014 (UTC)
 * In the original problem, E(B | A=a) cannot always equal 1/2. (I assume proper probability distributions.) E(A) = ∞ if and only if E(X) = ∞. The answer to your question is "no". Richard Gill (talk) 19:16, 5 November 2014 (UTC)
 * Sorry Richard, my question was not clear so I do not understand your answer. Is there a case where   E(B | A=a) is always equal 1/2 and E(B) is not infinite? Martin Hogbin (talk) 10:47, 8 November 2014 (UTC)
 * Are you talking about probabilities or expectations? Do you mean, is there a case where Prob(B > A | A = a) is always equal to 1/2? Answer: no. Richard Gill (talk) 15:58, 10 November 2014 (UTC)
 * Yes, that was careless of me, that is exactly what I meant. Thanks for your answer.


 * In that case, for the purposes of explaining the resolution of the problem it does seem to make sense to me to split the problem into two cases, bounded and unbounded. (I am only considering the original version where one envelope has twice the value of the other).  In one case the expectation for both envelopes is infinite and in the other P(B > A | A=a) is not always 1/2.  It must be possible to explain that in relatively simple language still based on what the sources say. Martin Hogbin (talk) 16:59, 10 November 2014 (UTC)
 * Supposing that P(B > A | A=a) is always 1/2 is equivalent to supposing that all values of the smaller amount x of the form $$2^n x_0$$ are equally likely. Here n is any whole number (positive or zero or negative) and $$x_0 > 0$$ is just some arbitrary (possible) amount. There are infinitely many equally possible values, both arbitrarily large ones and arbitrarily small ones. There is no well defined expectation value at all. It is not correct to say that this assumption forces the expectation value to be infinite. Given any finite range of values e.g. from 1 cent to 1 billion dollars there is infinitely larger probability, both that the amount is smaller than 1 cent, and that it is larger than 1 billion dollars. However, it is possible that P(B > A | A=a) is always 1/2 except at two finite "end points" as in my numerical example with the powers of 2. Then the 25% gain on switching is true for every amount in the envelope except for the smallest amount (when the gain is obviously 100%) and for the largest amount (when it is a 50% loss). The huge loss at this largest amount exactly compensates all the possible smaller gains at all smaller amounts. Richard Gill (talk) 17:24, 16 November 2014 (UTC)
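
The "end points" behaviour can be checked exactly. The sketch below is an illustration under stated assumptions, not a reproduction of any published example: the smaller amount is taken to be uniform on the powers of two 2^0, ..., 2^(n-1), and the function name is hypothetical.

```python
from fractions import Fraction

def switching_gain_by_amount(n):
    """Smaller amount x uniform on 2**0..2**(n-1); envelope A holds x or 2x with probability 1/2 each.

    Returns {a: (P(A = a), E(B | A = a) / a - 1)}, the relative gain from switching at each a.
    """
    joint = {}
    for k in range(n):
        x = 2 ** k
        joint[(x, 2 * x)] = Fraction(1, 2 * n)  # A holds the smaller amount
        joint[(2 * x, x)] = Fraction(1, 2 * n)  # A holds the larger amount
    result = {}
    for a0 in sorted({a for a, _ in joint}):
        mass = sum(p for (a, b), p in joint.items() if a == a0)
        e_b = sum(b * p for (a, b), p in joint.items() if a == a0) / mass
        result[a0] = (mass, e_b / a0 - 1)
    return result

gains = switching_gain_by_amount(10)
# Overall expected gain from always switching, E(B) - E(A), summed over all amounts a.
expected_gain = sum(p * rel * a for a, (p, rel) in gains.items())

for a, (p, rel) in gains.items():
    print(a, p, rel)
print("overall expected gain:", expected_gain)
```

Under this assumed prior the relative gain is one quarter at every interior amount, one at the smallest amount and minus one half at the largest, and the overall expected gain from always switching is exactly zero.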

The paper by Ishikawa
Richard, I've just read the paper by Ishikawa (The two envelopes paradox in non-Bayesian and Bayesian statistics, arXiv:1408.4916). What it basically says is that the formalism of quantum mechanics ultimately is a better tool for dealing with probability problems than probability theory itself. I think that's a novel and quite interesting idea. What are your thoughts about this? (I'm asking just out of sheer curiosity what you think, I'm not suggesting that this paper should be mentioned in the article.) iNic (talk) 01:56, 15 November 2014 (UTC)


 * Thank you very much for drawing this to my attention! The paper does not quite say this. BTW, the author has also written similar papers on the Monty Hall problem and on the three prisoners paradox.
 * Ishikawa says that quantum language does the job because it has words whereas *statistics* does not. Let me quote from his concluding paragraph:
 * Quantum language has visible key-words: ”measurement”, ”observable”, ”state”, ”measured value”. And these concepts are motivated by quantum mechanics.
 * On the other hand, statistics has invisible key-words: ”probability space”, ”random variable”, ”parameter”.
 * I think that mathematical probability theory does the job just as well as quantum measurement theory. The key-words are not invisible. And if we apply the mathematical formalization of probability to reasoning under uncertainty (Bayesian probability) we have a language at our disposal in which no key-words are missing.
 * Quantum language says that, if Problem 1 (standard description of set-up of TEP ending in question: should you switch - RDG) is a scientific statement, Problem 1 should be essentially the same as Problem 1′ (description of TEP in which many many repetitions of the filling of the envelopes are described and you are asked whether always switching envelopes would increase the long run average content of what you got - RDG). If the reader wants to assert that these are different, he has to propose another language (except quantum) by which Problem 1 and Problem 1′ are described as the different problems. That is because we believe Wittgenstein’s words (i.e., the spirit of the philosophy of language):”The limits of my language mean the limits of my world.”
 * I think that Bayesian probability theory does the job just as well as quantum measurement theory. I agree with Ishikawa that the paradox arises because of the poverty of ordinary language. In ordinary language we do not distinguish things from our knowledge of things. A thing itself can be fixed and unknown while our knowledge of it can change. And the history of the two envelopes problem shows that it arose from the struggle in Bayesian probability theory concerning how to represent complete lack of knowledge - Laplace's principle of insufficient reason - this leads naturally to improper priors but improper priors lead to paradoxes.
 * Resolving TEP by expressing it in the language of quantum measurement theory is overkill. Quantum measurement theory has concepts "preparation", "measurement" which are part of an operationalist description of physical experiments, no restriction to quantum experiments. Ishikawa's very long paper makes no intrinsic use whatever of *quantum* features of quantum measurement theory.
 * Notice that he resolves the problem by showing that in his quantum measurement language, the problem is the same as another formulation of TEP in which it is made clear by addition of many extra words that there is no point in switching. So he does not show where the reasoning in the switching argument breaks down! He does not actually "resolve" TEP at all, in the sense which I think we agree the word "resolve" ought to have in this context. Richard Gill (talk) 06:24, 15 November 2014 (UTC)
 * PS Ishikawa says explicitly that he only needs classical measurement theory, not quantum. He says: "2.3 The preparation of CMT. For the general theory of measurement theory, see refs. [3]- [14]. In order to read this paper, it suffices to know the following. Let Ω be a locally compact space. Define ...". A whole page of heavy mathematical formalism follows.
 * I would say that it suffices not to read the paper. Richard Gill (talk) 06:48, 15 November 2014 (UTC)


 * Oddly enough I was thinking something along the lines (as I understand them) of Ishikawa yesterday. The reason that QM seems weird to us and classical mechanics does not is that we make unverifiable assumptions about classical mechanics: we presume that we should know the results of experiments we have not done.
 * Neither quantum nor classical mechanics "presume" that we should know the results of experiments we have not done. Classical mechanics allows us to believe that the results of experiments we have not done do exist in at least a mathematical sense - they are determined by initial conditions. Quantum mechanics tells us that the assumption of mere mathematical existence of the results of not-done experiments leads to a conflict with locality. And BTW Einstein (in the famous EPR paper) used the predictions of quantum mechanics itself, together with locality, to argue for the mathematical existence of the results of not-done experiments. Richard Gill (talk) 16:40, 15 November 2014 (UTC)
 * It was only a passing thought and I was thinking of the way in which results of experiments we have not done do exist in a mathematical sense. My point was that that was the only sense in which they exist. Until we make a measurement, even in classical mechanics, there is no physics. In other words, we could use the language of QM in CM and not lose anything. Martin Hogbin (talk) 19:25, 15 November 2014 (UTC)


 * I do not think the paper will be much help for our article though. Martin Hogbin (talk) 13:03, 15 November 2014 (UTC)


 * I found this paper to be very interesting and it is so far the closest to my own views on how to solve this problem. iNic (talk) 03:55, 20 November 2014 (UTC)

I agree that the first problem with TEP is language. If you don't have the words or concepts you cannot resolve the paradox. I noticed that Caramella1 said a few times that since there is no specification of your knowledge about what can be in the two envelopes, the paradox has to be resolved without this knowledge. You could say that the philosopher's solution is a solution which does not use this knowledge since the unconditional expectation calculation can be done without using any prior distribution over (say) the smaller amount x. The only randomness which needs be involved in the problem is the random choice of whether envelope A contains x or 2 x. Of course in this case, the conditional probability distributions of A and B given that A < B or given that A > B are kind of boring, they are degenerate. Probability 1 on a single point (x or 2 x as the case may be). The amount x is just a constant (unknown) amount in the whole story. But TEP was invented, and comes from a long line of TEP precursors, by people who took for granted that knowledge about unknown (fixed) quantities like x is expressed in probability distributions, that probability is indeed epistemological; and moreover that lack of knowledge should be represented with uniform probability distributions. These ideas go back 200 or more years. In other words: people who were never exposed to "subjective" probability will not even recognise that TEP can be put into a subjective probability framework, and since they don't know the language and don't have the concepts, they will find the reasoning in the simplest "Bayesian" resolution completely alien. This is how the bifurcation took place, this is how the philosopher's literature got started. Richard Gill (talk) 16:51, 15 November 2014 (UTC)
 * Notice again: in the philosopher's interpretation, the switching argument is making *two* mistakes, and both of them are really stupid. In the mathematician's interpretation, the switching argument is only making one, rather subtle, mistake. Together with its ancestry, I think this is a good argument for making sure that the mathematician's interpretation is prominent, and that a first resolution under that interpretation is made as accessible as possible. Richard Gill (talk) 17:22, 15 November 2014 (UTC)
 * Also, problem statements in the literature are full of wording about the sum in your envelope, nothing about the expected sum. There is also evidence that Gardner had the E(B|A=a) solution in mind because he writes, 'Does this player arise because each player wrongly assumes that his chances of winning or losing are equal?'.
 * The introduction to the article already has phrases like "The expected value calculation for how much money is in the other envelope would be the amount in the first scenario times the probability of the first scenario plus the amount in the second scenario times the probability of the second scenario". The word expectation is also in the switching argument. S&D talk from their first pages about expected return, expectation formula. The E(B|A=a) solution is also a solution involving the concept expectation. There is no point at all in suddenly switching from "expectation" to "average". Richard Gill (talk) 10:38, 16 November 2014 (UTC)
 * Or we should change the article. I agree that expectation is a better word but the vast majority of people will not understand what it means.  Alternatively, we need to give some description of what 'expected value' means early on in the article.  I do not think a link would be good enough.  The natural understanding of this term will be, 'the sum you think is most likely to be in the envelope'.  Nothing in the article will make any sense if the reader has this in mind.  Martin Hogbin (talk) 11:17, 16 November 2014 (UTC)
 * OK so add the word "average" in brackets after the first occurrences of "expectation" and link all occurrences of the word to the article Expected value. Richard Gill (talk) 16:43, 16 November 2014 (UTC)


 * What do you think of my proposed wording for the E(B) version? Maybe we could have a simple example like the one you gave. Martin Hogbin (talk) 19:25, 15 November 2014 (UTC)


 * I think it is not bad. As I said above, there are some words I would change. For instance, a formula is not a calculation.


 * Why don't you "be bold" and put it in the article? I agree that the present text is close to meaningless. Your text is a great improvement. Richard Gill (talk) 07:42, 16 November 2014 (UTC)


 * Or put your proposal on the talk page, first. And most important of all: give references to the reliable sources on which your text is based. Richard Gill (talk) 10:28, 16 November 2014 (UTC)


 * Now I have been bold, and made a lot of changes to the first resolution. I hope it makes sense now, and I hope it makes sense both to philosophers and to mathematicians (and to amateurs). Richard Gill (talk) 17:34, 16 November 2014 (UTC)


 * I have now similarly added an as simple as possible second (mathematical / Bayesian) solution. Richard Gill (talk) 07:29, 20 November 2014 (UTC)

The philosopher's solution
Richard I have been reading Schwitzgebel and Dever and I do not think what they say matches what you say in your paper.

S&D, in their problem statement, refer to X as the amount of money in envelope A, the originally chosen one.

In their resolution on page 5 they refer to the expected value of X, in other words the expected value of your original envelope. This means to me that they are saying that the proposed calculation is:

E(B) = 1/2 ( 2 * E(A) + 1/2 * E(A) )

They then go on to point out that E(A) is not the same thing in the two cases. Again they refer to what you would expect in envelope A.

The correct calculation should be E(B) = 2 * E(A | B > A) Prob(B > A) + 1/2 * E(A | B < A) Prob(B < A).

 * E(B) = E(B | B > A) Prob(B > A) + E(B | B < A) Prob(B < A). They know this rule.


 * Everyone is agreed that the two probabilities here are both equal to 1/2.


 * Next, since B = 2A when B > A and B = A/2 when B < A, the right hand side can be further developed as E(2 A | B >A) Prob(B > A) + E(A / 2 | B < A) Prob(B < A). S&D know this too.


 * Next, by linearity of expectation value, and also substituting the probabilities (both equal to 1/2) we end up with the *true* statement E(B) = E(A | B > A) + E(A | B < A)/4. S&D know this too.


 * So what is "wrong" (according to them) is that E(A | B > A) and E(A | B < A) have both been replaced by A (without expectation and without conditioning), which are *two* errors, not one. They do refer to both errors and they do correct both of them. I have communicated with them by email about all this and they don't disagree with my representation of their argument.


 * They claim that they are the first to show "what went wrong", but Falk (2008) probably beat them getting this particular resolution of this particular interpretation into peer-reviewed print. As far as I know, the mathematician's resolution of the mathematician's interpretation has been known for ages. Among mathematicians it is so trivial and well understood that people do not get academic credit by publishing solutions. Unfortunately, the philosophers and the amateurs do not understand the mathematicians. Martin Gardner apparently didn't understand the mathematical literature on TEP.


 * The second purpose of Schwitzgebel and Dever is to go on and add some (in their opinion) novel facts about expectations and linearity, but (AFAIK) nobody has done anything with this contribution and I have difficulty figuring out what they mean. Richard Gill (talk) 15:57, 13 November 2014 (UTC)
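
For reference, the steps described in this reply can be collected into a single chain. This is only a restatement of the derivation in the discussion, using the facts that B = 2A when B > A, that B = A/2 when B < A, and that both probabilities equal 1/2.

```latex
\begin{align*}
E(B) &= E(B \mid B > A)\,P(B > A) + E(B \mid B < A)\,P(B < A) \\
     &= E(2A \mid B > A)\,P(B > A) + E(A/2 \mid B < A)\,P(B < A) \\
     &= 2\,E(A \mid B > A)\cdot\tfrac{1}{2} + \tfrac{1}{2}\,E(A \mid B < A)\cdot\tfrac{1}{2} \\
     &= E(A \mid B > A) + \tfrac{1}{4}\,E(A \mid B < A).
\end{align*}
```

Comparing the last line with step 7 of the switching argument shows the two substitutions at issue: each conditional expectation of A has been replaced by the bare symbol A.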


 * I follow that but would it not be easier to start with:


 * E(B) = E(A | B > A) + E(A | B < A)/4


 * in order to explain the error. In the WP version, A is described as 'the amount in my selected envelope' so it seems that some formula involving the sum in the first envelope is being proposed.


 * As you say above, there are then two mistakes that cause the wrong answer. Martin Hogbin (talk) 19:47, 13 November 2014 (UTC)


 * How can you *start* with E(B) = E(A | B > A) + E(A | B < A)/4 ? It is a strange formula and it needs to be derived. By the way we also have E(A) = E(A | B > A) + E(A | B < A)/4, since E(B) = E(A). But it remains just as mysterious.


 * I was hoping we might be able to find a way to do that between us.


 * From the point of view of making sense of the philosophers' solution that is a much better place to start.


 * In the article, step 7 says:


 * 'So the expected value of the money in the other envelope is:'


 * $${1 \over 2} (2A) + {1 \over 2} \left({A \over 2}\right) = {5 \over 4}A$$


 * Later on the first resolution says:


 * 'A common way to resolve the paradox, both in popular literature and part of the academic literature, is to observe that A stands for different things at different places in the expected value calculation'.


 * We can now make it clear exactly what the two things that A stands for are and how they are different. The first A should be E(A | B > A) and the second one E(A | B < A). The two different A's are both conditional expectations of the value in the original envelope (which is what A purports to be), but with different conditions.


 * Surely this could make a simple, understandable, and mathematically sound explanation of what the philosophers are talking about. I do agree with you that the originators of the problem probably did not intend to suggest that A was an expectation.


 * One way to justify starting with what I suggested would be just to cite S&D; they pretty much give that formula in words on page 5. Alternatively we could ourselves give the formula in words.  This is for the simple philosophers' explanation of 'what went wrong'. The mathematical details can follow, as we have now. Martin Hogbin (talk) 12:53, 14 November 2014 (UTC)


 * I agree. And S&D agree with me (personal communication). And you can even cite my paper, if you find my self-published unfinished report a "reliable source". That's not for me to judge, but I am happy to provide you with some good arguments based on wikipedia's own definition of this term.
 * I think you will find that Falk (2008) also agrees, and that Tsikogiannopoulos (2012) also agrees. Hopefully, iNic and Caramella1 and Gerhard Valentin will agree, too. This explanation based on this interpretation is the common core in a whole host of reliable sources. Richard Gill (talk) 12:32, 15 November 2014 (UTC)


 * Anyway, above I have shown how to derive E(B) = E(A | B > A) + E(A | B < A)/4, in a way which seems very close to the steps of the switching argument. So: you could start by deriving this correct formula, and then you can go back and look at the switching argument and compare formulas and compare the two arguments, and then you can say (assuming that the switching argument is based on a computation of E(B)), ah ha, now we can see the *two* mistakes in the argument! That is exactly what we have done in this conversation. I would like to see it as one of the two first solutions in the article. However some people "switch off" when they see a formula so would like to have the whole thing replaced just by words. Well, it becomes quite a lot of words then, and the concepts being manipulated are not familiar to everyone, anyway.


 * Intuitively, one can say that given that B > A, the amount A tends to be small: to be precise, it is then one third of the total. Given that B < A, the amount A tends to be rather larger: to be precise, it is then 2/3 of the total. All of one third plus a quarter of two thirds equals one third plus one sixth, and that sum equals one half: so altogether, and given the total in the two envelopes, the expectation of B is half of the whole. Which we knew already. But anyway, maybe this calculation makes the numbers 1 and 1/4 less mysterious. Richard Gill (talk) 06:12, 14 November 2014 (UTC)
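The identity discussed above can be checked numerically. The following is only a minimal sketch, not anything taken from the article or from S&D: it assumes a hypothetical finite prior over the smaller amount x and assumes envelope A is equally likely to hold the smaller or the larger amount. The function name `check_identity` and the example prior are illustrative assumptions.

```python
from fractions import Fraction


def check_identity(prior):
    """Return (E(A), E(B), E(A | B > A) + E(A | B < A)/4).

    `prior` is a hypothetical finite distribution: it maps each possible
    smaller amount x to its probability. Envelope A is assumed to hold
    x or 2x with probability 1/2 each, independently of x.
    """
    half = Fraction(1, 2)
    # Joint outcomes as (probability, amount in A, amount in B).
    outcomes = []
    for x, p in prior.items():
        outcomes.append((p * half, x, 2 * x))  # A smaller, so B > A
        outcomes.append((p * half, 2 * x, x))  # A larger, so B < A

    e_a = sum(p * a for p, a, b in outcomes)
    e_b = sum(p * b for p, a, b in outcomes)

    # Conditional expectations of A under the two conditions.
    p_b_greater = sum(p for p, a, b in outcomes if b > a)
    p_b_smaller = sum(p for p, a, b in outcomes if b < a)
    e_a_given_b_greater = sum(p * a for p, a, b in outcomes if b > a) / p_b_greater
    e_a_given_b_smaller = sum(p * a for p, a, b in outcomes if b < a) / p_b_smaller

    return e_a, e_b, e_a_given_b_greater + e_a_given_b_smaller / 4


# Illustrative prior (an assumption): x is 1, 2 or 4.
example_prior = {1: Fraction(1, 2), 2: Fraction(1, 4), 4: Fraction(1, 4)}
print(check_identity(example_prior))
```

With a prior concentrated on a single x, this reduces to the fixed-amounts case: one third of the total plus a quarter of two thirds of the total gives half the total.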

Proposed introductory wording for the logical/philosophical resolution section
One solution is to assume that the 'A' in step 7 is intended to be the average sum in envelope A [S&D say 'expected' but this terminology is not widely understood. Is there any objection to using 'average'?] and that we are intended to calculate the average value in envelope B. It is pointed out that the 'A' in the first part of the calculation is the average sum in A, given that envelope A contains less than envelope B, but the 'A' in the second part of the calculation is the average sum in A, given that envelope A contains more than envelope B. The same symbol is used with two different meanings in the two parts of the same calculation but is assumed to have the same value in both cases.

A correct calculation of the average sum in envelope B ( E(B) ) would be

E(B) = E(B) = E(A | B > A) + E(A | B < A)/4,

where: E(A | B > A) is the average sum in A, given that envelope A contains less than envelope B

and: E(A|B<A) is the average sum in A, given that envelope A contains more than envelope B

This is pretty much straight out of S&D. Martin Hogbin (talk) 13:25, 15 November 2014 (UTC)


 * You use the word "calculation" for a formula. But the formula is not a calculation. You don't even know how to compute the terms in the formula! I would use the word "calculation" for the steps which led to the formula; what some people would also call its derivation. You could say "a correct calculation actually leads to ...".
 * Yes OK.
 * But I think that the *correct derivation* should be done step by step. Here you just report a final result of a failed attempt to calculate E(B) in a rather indirect way. That doesn't help anyone. You don't explain the steps which led to the "good" (but useless) result E(B) = E(A | B > A) + E(A | B < A)/4, and you don't discuss the result.
 * Why do I need to explain how I got E(B) = E(A | B > A) + E(A | B < A)/4 here? I have a reliable source for it. It is what S&D say in words, for example, '...the expected value of X in the "2X" part of the formula (where envelope A is the envelope with less) ...'. You can explain all the details later. Martin Hogbin (talk) 19:54, 15 November 2014 (UTC)
 * I don't see what is the point of replacing the word "expectation" by "average". The word "expectation" is already used in the argument for switching. S&D use it too. If you replace the word "expectation" by average, then what are you averaging over? Are you only averaging over the two possible configurations A > B and A < B or are you also averaging over possible amounts x, 2 x?
 * The point is that 95% of readers will not understand what is meant by 'expectation'. The natural language meaning of the term is along the lines of, 'I expect there will be £20 in the envelope (this time)'.  Average is a well known and roughly understood term.  We could try to use things like 'on average' or 'over the long term' maybe. Martin Hogbin (talk) 19:54, 15 November 2014 (UTC)
 * The same applies to "expectation" and of course the main problem is that the switching argument never explains what it takes as random and what it takes as fixed, and hence never actually says what it means by expectation at all. By replacing one undefined word by another, you do not make progress. Richard Gill (talk) 17:08, 15 November 2014 (UTC)
 * Yes, I was just about to ask you about that very point. I have assumed that it is over all possible amounts but I agree that it is another weakness in the philosophers' argument.
 * I am not trying to make progress in the sense that you might, I am trying to give the average non-mathematical reader some idea of what the philosophers' resolution is. If you think it would be better to remove all mathematical notation I could try to do that and say it all in words. Have you read the current text; it means nothing at all! Martin Hogbin (talk) 19:54, 15 November 2014 (UTC)


 * Why say E(B) = E(B) ? You could say E(A) = E(B) = E(A | B > A) + E(A | B < A)/4 but now I think it is really incumbent on you to show that the complicated, correct, and completely useless formula E(A | B > A) + E(A | B < A)/4 can be quickly reduced to E(A). Answer: suppose the amounts in the two envelopes (fixed but unknown) are x and 2x. Then E(A | B > A) + E(A | B < A)/4 = x + 2x / 4 = 3 x/ 2, but also, directly, E(A) = 1/2 x + 1/2 2x = 3x/2. Richard Gill (talk) 17:15, 15 November 2014 (UTC)
 * No, I do not want to criticise this solution here, just give some idea what it is. I am trying to give our non-mathematical readers some idea what is meant when philosophers say 'A' stands for two different things.  It is far from obvious that A is intended to be an expectation. Martin Hogbin (talk) 19:54, 15 November 2014 (UTC)
 * OK, I see your point. Indeed, since a bare "A", without any averaging, is on the right hand side, this seems to me to make the philosopher's interpretation quite ludicrous. Clearly if the writer has got a bare "A" on the right hand side of the equation he must have been calculating the expectation conditional on the value of A. However, this is an alien concept to many readers, including, it seems, most philosophers. Moreover I have noticed that many amateurs simply cannot grasp that one can mathematically talk about E(B | A = a) without *seeing* a and without *fixing* a. Richard Gill (talk) 07:40, 16 November 2014 (UTC)
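The notion of E(B | A = a) as a function of a, without seeing or fixing a, can be made concrete with a small sketch. It assumes a hypothetical finite prior over the smaller amount x and a fair choice of which envelope is A; the function name and the uniform prior below are illustrative assumptions, not anything from the sources discussed here.

```python
from fractions import Fraction


def expected_other_given_a(prior, a):
    """E(B | A = a) under a hypothetical finite prior over the smaller amount x.

    A = a can arise in two ways: a is the smaller amount (x = a, so B = 2a)
    or a is the larger amount (x = a/2, so B = a/2). The factor 1/2 for
    which envelope is A is the same in both cases and cancels.
    """
    a = Fraction(a)
    weight_a_smaller = prior.get(a, Fraction(0))
    weight_a_larger = prior.get(a / 2, Fraction(0))
    total = weight_a_smaller + weight_a_larger
    if total == 0:
        raise ValueError("A = a has probability zero under this prior")
    return (weight_a_smaller * 2 * a + weight_a_larger * a / 2) / total


# Illustrative prior (an assumption): x uniform on 1, 2, 4, 8.
uniform_prior = {x: Fraction(1, 4) for x in (1, 2, 4, 8)}
for a in (1, 2, 4, 8, 16):
    print(a, expected_other_given_a(uniform_prior, a))
```

Under this particular prior the value is 5a/4 only for the intermediate amounts; at the smallest possible a the other envelope certainly holds 2a, and at the largest it certainly holds a/2. The function is defined for every possible a at once, which is the sense in which nothing has to be observed or fixed.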

There is a big difference between "average" and "expectation." Expectation is a meaningful concept even in a single case according to some interpretations of probability and is thus not connected to averages at all. This is the most central issue about this problem/paradox. Anyone that thinks that this problem is a purely mathematical problem hasn't understood the problem at all. iNic (talk) 10:58, 20 November 2014 (UTC)

Another idea
On reflection, perhaps it would be better to assume the expectation is based on just two envelopes containing x and y=2x. This seems to produce something much more like the philosophers' solution. Now E(A | B > A) = x and E(A | B < A) = y = 2x. This is what you have done in your mathematical details section. Martin Hogbin (talk) 11:10, 16 November 2014 (UTC)


 * Of course. I have revised the article accordingly. The advantage of the philosopher's solution is that there is no averaging over possible values of x and no conditioning on (no subsetting over) possible values of A. It is technically much much more simple. On the other hand, it is simply not the case that this was what the original TEP was about. It always was a paradox in Bayesian (subjective) probability connected to many other famous paradoxes of improper prior distributions. Apparently at some point the history was forgotten (or it was too difficult to understand) and the philosophers came up with their own new "paradox", which is really boring ... till you get to the version by logician Smullyan. Then at last it starts to get amusing, at least. Richard Gill (talk) 17:29, 16 November 2014 (UTC)

YADD: an argument resumed
Answering this one. To Richard Gill.

The envelopes were labelled A and B before the choice was made, and they were labelled C and D after the choice was made, C being the envelope that was initially picked; I am sorry if I caused confusion, I was sure I had described the difference. Let me say that A is the envelope in the left hand, and B is the one in the right hand. You say I need to prove that the envelope which enters the right-hand position (B) has, with equal probability, either twice as much or half as much money as the envelope which enters the left-hand position (A). However, that is what the word "indistinguishable" means to me: there is no sign that one envelope is likely to have more money than the other one, and the hand the envelope initially enters is not such a sign either. Another thing is that I do not see why anything should depend on the exact amount of money ("values of a") in an envelope howsoever defined. Thank you for your feedback. - Evgeniy E. (talk) 15:35, 15 December 2014 (UTC)
 * There is a different problem with my argument that I have just noticed, though: mean(C) and mean(D) are defined in the units of the money amount in A, whose values in money units should differ in different experiments; one may set $x to be the same always, but this is not realistic. However, this is a pseudo-problem, because I may set as the random variables denoted C and D the ratios of the money amounts in envelopes C and D to the money amount in envelope A, rather than the money amounts in envelopes C and D themselves, and proceed with saying that I am interested to know mean(C)/mean(D) and mean(D)/mean(C) rather than mean(C/D) and mean(D/C). In any case, treating the problem as a linguistical one rather than a mathematical one and treating different definitions of the same objects as completely different things, that have nothing in common, that is to say do not bind each other to have the same mathematical properties, is an approach that is both interesting and rational… - Evgeniy E. (talk) 18:49, 15 December 2014 (UTC)

Now, the question is which step is wrong in the argument in the article. The answer is simple: the mistake is in the step 8, since "the swapped envelope" is not "the other envelope". Why? Because the operation of swapping works on actual, real envelopes (let me call them this way; A and B in my designation), while the relationship of otherness works on virtual envelopes (C and D in my designation), those existing in my head only and related to strategy setups in my mind. The previous step establishes that the ratio of the amount in the other envelope to the amount in the picked envelope is 1.25, while the ratio of the amount in the picked envelope to the amount in the picked envelope is 1. Yet, the operation of implementing a choice presumes finding out the amounts of money on the base of real envelopes, not on the base of strategic, imaginary ones; so, the conclusion in the previous step has nothing whatsoever to do with this operation, and this operation is not ordered to happen. Then, the argument halts. Cure? Be more careful in interpretation of mathematical results. The analysis in the step 7 did not describe how to choose the right outcome, it only described how the outcomes were distributed: the average of the ratio of the two outcomes was greater than one. - Evgeniy E. (talk) 21:18, 15 December 2014 (UTC)
 * I have read here references to some arguments involving the distribution of money amounts in given out envelopes; I admit I did not really understand the causing nature of these arguments, but I do not think any such considerations are relevant. The purpose of the player is not to gather as much money as possible, it is to be at a win as often as possible. So, the probability variable whose mean is evaluated must not be any amount of money, it must be a special "success" variable, which is dimensionless. How to define the latter variable? I had the way of making it equal to the ratio of the eventually chosen amount of money to the amount in the left-hand envelope (no matter what this amount is) given the strategy in a given experiment; if the player wins, that means that his strategy (take C or take D) results in the greater value of this variable than the other possible strategy. More specifically, this variable took, with equal probabilities, four values: 1, 2, 1, and 0.5, as it is both unknown which kind of relationship exists between the two envelopes and which envelope is initially chosen from the two. As to assigning the probability values: the two couples of events are pairwise independent; the probabilities of the former two events are assigned a half each per the meaning of the word "indistinguishable", while the probabilities of the latter two events are a more complicated issue. It might seem that any distribution is possible; however, the solution only works out if the two events are assigned probabilities a half each, otherwise one of the strategies turns out to be better, which makes no sense... Now, to find out which strategy is best, we calculate the average of this success variable over many experiments given the strategy and then choose the strategy that resulted in the better mean. Analysis showed that both strategies provided the player with the same mean. 
There is no reason to believe that assigning such probabilities inflicts any consequences on reasoning about the probability distribution of money amounts in all envelopes to be potentially given, because the matter to be considered is rationality of the choice, not the amount of money received. All reasoning is not holeless as of now, but it seems to me amusing at least. I am done with it by now, though. Thanks to all for listening, if anyone has. - Evgeniy E. (talk) 16:16, 16 December 2014 (UTC)
 * The problem with this puzzle is that it is a self-inflicted injury in that it is quite obvious to most people that you should not swap, and that is also mathematically correct. It is necessary to present a bogus argument for swapping in order to create a paradox at all.  Without the bogus argument there is no paradox; everyone agrees that it is pointless to swap.  Resolving the paradox does not require you to show that swapping is pointless, that is obvious.  To resolve the paradox, you must show exactly where, in the (mathematically vague) line of reasoning the fault lies.  Depending on exactly how you interpret the proposed logic for swapping, the fault will be different.  Martin Hogbin (talk) 17:01, 16 December 2014 (UTC)
 * Well, for me the reason I visited that page at all was a question I asked myself: "why, if the risk of some action is considerable, but the risk appears unlikely, I still may feel like opting for the action, provided I know that the trouble simply will either happen or not?" This is a question of my feeling, of my internal brain/mind-functioning, this is not a question of reason… I turned to the conclusion that, since the brain has to choose the same way of functioning in every dubious situation of similar nature (just because it has a "mechanical" nature and cannot act differently in the same circumstances), it had to evolve so as to choose such a way that leads to the best results over the course of many dubious decisions in total. That means that the brain must inherently "understand" the basics of the probability theory, that is the conclusions of this theory must be intuitive for it; whenever our intuition fails us, the reason must lie in being short of taking into account some side notions not pertaining to the probabilistic estimation of the expectation itself. Then, I was interested to think, not very successfully, on the question, what kind of simple automatic "reasoning" must induce the brain into making the correct estimation that the strategies are equal; the simple ad absurdum argument is probably not enough; evidently, the brain ought to see that the strategies are not just equal, but the same from a certain point of view. Hence these ponderings. As to your next question, I'll try to find out how to make my reply more clear… - Evgeniy E. (talk) 17:45, 16 December 2014 (UTC)
 * I do not understand your argument, 'the mistake is in the step 8, since "the swapped envelope" is not "the other envelope"'. Martin Hogbin (talk) 17:04, 16 December 2014 (UTC)
 * In solving a problem with assistance of mathematics there are three stages: 1) from a natural inner representation of knowledge derive a mathematical structure that represents its shape, 2) find out substructures in the mathematical structure that describe the shape of the answer, 3) interpret the mathematical answer by assigning this shape to knowledge about our intuitive notions. Mistakes may happen in any of these stages, however forming a wrong mathematical structure on stage 2 is not yet a fault. To be able to prove that any mistake was made on stage 3, one needs to know how mental reflections of objects, i.e. definitions of things, are organised in our mind; since we don't know that, usually this problem is decided intuitively, we see that "this mathematical result is not about that!" and merely declare it. The right approach would be to come up with at least a half-formalism to describe reflections of objects in our mind for this case, and with a description how this half-formalism should work with the notions of this problem, to show why I think the mathematical result in question does not provide us with a reason for any action. However, so far I try yet again an intuitive description. The question is what constitutes a reason for selecting an envelope. Yep, I know that if I compare (by taking a ratio) any couple of choices (the initial one and the switched one) each time an experiment is made, then the average is greater than one; that follows merely from the nature of the choices. Naturally, this argument works in both directions. However, the values that were assigned to the corresponding variable, both times 1 for the first choice, and 2 and 0.5 for the second choice, were not derived from any rigid unit describing real envelopes; hence, they did not describe the chosen envelopes ("the swapped envelope"), but only the choices themselves ("the other envelope"). 
They were derived from the definition: "the defining feature of A is that I picked it; the defining feature of B is that it is the other envelope I did not pick". In these definitions, there is nothing to describe A and B as real envelopes with properties, communicated to such envelopes by the formulation of the problem. In order to make a decision, I need to compare a couple of envelopes, not a couple of choices; so, there is no ground provided for a decision yet. The idea of those who follow the original argument (provided we understand "A" in 5A/4 and in A as a kind of unit, the real values being 5/4 and 1) is apparently along the lines "since the definition so formulated is applied to a real-world object, the properties of the real-world object and the properties of the thing I defined belong to the same thing that behaves the same and shares all the same properties"; that is not a convincing idea, and the question "what constitutes a reason for an action" still holds.


 * By the way, it is interesting to note that in my approach of yesterday, only when the probability distribution of the initial choice is 0.5 to 0.5, the correct conclusion is reached; if it is 1 to 0, then my approach coincides with the approach described in the article; if it is 0 to 1, then it degenerates into another paradox in which it is disadvantageous to change the choice. All other distributions provide slighter kinds of advantage to the first or to the second strategy. Apparently, we must assign probabilities 0.5 and 0.5 to these events of initial choice, but I cannot explain why and I cannot interpret this behaviour. - Evgeniy E. (talk) 07:24, 17 December 2014 (UTC)
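The dependence on the initial-choice probabilities described above can be reproduced with a short sketch. This is only one reading of the "success variable" approach, under stated assumptions: the right-hand envelope B holds twice or half the left-hand amount A with probability 1/2 each, independently of which envelope is picked first, and the success variable is the ratio of the finally taken amount to the amount in A. The function name and parameter are illustrative.

```python
from fractions import Fraction


def strategy_means(p_pick_left):
    """Mean success variable for (keep, switch), given P(initial pick is A).

    Assumes B/A is 2 or 1/2 with probability 1/2 each, independently of
    which envelope is picked first (an assumption of this sketch).
    """
    p = Fraction(p_pick_left)
    # Mean of B/A: (1/2)*2 + (1/2)*(1/2) = 5/4.
    mean_ratio = Fraction(1, 2) * 2 + Fraction(1, 2) * Fraction(1, 2)
    # Keep: end with A (ratio 1) if A was picked, otherwise with B.
    keep = p * 1 + (1 - p) * mean_ratio
    # Switch: end with B if A was picked, otherwise with A (ratio 1).
    switch = p * mean_ratio + (1 - p) * 1
    return keep, switch


for p in (Fraction(1, 2), Fraction(1), Fraction(0)):
    print(p, strategy_means(p))
```

In this reading the two means are p + (1 - p)·5/4 and p·5/4 + (1 - p), which coincide exactly when p = 1/2. With p = 1 the comparison is 1 against 5/4, as in the article's argument, and with p = 0 it is reversed.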

Evgeniy, you said "Now, the question is which step is wrong in the argument in the article. The answer is simple". The answer is not simple. Because of the Anna Karenina principle, there is not one particular way to try to repair the switching argument, hence there is not one particular way to say it goes wrong. Read my paper. http://www.math.leidenuniv.nl/~gill/tep.pdf There are several different interpretations of what the writer of the switching argument was trying to do, and hence several different interpretations of where his argument goes off the rails.

One thing you can be sure of: the people who invented the paradox knew about Laplace's principle of insufficient reason and believed that one makes rational choices by optimising mathematical expectation values, using subjective (Bayesian) probabilities. And most people who tried to solve the paradox were also familiar with the same general context. The paradox grew up in a certain cultural context. Of course if you take it completely out of that context you will have a very different way of looking at it. Fortunately, as long as there are no notable publications about the paradox of such a kind, we editors of wikipedia don't have to worry about the possibility. Richard Gill (talk) 08:58, 21 December 2014 (UTC)
 * Hello again. Well, I moved on from the discussion (I have had my own line of interest and am not really interested in mathematical details, thank you though for your link); also, I do not believe in ability of any mathematics to represent the concept of subjective knowledge, for instance because mathematics are always closed knowledge, but humans in their life work with open knowledge, one in which there is no limit to our ability to guess. I mean, pointing out a mathematical theory leaves no room for free interpretation of any of its constituent notions, such mathematical theory is all that defines all features of these notions; while I believe some theory must explain all we do with our notions, I also find that in the formulation of such theory the notions, that describe this theory in an all-encompassing way, and the notions, that this theory describes, must be independent concepts, i. e. that a theory of behaviour must not be a representation of our notions, it only must explain how our notions behave in time. Please let me state here, though, a stylistic objection: I think that simply saying "The argument may go wrong in many ways" is better, i. e. more to the point, than invoking a specific principle from evolutionary genetics (which has no more to do with Tolstoi's "Anna Karenina" than "St. Petersburg paradox" has to do with St. Petersburg, this is a name without significance); I would define the opening line in "Anna Karenina" as (nearly?) the only one meaningless sentence in it, whose function is mostly to name the subject of the imminent discussion (a family, a human in a family, the principles of setting out a human's behaviour, tightly connected with that human's self-feeling, in many circumstances, also possible ways of coping with such principles in order to make a human feel content in a decent way), so its reuse makes a very weird sloganly sound. With interests so different, I think we have better to pass by. :-) - Evgeniy E. (talk) 17:17, 21 December 2014 (UTC)

I do not ignore the recent posts, but I composed these two answers before those posts appeared. Of those placed in the article, Smullyan's version is most appealing to me. I think that very similar considerations apply to it. As regards thinking, a thing is a definition thereof, inner to the mind. Of any thing, we define our behaviour with it by what this thing is; what this thing is is most important for us, because we change our feeling about this thing whenever we have to change our understanding of what this thing is. A number of things may be a number of other things not independently from one another; we may say of such groups of things that they form two relationships, the lower one and the higher one. For example, when I approach a house, I am an approaching thing, and a house is an approached thing; I and a house are in a relationship, and an approaching thing and an approached thing are in a relationship. Now, the imaginary envelopes, A and B, are defined by the relationship, "B is either a double or a half of what A is, where A and B are both decisions made about picking an envelope". The real envelopes, not characterised by any such choice, namely X and 2X, are defined by the relationship, "2X is a double of what X is, where X and 2X do not depend on my subjective state". Since definitions are different, things are also different and conclusions about them do not have to harmonise, i. e. conclusions about them are separate and cannot contradict one another. We have made two conclusions:

1. Switching B over A is able to provide greater gain than loss.

2. Switching 2X over X and switching X over 2X provide the same gain and loss respectively, and only one of these actions is possible.

In other words, in both cases, we have two actions, M1 and N1, and M2 and N2: Mz with a gain and Nz with a loss; in the first case, the gain action is more valuable than the loss action (|M1|>|N1|), and in the second case, the gain action and the loss action are equally valuable (|M2|=|N2|). To show a contradiction, we need to show that M1 and M2 are the same action, and that N1 and N2 are the same action. Yet that is not possible. The form of these actions is the same, but these actions are employed with different things; in order to show the things are the same, we need to show that, for example, A is X and B is 2X (1), or cross-wise, or the other way around (X is A and 2X is B). That is impossible and should not happen; but whenever we assume that, we get a contradiction. In my probabilistic version, whenever probabilities assigned demonstrate a preference for either the conclusion (1) or its cross-wise alternative, the conclusion is paradoxical, i. e. that either the initial envelope or the switched envelope provides an advantage. - Evgeniy E. (talk) 17:17, 21 December 2014 (UTC)
 * Too bad you are not interested in the mathematical details. The two envelope problem is primarily a paradox invented by mathematicians for the entertainment of mathematicians. Later the philosophers (and logicians) came in and added some new twists, turning it into a word game. OK. Richard Gill (talk) 08:23, 4 January 2015 (UTC)