User talk:Gill110951/Probability notation

Background
Among the many things fought over for years (leading to mediations and arbitration) on the Wikipedia Monty Hall problem page were mathematical notations in elementary probability settings. Someone asked me if I'd like to write an essay on this topic. So that is what is happening on the companion user page to this discussion page.

Please discuss!!!
Richard Gill (talk) 18:13, 21 March 2011 (UTC)

And please, feel also free to edit the "User page" for this essay, as well. I can always revert if I don't like what you do. I'm lazy so I'll be really happy if you do obvious corrections and improvements and drafts of additional sections "in situ". Richard Gill (talk) 18:42, 21 March 2011 (UTC)


 * A grandiose project started... Now I wonder what will ultimately happen to this text. --Boris Tsirelson (talk) 18:44, 21 March 2011 (UTC)


 * Isn't it great to start a grandiose project? I wonder if this will decay into a dirty fight, and then migrate to a more civilized alternative possible world... Richard Gill (talk) 19:01, 21 March 2011 (UTC)

Set of possible worlds
Classically, one can view an event (in the sense of probability theory) as a subset of the set of possible worlds. A random variable is a function from the set of possible worlds to the real numbers. We assume that a unique probability measure on the set of possible worlds is given a priori. Although this goes beyond notation, should it not be mentioned in the essay? JRSpriggs (talk) 18:53, 21 March 2011 (UTC)


 * Well, we don't necessarily assume there is given a unique probability measure on the set of outcomes of a probability space. At least, not if we are mathematicians working in probability theory. We can consider lots of different probability measures on the same probability space.


 * The essay is supposed to help people writing rather elementary articles in probability and statistics and elsewhere (elementary from the point of view of maths!) so there are a lot of things I don't want to get into. Richard Gill (talk) 19:04, 21 March 2011 (UTC)

Quantum probability theory
At its root, quantum theory is based on a replacement of classical probability theory by quantum probability theory. In quantum probability theory, the possible worlds of the classical theory become a basis for the vector space of pure states (a Hilbert space). Mixed states and observables are just linear operators that take that Hilbert space to itself. This changes the rules for working with non-commuting variables. How does that affect your notations? JRSpriggs (talk) 18:53, 21 March 2011 (UTC)


 * Splendid! It depends whether I use standard notations or my own notations. I have thought a lot about this and written about it too. Also, of course, it depends whether quantum theory is based on a replacement of classical probability theory by quantum probability theory. From one point of view one can see quantum probability theory as a generalization of classical probability theory. But from another point of view one can see it as a specialization of classical statistical theory (with design of experiments).


 * We have two worlds of discourse. We choose to connect certain concepts in one of those worlds with analogous concepts in the other world. But one can see different analogies when one looks in the other world at different levels. Richard Gill (talk) 19:00, 21 March 2011 (UTC)

Russian
"Since the Russian for mathematical begins with M but for moral doesn't" -- really? I am a native Russian speaker, but I did not get the hint. Any details? --Boris Tsirelson (talk) 20:00, 21 March 2011 (UTC)


 * OK. So writing M(.) was not in order to distinguish moral expectation from mathematical expectation. But then it is done to distinguish mathematical expectation from just plain expectation. Richard Gill (talk) 06:07, 22 March 2011 (UTC)

Pr

 * Probabilities are very often represented by writing plain vanilla $$P( \cdots )$$, meaning the probability of .... Some writers like $$Pr( \cdots )$$ and others $$Prob( \cdots )$$.

Could I suggest writing
 * $$ \Pr(\cdots) \, $$

instead of
 * $$ Pr(\cdots) \, $$

Michael Hardy (talk) 02:54, 22 March 2011 (UTC)

Yes, I need to clean up this bit. There are some words about it later. I had better be a bit more careful from the start. Richard Gill (talk) 06:10, 22 March 2011 (UTC)

I've usually seen Pr[...] with square brackets. 69.111.194.167 (talk) 10:05, 21 April 2011 (UTC)
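For reference, the variants mentioned in this thread can be set side by side in LaTeX (a comparison sketch, not a recommendation; `\Pr` is the built-in upright operator, while `Prob` needs `\operatorname`):

```latex
% Four common ways of denoting the probability of an event A
P(A), \qquad \Pr(A), \qquad \operatorname{Prob}(A), \qquad \Pr[A]
```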

Humor is distracting
I understand the temptation to liven up the dull process of writing or reading your work by adding jokes. However, these distract from the content. The reader (and writer) should be focusing on the pros and cons of various notations, not on the jokes. JRSpriggs (talk) 09:24, 24 March 2011 (UTC)
 * Hope this was a joke? You see two worlds collide, and so you need humor to survive - and to understand. And you are right happy at the sudden synopsis. So yes, you need both. To take a breath and to breathe out again. Exertion and clash, and then redemption, systole and diastole. It's wonderful. Gerhardvalentin (talk) 11:27, 24 March 2011 (UTC)
 * The very heavy use of parenthetical remarks is distracting, not only the use for humorous remarks. E.g., there are many, many instances like the use by JRSpriggs atop this section.--P64 (talk) 18:17, 22 April 2011 (UTC)

Purpose
Quoting Richard Gill from sections Background and Set of possible worlds. Please tell more about the purpose. The essay doesn't yet recommend much, and it does so elliptically, for example by allusion to pedantic people, presuming that no reader wants to be one of them.
 * One matter disputed at Monty Hall problem "were mathematical notations in elementary probability settings. Someone asked me if I'd like to write an essay on this topic."
 * "The essay is supposed to help people writing rather elementary articles in probability and statistics and elsewhere (elementary from the point of view of maths!)"

How if at all is this a step toward some Essay or Guideline or Policy in wikipedia space?
 * &mdash;or even an Article. That didn't occur to me but see the reply by ISP 69. (See also WP:PG.)

--P64 (talk) 17:33, 21 April 2011 (UTC) Are there lessons implicit here, which would have preempted much of the Monty Hall affair? --P64 (talk) 21:02, 20 April 2011 (UTC)


 * I think the page could be adapted into a wikibook. They're less contentious over there than on wikipedia. 69.111.194.167 (talk) 10:10, 21 April 2011 (UTC)

topic/question
I wonder if there is an accepted formalism for describing sampling (observing) a random variable. I asked about this here a while back, but got no sensible answers. If there is such a thing, maybe it could be mentioned in the essay. 69.111.194.167 (talk) 10:15, 21 April 2011 (UTC)

Comments
Afterword: None of my edits pertain to these comments. Editing, I have done little but use f and ( | ) notation more frequently, streamline the talk of random variable "outcomes", and copyedit a little. To be continued if there is interest. --P64 (talk) 00:50, 22 April 2011 (UTC)
 * I would drop the allusion to excluded middle, any statement "which is either true or not". Whatever the point, I suppose it is beyond the intended audience.
 * In American English that I know, parentheses are round, brackets are square, and braces are curly. (That is, they were when and where I was a schoolboy.)
 * Today I noticed Stephen Stigler (in his History) use braces alone, then parentheses alone, re the same probability expression. For a simple example, "P{X=10} ... P(X=10) ... ". My guess is that he uses braces and staffpeople have incompletely converted to the publisher's style.
 * I learned set notation using pipe rather than colon. For example, {x | x > 10} or {x ∈ R | x > 10} rather than {x : x > 10} or {x ∈ R : x > 10}. That may be related to and distinguished from conditional probability notation, which may be useful given the role of sets in some treatments of probability.
 * Cumulative distribution or CDF should be covered along with the two meanings of distribution (by physicists and mathematicians) distinguished in paragraph "Those 26 interesting ...".
 * >"The smartness of Bayes was not to discover Bayes' theorem, which is completely obvious ..." &mdash; Bayes theorem for events or for discrete random variables?
 * Anyway, you will be interested to read Stigler's History of Statistics on inverse probability, much of two chapters about Bernoulli, Simpson, Laplace, Bayes. He explains three or four "smartness of Bayes" and may also convince you that nothing is completely obvious. --P64 (talk) 18:04, 22 April 2011 (UTC)
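Side by side, the pipe and colon styles of set-builder notation mentioned in the list above, next to the conditional-probability bar with which the pipe can be confused (illustrative LaTeX only):

```latex
\{\, x \in \mathbb{R} \mid x > 10 \,\}
\quad \text{versus} \quad
\{\, x \in \mathbb{R} : x > 10 \,\},
\qquad \text{cf.} \qquad
P(A \mid B)
```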


 * Coverage of Odds may be valuable in itself. --P64 (talk) 19:43, 23 April 2011 (UTC)

Odds
1.) Certainly a section on Odds "notation" may be valuable for this article, whether it be developed as an article on Probability notation or as an essay/guideline for wikipedia editors. --P64 (talk) 19:21, 23 April 2011 (UTC)

2.) Last week you asked at Wikipedia talk:WikiProject Statistics, and someone linked to a Stephen Stigler article that is reprinted on the web,
 * > "Does anyone here know who discovered Bayes' rule? (Posterior odds equals prior odds times likelihood ratio or Bayes' factor). Richard Gill (talk) 07:25, 15 April 2011 (UTC)

Based on the current article, you do mean specifically the understanding or expression in terms of odds. Here and elsewhere you have argued for that exposition of the Monty Hall problem, and you say here "I don't know who should be credited with that."

In this article you say,
 * > "I suspect this was post Bayes, post Laplace ... probably somewhere in the 20th century. Probably an Anglo-Saxon with a gambling addiction, since I don't think any other language has the concept odds.

Editions of wikipedia in other languages do suggest that many have simply adopted the English word. (Visit "Odds" in Edit mode and see the last fifteen lines of code.) But is it true that no others developed the same idea? --P64 (talk) 19:21, 23 April 2011 (UTC)


 * 2a.) Stigler attributes to Laplace 1774 the principle that posterior odds are proportional to likelihoods, and discusses the supposed equality of prior odds --in my use of modern terms. He quotes Laplace's principle as follows, translated himself.
 * Laplace 1774 from French, "If an event can be produced by a number n of different causes, then the probabilities of these causes given the event are to each other as the probabilities of the event given the causes ..."


 * 2b.) The concept of odds is not specifically English. Arguably all these 17th-18th century writers use the concept, only one of them writing in English. As far as I noticed in these two secondary sources, one has been translated into English using the word 'odds' (Huygens 1657, from Latin).
 * From F.N. David, Gods, Games, and Gambling (1998[1962]).
 * Pascal 1654 from French, attributes to M. le Chevalier de Mere the observation "If one undertakes to get a six with one die, the advantage in getting it in four throws is as 671 is to 625." Fermat and Pascal went on to handle the problem of points in terms of proportional chances to win a game and proportional division of stakes, such as 17:5:5 for three players.
 * Huygens 1657 from Latin, "He who throws and wagers for a six in 2 throws has odds therefore of 11:25 ...". Later, regarding a game with two players, "The ratio of our chances [to win] is therefore 31:30."
 * From Stephen M. Stigler, The History of Statistics (1986).
 * de Moivre 1733 from French, as presented by Stigler: "he tripped up at one point in converting these probabilities to odds. ... [In 1733] he gave the odds of exceeding the given limits correctly, as respectively, 28 to 13, 21 to 1, 792 to 1. ..." That de Moivre converted to odds from probabilities he evaluated by integration, such as 0.682688, implies to me that he deemed the odds concept relatively familiar to his readers.
 * Simpson 1755 from English, as presented by Stigler: Some probabilities are expressed as ratios, such as 16 to 20 or 8/10 to 1, from analysis of 36 cases in the toss of two dice.
 * Laplace 1786 from French, as presented by Stigler: Having twice calculated posterior probabilities near 1/410000, "he was content to describe the posterior odds as 'better than 400,000 to 1'."
 * --P64 (talk) 22:30, 25 April 2011 (UTC)

But what does it all mean?
Despite the best efforts of mathematicians to formalise probability, I think that we are no nearer to understanding what it means, or even what we mean by it, than we were centuries ago. Ignoring QM, I still think that the only real meaning of probability is that proposed by Bayes as a degree of knowledge.

You may ask why I have brought up this subject here and what relevance it has to notation. My point is this. Elementary probability is just elementary mathematics; there is nothing to argue about, and every question has just one answer, unless you want to attach some real-world meaning to the question and the answer. So, whilst I agree that probability notation is very useful, it cannot answer every question, or even any real-world question, for us.

Anyone who disagrees might like to give me an answer to this question: A goat is placed behind one of three doors numbered 1 to 3; what is the probability that the goat is behind door number 1?

Or if that is too hard, how about this question: A goat is placed with equal probability behind one of three doors numbered 1 to 3; what is the probability that the goat is behind door number 1? What exactly does your answer mean?

My point is that no amount of probability notation will help you answer those two questions. Martin Hogbin (talk) 15:34, 11 August 2011 (UTC)
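For what it is worth, the second question does have a definite numerical answer within its stated model, even if the meaning of that number is the real issue; a minimal simulation sketch (function name hypothetical, with the uniform placement assumption baked in) recovers the exact value 1/3:

```python
import random

def estimate_door1_probability(trials=100_000, seed=1):
    """Place a goat uniformly at random behind doors 1-3, `trials` times,
    and return the fraction of trials in which it lands behind door 1."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if rng.randint(1, 3) == 1)
    return hits / trials

print(estimate_door1_probability())  # close to the exact answer 1/3
```

The first question, with no stated distribution, gives the simulation nothing to work with: the code above only runs because the uniform assumption was written into it, which is exactly Martin's point.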