Talk:Geometric distribution

Definition of p
Shouldn't the success probability, p, be defined on the open interval (0,1)? For p=1 a few quantities are not defined, such as the median, skewness or excess kurtosis. fuzzy (talk) 08:57, 14 September 2015 (UTC)

Coupon Collector's Problem
We should mention the coupon collector's problem here, i.e. the average number of trials needed to complete a 'set of coupons' given a uniform underlying distribution of coupons.

The classic example came from cigarette cards or coupons. How many packs of cigarettes do you need to buy (on average) to collect all 5 movie stars? The answer is 11+5/12, assuming the underlying distribution of movie stars per pack is uniform.

This is a result of the geometric distribution: $$ \frac{5}{5} + \frac{5}{4} + \frac{5}{3} + \frac{5}{2} + \frac{5}{1} = 5\left(\frac{1}{5} + \frac{1}{4} + \frac{1}{3} + \frac{1}{2} + 1\right) = 11\tfrac{5}{12} $$

and is related to urn problems, the Poisson distribution and generating functions.
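A short Monte Carlo sketch can sanity-check the 11+5/12 figure (Python; the function name and trial count are mine, chosen arbitrarily):

```python
import random

def packs_to_complete_set(n_coupons, rng):
    """Draw coupon types uniformly until every type has appeared; return the draw count."""
    seen = set()
    draws = 0
    while len(seen) < n_coupons:
        seen.add(rng.randrange(n_coupons))
        draws += 1
    return draws

rng = random.Random(0)
trials = 100_000
avg = sum(packs_to_complete_set(5, rng) for _ in range(trials)) / trials
exact = 5 * sum(1 / k for k in range(1, 6))  # 5*(1/5 + 1/4 + 1/3 + 1/2 + 1) = 137/12
print(avg, exact)
```

The simulated average lands close to the exact value 137/12 ≈ 11.4167.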

I have been looking for a proof that the expected value of a geometric distribution is 1/p, and have been unable to find anything. Does anyone have a link or know the proof? THN

The simple proof
$$ E(X)= \sum_{k=1}^{\infty} kpq^{k-1} = \sum_{k=1}^{\infty} k(1-q)q^{k-1} = $$ $$ \sum_{k=1}^{\infty} kq^{k-1} - \sum_{k=1}^{\infty} kq^{k} = \sum_{k=0}^{\infty} q^{k} = \frac{1}{1-q} = \frac{1}{p} $$

–Gyorgy 11:40, Feb 02 2012 (EST)


 * This is wrong: $$\sum_{k=1}^{\infty} kpq^{k-1}$$ should read $$\sum_{k=1}^{\infty} kpq^k$$.131.111.184.8 (talk) 15:43, 24 May 2012 (UTC)


 * Horrific idiocy on my part - it's actually correct; $$\sum_{k=1}^{\infty} kpq^k$$ would be true if referring to Y.131.111.184.8 (talk) 15:49, 24 May 2012 (UTC)
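For any remaining doubters, the series in Gyorgy's proof can be checked numerically against the closed form (truncating the sum; p = 0.3 is arbitrary):

```python
p = 0.3
q = 1 - p
# E(X) = sum over k >= 1 of k * p * q^(k-1); terms beyond k ~ 2000 are negligible here
series = sum(k * p * q ** (k - 1) for k in range(1, 2000))
print(series, 1 / p)
```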

proof of expected value of geometric distribution
Source: Mitzenmacher and Upfal. Randomized Algorithms and Probabilistic Analysis: A First Course.

Lemma. Let X be a discrete random variable taking values in the nonnegative integers.


 * $$E[X]$$
 * $$=\sum_{i=0}^\infty i\Pr[X=i]$$
 * $$=\sum_{i=0}^\infty i(\Pr[X\geq i]-\Pr[X\geq i+1])$$
 * $$=\sum_{i=1}^\infty \Pr[X\geq i]$$

Corollary. If X is a geometric random variable, then:


 * $$E[X]$$
 * $$=\sum_{i=1}^\infty \Pr[X\geq i]$$
 * $$=\sum_{i=1}^\infty\sum_{n=i}^\infty(1-p)^{n-1}p$$
 * $$=\sum_{i=1}^\infty(1-p)^{i-1}$$
 * $$=\frac{1}{1-(1-p)}$$

QED. =)

No time right now to put this into the article, I'm afraid. Can someone else do it? –Matt 10:46, 19 Jun 2004 (UTC)
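The tail-sum identity in the lemma is easy to check numerically as well (truncated series; p = 0.25 is arbitrary):

```python
p = 0.25
N = 2000
# pmf of X on {1, 2, ...}; index 0 is padding so that pmf[k] = Pr[X = k]
pmf = [0.0] + [(1 - p) ** (k - 1) * p for k in range(1, N)]
mean_direct = sum(k * pmf[k] for k in range(N))
# Pr[X >= i] = (1-p)^(i-1) for the geometric distribution on {1, 2, ...}
tail_sum = sum((1 - p) ** (i - 1) for i in range(1, N))
print(mean_direct, tail_sum, 1 / p)
```

Both the direct mean and the sum of tail probabilities agree with 1/p.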

The present proof justifies interchange of sum and differentiation by uniform convergence. However, uniform convergence does not imply convergence of the derivatives. The justification is wrong. However, it is true that for power series, you can differentiate inside the sum when you are in the interior of the disk of convergence. —Preceding unsigned comment added by 140.180.13.173 (talk) 19:40, 30 September 2010 (UTC)


 * So it's not exactly a telescoping sum, but the way to get from
 * $$\sum_{i=0}^\infty i(\Pr[X\geq i]-\Pr[X\geq i+1])$$
 * to
 * $$\sum_{i=1}^\infty \Pr[X\geq i]$$
 * could be explained in a somewhat similar way. That explanation should be included. Michael Hardy (talk) 23:38, 30 September 2010 (UTC)
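For what it's worth, the correct justification mentioned above (term-by-term differentiation of a power series in the interior of its disk of convergence) gives the expected value in a couple of lines:

```latex
\begin{aligned}
\sum_{k=0}^{\infty} q^k &= \frac{1}{1-q}, \qquad |q| < 1,\\
\sum_{k=1}^{\infty} k q^{k-1}
  &= \frac{d}{dq}\,\frac{1}{1-q} = \frac{1}{(1-q)^2},\\
\operatorname{E}(X) = \sum_{k=1}^{\infty} k\,p\,q^{k-1}
  &= \frac{p}{(1-q)^2} = \frac{p}{p^2} = \frac{1}{p}.
\end{aligned}
```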

squares

 * (1 − p)/p².

Does anyone else see a square in place of the minus sign? –Matt 10:48, 19 Jun 2004 (UTC)

Graphs need to be redone
Geometric distribution is discrete, therefore it is determined by the probability mass function, not density. In MATLAB one should use stem to plot it. Consequently, the cdf of the geometric distribution is a piecewise constant right-continuous function, not piecewise linear. I may redo graphs myself next week. If someone else can do it sooner, that'd be great. PBH 17:01, 18 May 2006 (UTC)

You are right, my bad. I remade the graphs trying to correct the mistake. However, I didn't use stem as I think it looks bad and it makes the graph unclear when you have more than one function. If you think the graphs are correct now, please remove the notice. If you still don't like them, please tell me why. AdamSmithee 16:29, 18 May 2006 (UTC)

As far as stem is concerned, the problem is the choice of parameters. Pmf's of Geom(0.8) and Geom(0.2) coincide at 1. A better choice would be 0.2, 0.5, 0.7 or something like that. At least put filled dots at integer points. Also, personally I don't very much like the staircase look of the cdf. It does not display right-continuity well. There really ain't no "vertical" lines in a graph of any function. I'll probably make cleaner graphs for the Russian version. However, the graphs now are more acceptable, so I'll remove the notice. PBH 17:01, 18 May 2006 (UTC)


 * I'm not really that crazy about them, but unfortunately my graphic capabilities stop around here. If you can do something better, by all means replace them AdamSmithee 19:04, 18 May 2006 (UTC)

I've commented out the graphs because I could not edit the CAPTIONS to make them cease to be misleading. Here's the problem:


 * One graph shows the p.m.f. of Y and not of X;
 * The other shows the c.d.f. of X and not of Y.

But they're presented in a way that suggests they're both about the same probability distribution! This really needs to get fixed. The graphs should be made consistent with the information in the table. Michael Hardy 22:33, 1 June 2006 (UTC)

only ONE parameter
This family of distributions is parameterized by just one parameter, p. The argument n to the probability mass function is not such a parameter, i.e., we don't get a different distribution for each different value of n. I deleted the "n" from "parameters" in the table. Michael Hardy 00:54, 31 May 2006 (UTC)

derivation of moment generating function
This might be somewhat useful although it does not belong in the article. The process involves massaging the sum into the geometric series so it can be reduced further.

$$q=1-p$$

$$M_X(s)=E(e^{sX})=\sum_{k=1}^\infty{pq^{k-1}e^{sk}}=\sum_{m=0}^\infty{pq^{m}e^{s(m+1)}}=pe^s\sum_{m=0}^\infty{(qe^s)^m}=\frac{pe^s}{1-qe^s}$$

Iownatv 09:01, 20 January 2007 (UTC)
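A numerical spot check of the closed form above (truncated series; p = 0.4 and s = 0.1 are arbitrary, chosen so that qe^s < 1 as convergence requires):

```python
import math

p, s = 0.4, 0.1
q = 1 - p
assert q * math.exp(s) < 1  # region of convergence for the geometric series
series = sum(p * q ** (k - 1) * math.exp(s * k) for k in range(1, 2000))
closed = p * math.exp(s) / (1 - q * math.exp(s))
print(series, closed)
```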

What are the subscripts for?
Why the two subscripts in p1 and p0? Is there a reason not to call them both just p? It seems as if user:MarkSweep introduced them and uses one when talking about X, the number of trials needed to get one success, and the other when talking about Y, the number of failures before the first success. But the difference between the two random variables does not correspond to any change in p. On the contrary, in some contexts it may be important that p does not change. Michael Hardy 02:44, 6 May 2007 (UTC)


 * ... and now I've changed it. Michael Hardy 02:50, 6 May 2007 (UTC)

Characteristic function is wrong?
I believe the characteristic function should be $$\frac{pe^{it}}{1-qe^{it}}$$

60.242.107.81 11:48, 30 May 2007 (UTC) Dmitry Kamenetsky


 * (I've moved this to the end of the page where new sections are traditionally added). Mathworld agrees that there is no $$e^{it}$$ in the numerator. Any particular reason to believe there should be? digfarenough (talk) 13:02, 31 May 2007 (UTC)

Digfarenough, you are wrong. MathWorld does not say that. You did not read carefully.

TWO DIFFERENT geometric distributions are considered:
 * ONE supported on the set {0, 1, 2, 3, ...}, whose characteristic function is
 * $$ \frac{1-q}{1-qe^{it}}, $$
 * and ANOTHER, supported on the set {1, 2, 3, ...} whose characteristic function is
 * $$\frac{pe^{it}}{1-qe^{it}}.$$
 * Sigh.....
 * I keep coming back to this talk page, and the one at negative binomial distribution, and repeating this point over and over and over and over, again and again and again and again....
 * This article is very explicit in saying right at the beginning that TWO DIFFERENT geometric distributions should not be confused with each other. Michael Hardy 19:52, 31 May 2007 (UTC)


 * Ah, you're right: the infobox on that page is about the distribution with support {1, 2, 3,...}; I didn't notice the mathworld page used the distribution with a 0 in its support. I suppose it would be too messy to give the quantities for both distributions there. I haven't seen any signs of you repeatedly mentioning this point on this talk page, though. That seems to have happened over at negative binomial distribution. I understand the frustration that sort of thing can cause. It could probably be avoided by specifically giving the quantities for each definition of the distributions somewhere in the article. digfarenough (talk) 22:44, 31 May 2007 (UTC)
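Both closed forms under discussion can be verified numerically against their defining series (arbitrary p and t; `cmath` handles the complex exponentials):

```python
import cmath

p, q, t = 0.3, 0.7, 0.8
# Geometric on {1, 2, 3, ...}: characteristic function p e^{it} / (1 - q e^{it})
cf1_series = sum(p * q ** (k - 1) * cmath.exp(1j * t * k) for k in range(1, 4000))
cf1_closed = p * cmath.exp(1j * t) / (1 - q * cmath.exp(1j * t))
# Geometric on {0, 1, 2, ...}: characteristic function p / (1 - q e^{it})
cf0_series = sum(p * q ** k * cmath.exp(1j * t * k) for k in range(4000))
cf0_closed = p / (1 - q * cmath.exp(1j * t))
print(abs(cf1_series - cf1_closed), abs(cf0_series - cf0_closed))
```

The two closed forms differ by exactly the factor e^{it}, reflecting the unit shift in support.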

Confusing Info Box
The information box is confusing. In some places it shows two versions of the geometric distribution (e.g. support, CDF). In other places it shows only one version (mean, characteristic function). It needs to be cleaned up. My preference is to show only one version of the distribution. Steve8675309 03:17, 3 August 2007 (UTC)

Wrong Mean
Am I wrong or is the expected value / mean of a geometrically distributed random variable 1/p not (1-p)/p? What's up with that? —Preceding unsigned comment added by 209.244.152.96 (talk) 19:37, 14 November 2007 (UTC)
 * This article assumes the distribution is over the positive numbers. You probably learned the mean over the nonnegatives, for which the pmf is $$(1-p)^k p$$ and the mean thus one less, or 1/p. Calbaer 00:47, 15 November 2007 (UTC)


 * No you guys have it backwards, for non negative numbers the mean is q/p and for positive numbers the mean is 1/p. This article has the pmf and k defined for positive numbers so the mean should be 1/p.  I really don't feel like posting a proof in here so would someone confirm this for me.  I have the proof though (for both k defined over non-negatives and over positives) so if no one can get this straight I'll post both.  —Preceding unsigned comment added by 209.244.152.96 (talk) 18:45, 20 November 2007 (UTC)
 * Okay - sorry. I looked quickly at the article and assumed it had the correct one of the two forms; the way your question was phrased, I assumed you were merely unaware of the two forms. If you look above on the talk page, you'll see that there's much debate and confusion about the two forms, and the whole reason the wrong mean was given was the incomplete expansion from one form to two and back again. The combined version was too confusing for one infobox, though it might be a good idea to have it presented with two infoboxes. However, I'm unaware of an article with two infoboxes of the same type, although there are plenty with infoboxes of two types, as discussed at Wikipedia talk:WikiProject Infoboxes. Calbaer (talk) 19:30, 20 November 2007 (UTC)

I've commented out the table
WHAT A MESS!! Someone just above this comment is AGAIN saying it's the "wrong mean", etc. etc. etc. etc. etc. etc. etc. etc.

Sigh.

Do we have to dumb this article down again????

Why is it so hard to get this simple point across?

There's the geometric distribution on the set {0, 1, 2, ...}, starting at 0, not at 1.

And then there's the geometric distribution on the set {1, 2, 3, ...}, starting at 1, not at 0.

They have different means.

They have different mgf's.

And so on.

(They both have the same variance.)

So someone writes out the table for ONE of those two choices.

Then some doofus ignores the choice between starting at 0 and starting at 1 and says "But this is wrong! My textbook says..." etc. His textbook is right, but he is wrong, since the textbook paid attention to whether it starts at 0 or at 1, and he ignored it. It's a different person every couple of months doing the same thing in this same article.

It's tiresome. Michael Hardy (talk) 15:27, 15 February 2008 (UTC)
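Since the same confusion keeps recurring, here is a direct numerical comparison of the two conventions (truncated series; p = 0.2 is arbitrary):

```python
p = 0.2
N = 5000
# X on {1, 2, ...}: trials up to and including the first success; pmf p(1-p)^(k-1)
mean_X = sum(k * (1 - p) ** (k - 1) * p for k in range(1, N))
# Y = X - 1 on {0, 1, 2, ...}: failures before the first success; pmf p(1-p)^k
mean_Y = sum(k * (1 - p) ** k * p for k in range(N))
print(mean_X, mean_Y)  # exact values are 1/p = 5 and (1-p)/p = 4
```

Both textbook answers are right; they just describe different random variables, whose means differ by exactly 1.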

I've added a sketch proof for the expected value. Maybe the proof can sort out the suspicion. Personally I think it would be nice if somebody could put up a proof for the variance as well. (Tobias) July 2008 —Preceding unsigned comment added by 85.24.188.87 (talk) 18:29, 20 July 2008 (UTC)


 * Completely agree with how tiresome it is. I would go further and suggest that the Geometric Distribution should be defined as the number of successes achieved before the first failure, that is: $$\Pr(Y=k) = (1 - p)\,p^k\,$$ defined on $$\left\{{0, 1, 2, 3, \ldots}\right\}$$. The "shifted geometric distribution" is what the other one could be called; I agree there is no consistency in the literature.


 * The distributions can conveniently be labelled $$G_0 \left({p}\right)$$ for the $$(1 - p)\,p^k$$ and $$G_1 \left({p}\right)$$ for the $$(1 - p)^{k-1}\,p$$ version. This is how the subject is treated in M343: "Applications of Probability", Open University 1997. Having said that, I have not encountered a work which is consistently good and accurate anywhere - although the abovementioned is incisive and usefully enlightening in most places, there are unfortunate lapses. Does anyone know of a good text? My Grimmett and Welsh "Probability an Introduction" leaves too many gaps. --WestwoodMatt (talk) 22:56, 21 March 2010 (UTC)

The table
If the table has to be based on {0,1,2,...} or on {1,2,3,...} rather than including both, we need to make the warnings more conspicuous so we don't keep going over this same issue EVERY time YET ANOTHER careless person comes along and says "But my textbook says..." but ignores the choice between {0,1,2,...} and {1,2,3,...}.

I lean toward {0,1,2,...} since that embeds the geometric distribution within a family of infinitely divisible negative binomial distributions.

But there is also something to be said for {1,2,3,...}. The temptation may be to dismiss this as a triviality, but I think that would be foolish. Say you throw a die repeatedly until a "1" appears. On average, how many times do you throw it? On this question, everyone ranging from those who never think about mathematics to the most sophisticated mathematician instantly responds with the same answer, and it is correct. Contrast that with the relative intricacy of the actual computation of the average. Our formalisms are failing to capture something here. Michael Hardy (talk) 15:42, 15 February 2008 (UTC)


 * Is there a reason why the table cannot have two columns (three total), one for each support? This would allow readers to see quickly that there are two different (and acceptable) ways to represent the distribution.
 * If only one support can be used in the table, the logical choice would be to use that same support as the negative binomial distribution since the geometric is often referred to as a special case of the negative binomial distribution. The last versions of the geometric distribution table did not have 0 in the support, while the negative binomial currently does.  It is confusing to set r=1 for the negative binomial distribution, then go to the geometric page and have the mean not include 1-p in the numerator. Same goes for $$e^{t}$$ in the moment generating function. MichaelRutter (talk) 03:21, 16 February 2008 (UTC)

I'd prefer including both. (I assume you meant two columns (unless I'm misunderstanding something)). I think a year or two ago someone said on this page he didn't like that, and it hasn't gone any further. I'm not good at formatting tables, so if you are, then go ahead and do it. Michael Hardy (talk) 17:52, 16 February 2008 (UTC)


 * I created a new template called Template:Infobox probability distribution 2. This has a second set of fields for the alternate form of the distribution. If people don't like this, it is an easy edit to delete the second set of fields and revert to Template:Infobox probability distribution. I am unsure of the median and entropy of the version with 0 in the support. I suspect the entropy is the same, but they can be filled in later when verified. I also cleaned up the characteristic function a bit in an effort to keep the table narrow. Wish we could do the same thing with the median, but that may be difficult. MichaelRutter (talk) 18:29, 17 February 2008 (UTC)

Michael's done a great job with the two-column table. I can see that having two of each of the graphs would be a little OTT, however they should obviously be consistent. At the moment though the top one (mass function) is for the support {0,1,...} whereas the second (cumulative dist fn) is for {1,2,...}. I'm not able to produce a new good-looking graph, but if someone could drop in a replacement for either that would be great. Quietbritishjim (talk) 09:56, 9 June 2008 (UTC)


 * I created and uploaded some new figures for the graphs. I created all four images, so we can pick and choose which ones to keep.  —Preceding unsigned comment added by Lscharen (talk • contribs) 20:33, 15 September 2008 (UTC)

Fisher's Information?
Is there a reason that there is no Fisher's Information in the infobox? Jamesmcmahon0 (talk) 12:30, 11 December 2013 (UTC)

Arrangement of Article
Looking at the current location of examples on this page, it is difficult to understand the examples when the context of the example has not been explained. I think it would be easier to understand if, for example, expected mean and how to calculate it is explained and then an example is shown of that calculation. Per the Manual of Style/Mathematics, this also states that an introduction should be first and I think this applies to each section within the article (definition and then examples). --MLicon (talk) 18:05, 13 November 2019 (UTC)


 * Well, the appropriate policy here is WP:SOFIXIT. That said, I think the way the example situations are listed along with calculations of each type for each one is kind of weird. I'm not sure I'm completely following what you're suggesting though with respect to introduction being first, etc. –Deacon Vorbis (carbon • videos) 18:20, 13 November 2019 (UTC)
 * Agree. This article is pretty confusing, and it also switches between the two different functions rather arbitrarily. ProcrastinatingReader (talk) 10:04, 11 June 2021 (UTC)

Higher-order moments
I may be mistaken, but I think the section on higher-order moments needs a clarification. The derivation goes



$$\begin{align} \mathrm{E}(Y^n) & {} =\sum_{k=0}^\infty (1-p)^k p\cdot k^n \\ & {} =p \operatorname{Li}_{-n}(1-p) \end{align}$$

I think the use of $$ p \operatorname{Li}_{-n}(1-p) $$ fails when $$ n = 0 $$, since this expression evaluates to $$ 1 - p $$ rather than 1. The infinite sum correctly evaluates to 1 when $$n=0$$. 94.8.242.146 (talk) 12:39, 7 May 2023 (UTC)


 * Note that this arises because $$\operatorname{Li}_s(z) = \sum_{k=1}^\infty {z^k \over k^s}$$, with the sum ranging over $$k$$ from 1 to $$\infty$$ and not from 0 to $$\infty$$, as required by the formula for the moment.
 * For any $$n\neq0$$, $${(1-p)^k p \cdot k^n}$$ is zero when $$k=0$$, hence the use of the polylogarithm for $$n>0$$ is justified because the missing $$k=0$$ term from the sum just evaluates to zero. But when $$n=0$$, the substitution of the polylogarithm fails. I will add a clarification. 94.8.242.146 (talk) 12:13, 13 May 2023 (UTC)
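The point about the missing k = 0 term is easy to demonstrate numerically (truncated series; p = 0.4 is arbitrary; note that Python evaluates 0**0 as 1, which is exactly the convention the moment sum uses):

```python
p = 0.4
q = 1 - p

def moment_series(n, terms=2000):
    # E[Y^n] for Y on {0, 1, 2, ...}; the k = 0 term contributes p when n = 0
    return sum(q ** k * p * k ** n for k in range(terms))

def p_times_polylog(n, terms=2000):
    # p * Li_{-n}(q), with the polylogarithm series starting at k = 1
    return p * sum(q ** k * k ** n for k in range(1, terms))

print(moment_series(1), p_times_polylog(1))  # agree when n >= 1
print(moment_series(0), p_times_polylog(0))  # 1 versus 1 - p: they differ
```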

Some planned improvements
Hi! I recently read this article and I wanted to improve it, but I also wanted to sanity check my planned edits to make sure I'm not being disruptive.

I think the Definitions section seems a bit lengthy. The examples don't seem to provide much to a reader's understanding, and I was planning on axing them. I think the section should really be three to five sentences describing the distribution and its PMF along with the short example in the lede.

I also think the subsections on its expected value might deserve some cutting. The short proofs that show E(X) = 1/p and E(Y) = (1-p)/p in one line are nice, but the longer one for E(Y) doesn't seem to add much? Also, I feel the example for expected value could be shorter and use classic coin flips or dice throws instead of kidney donors.

One of the more difficult and possibly controversial(?) things I was thinking of doing is trying to make references of X and Y into explicitly writing it as the number of trials and the number of failures respectively. This is so a reader that skips the introduction because they are already acquainted with the distribution would still understand which definition of the Geometric distribution is being used and wouldn't make false assumptions about what X and Y are. This one would be harder than the rest to write, so I may not follow through with this.

Let me know what you think! I'm super open to feedback and discussing more. Moon motif (talk) 00:16, 3 June 2024 (UTC)


 * Just regarding the longer proof, I like putting these in a "proof" folding box.
 * You can see the article on design effect for examples. As for the rest, no clue - I'll let others review it :) Tal Galili (talk) 05:38, 3 June 2024 (UTC)