Wikipedia:Reference desk/Archives/Mathematics/2008 October 28

= October 28 =

Combinatorics problems
There are n partygoers. At the door, each of them will get a mask that's printed by a computer. The computer chooses the design for each mask independently and uniformly at random, from a collection of m, where m < n.

I'm interested in calculating: (1) the probability that exactly k of the partygoers get a mask that nobody else has, and (2) the expected number of partygoers getting a mask that nobody else has?

Is there a simple way to express the above without using iterated sums? —Preceding unsigned comment added by 72.94.50.163 (talk) 00:47, 28 October 2008 (UTC)


 * Well the second one I think is easier. Since expectation is linear, you should be able to just multiply n by the probability that any one partygoer will get a mask that nobody else has. --169.232.233.134 (talk) 04:25, 28 October 2008 (UTC)
 * Doesn't that assume each partygoer's mask's uniqueness is independent? If so, that isn't the case here. If the first m-1 partygoers all have unique masks, then the others must have the same mask. --Tango (talk) 11:36, 28 October 2008 (UTC)
 * No, expectation is linear, without any independence or such. Thus the expected number of partygoers with a mask no-one else has is $$n \left ( \frac{m-1}{m} \right )^{n-1} $$. Algebraist 11:42, 28 October 2008 (UTC)

As to your first question, here are some hints. Let's denote the set of the first $$\textstyle m$$ positive integers (that we use to label the masks) just by $$\textstyle[m]:=\{1,2,\cdots,m\}$$. Let's say that any $$\textstyle x $$ in a string $$\textstyle (x_1,x_2,\cdots,x_n)\in [m]^n$$ is a hapax  entry in this string, if it occurs exactly once in it (for example, 3 and 4 are hapax entries in (1,2,3,1,1,2,4) whereas 1 and 2 are not). So your question is, how many strings in $$\textstyle[m]^n$$ have exactly k hapax entries, out of the total number of strings $$\textstyle m^n$$ ? Your probability is, of course, this number, say $$\textstyle A(k,m,n)$$, divided by $$\textstyle m^n$$. The first thing to do should be: getting rid of the variable  k, and reducing the problem to the case k=0. In fact, a string in $$\textstyle[m]^n$$ with exactly k hapax entries is determined choosing: 1) the k-subset of [n] corresponding to the positions of its hapax entries, 2) an ordered list of the k entries to be hapax, and 3) the complement string, which has to be a generic hapax-free string of lenght (n-k) with entries choosen from a set of (m-k). This amounts to say that
 * $$A(k,m,n)=\frac{n!\ m!}{k!(n-k)!(m-k)!}\ A(0,m-k,n-k)$$.

Thus the question is: how many hapax-free strings are there in $$\textstyle[m]^n$$? Look at the complement: the set of non hapax-free strings: it's the union of the n sets $$\textstyle X_1, X_2,\cdots, X_n$$, where $$\textstyle X_i$$ is the set of all strings whose i-th entry (at least) is hapax, that is
 * $$X_i:=\{x\in [m]^n : x_j\neq x_i\ \forall j\neq i\} $$.

The union is not disjoint, of course, so you have to use the inclusion-exclusion formula; this means that you need to compute the cardinality of the sets $$\textstyle X_i$$, and also, the cardinality of their r-fold intersections, $$X_{i_1}\cap X_{i_2}\cdots \cap X_{i_r}$$. That however depends only on r, besides n and m, and does not seem to be too hard to compute. If you can get through the computation I am curious to see the final result :) (well, as usual there is no final result) --PMajer (talk) 16:40, 28 October 2008

You should have found
 * $$A(k,m,n)=\frac{n!\ m!}{k!}\sum_{r=0}^{n-k} (-1)^{r}\frac{ (m-k-r)^{n-k-r} }{r!\ (n-k-r)!\ (m-k-r)!} $$.

It is instructive to consider the generating function of the numbers $$\textstyle A(k,n,m)$$, that is
 * $$\alpha(x,y,z):=\sum_{k,n,m} A(k,n,m)\ x^k\ \frac {y^m}{m!}\ \frac{z^n}{n!}$$

(it is a generating series of ordinary form in x and of exponential form in y and  z: it is the best choice here). If you like manipulating power series, you can deduce from the above expression of the $$\textstyle A(k,n,m)$$ that, in fact
 * $$\textstyle \alpha(x,y,z)=\exp(y e^z-yz+xyz)$$.

Indeed, the most efficient way to find the numbers $$\textstyle A(k,n,m)$$ would have been from the other way around: first, write down their generating function (you need a little theory to do it quickly), then deduce the coefficients. To check the validity now we can compute the expected numbers of strings in $$\textstyle[m]^n$$ with exactly  k   hapax entries: that is the mean value $$\textstyle\frac{C(m,n)}{m^n}$$ where $$\textstyle C(m,n):=\sum_{k}k A(k,m,n) $$. Their generating function is easily deduced from $$\textstyle\alpha$$ :
 * $$\sum_{m,n}C(m,n)\ \frac{y^m}{m!}\frac{z^n}{n!}=\frac{\partial\alpha(x,y,z)}{\partial x}\Big|_{x=1}=yz \exp(y e^z),$$

that gives $$\textstyle C(m,n)=nm(m-1)^{n-1}$$, in accordance with the computation by Algebraist. From the generating function you can deduce as well all the other momenta of the distribution for the number of hapax. Moreover, you can find asymptotics of the coefficients, even without computing them explicitly. Last observation: everything holds with no restriction on n and m, as $$m<n$$ was never required. --PMajer (talk) 08:55, 29 October 2008 (UTC)


 * (From the OP, posting from another address) PMajer, I wasn't sure if m < n would make a difference. The questions came from observing a real-life situation, in which m < n. I don't really need the answers to the problems—it's more like a personal curiosity. It'll probably take me some time to follow your suggestions and try to work out the answer. If you have any progress in the meantime, feel free to post it—it won't spoil the fun for me :) Thanks to others who have posted their comments, too. --71.162.249.97 (talk) 12:25, 29 October 2008 (UTC)

Well, don't exhitate to ask, if you need further clarification. However, I've cleaned a couple of misprints and added some formulas. If you wish to try the computation of the generating function $$\textstyle\alpha$$ of the numbers $$\textstyle A(k,m,n)$$, it is perhalps useful to consider first the generating function of the $$\textstyle A(0,m,n)$$ and find:
 * $$\alpha(0,y,z):=\sum_{n,m} A(0,n,m)\ \frac {y^m}{m!}\ \frac{z^n}{n!}=\exp(y e^z-yz)$$.

You can deduce $$\textstyle\alpha(x,y,z)$$ from it using the relations stated at the beginning:
 * $$\frac{A(k,m,n)}{n!\ m!}=\frac {A(0,m-k,n-k)}{k!\ (m-k)!\ (n-k)!}$$,

that are translated into the equation $$\textstyle\alpha(x,y,z)=\alpha(0,y,z) e^{xyz} $$ between their generating series. --PMajer (talk) 13:46, 29 October 2008 (UTC)

Induction of derivative of exponential functions
Simply put, I'm trying to figure out how to prove by induction that given $$y=Ae^{kx}$$, A and k are constants, e is euler's number, that $$d^ny/dx^n = k^ny$$. I've proven $$P_1$$ is true, so by induction, I have to show that $$P_z$$ implies that $$P_{z+1}$$, but that gives me a problem. So I know that $$\frac{d^zy}{dx^z} = k^zy$$; trying to reach $$\frac{d^{z+1}y}{dx^{z+1}} = k^{z+1}y$$. I've gotten to here: $$\frac{d^zy}{dx^z} * \frac{d}{dx} = k(k^z)y$$; how could $$\frac{d}{dx}$$ be shown to equal k? I'm not sure about the notation of $$\frac{d}{dx}$$, is it different from $$\frac{dy}{dx}$$? My textbook says that this induction is correct. Recently edited to add formatting, sorry for the mixup! - Blooooo (talk) 03:28, 28 October 2008 (UTC)
 * How about $$\frac{d^{z+1}y}{dx^{z+1}} = \frac{d}{dx} \left( \frac{d^zy}{dx^z} \right) = \frac{d}{dx} (k^zy) = k^z \frac{d}{dx} y = k^z \cdot ky = k^{z+1}y$$ --169.232.233.134 (talk) 03:55, 28 October 2008 (UTC)
 * As for the difference between $$\frac{d}{dx}$$ and $$\frac{dy}{dx}$$, the former is the differentiation operator, while the latter is the result of that operator operating on $$y$$. The result, that is the derivative, can also be written $$\frac{d}{dx} y$$ or $$\frac{d}{dx} (y)$$. -- Jao (talk) 14:53, 28 October 2008 (UTC)

Combinatorial problem
Suppose I'm given a small collection of $$m$$ vectors with positive entries in $$\mathbb{R}^n$$, where $$m \geq n$$. The goal is to choose precisely $$n$$ vectors $$v_1,\ldots,v_n$$ to maximize
 * $$\sum v_i \cdot e_i$$

where $$e_i$$ is the standard $$i$$th basis vector.

That is to say, I want to sum n different entries in n different vectors, choosing each coordinate only once, and maximize that sum. Does this problem break down easily? I know it has an instantiation as a 0-1 integer programming problem, but there's so much more structure here that I was hoping there would be a better way. Thanks, RayAYang (talk) 04:19, 28 October 2008 (UTC)


 * It seems to be an assignment problem. Make a bipartite graph with a white vertex for each of the m vectors and a black vertex for each of the n coordinate positions. The edge from vector v to coordinate position i has weight v . ei.  Now find a maximum matching from the coordinate positions to the vectors. McKay (talk) 09:00, 28 October 2008 (UTC)
 * Thanks! RayAYang (talk) 14:38, 28 October 2008 (UTC)


 * It sounds awfully like a variant of the Knapsack problem to me. I'd be surprised if you get an efficient solution. Dmcq (talk) 18:30, 28 October 2008 (UTC)

Quick question about solving for factors
Is there a way to solve for x in an un-factorable quadratic equation, such that a non-prime integer z is a factor, without using factorization? For example, solving x^2+x+3 such that it results in a number with a factor of 33 but without knowing that 33=3*11. 128.227.195.92 (talk) 14:19, 28 October 2008 (UTC)poster


 * I don't understand the question. We don't usually consider integers to be factors of polynomials, however if you do, then any integer is a factor (assuming you allow rational coefficients). Also, "x^2+x+3" can't be solved since it isn't an equation, do you mean "x^2+x+3=0"? If so, what do you mean by "results in a number"? None of the roots of that polynomial are multiples of 33 (at least, not integer multiples). Or do you mean to solve x^2+x+3=33k, for some integer k? If you want to do that, just take the 33k over to the other side and use the quadratic formula. If you want the x to be an integer (or, at least, rational) you need to find a k such that the discriminant (b2-4ac=1-4(3-33k)) is a perfect square, however that's impossible in this example since it's always going to be negative resulting in a complex solution. I don't see how knowing the prime factorisation of 33 would help. --Tango (talk) 15:16, 28 October 2008 (UTC)


 * Discriminant is 132k&minus;11. k = 1 gives a square discriminant, as does k = 111. Gandalf61 (talk) 15:39, 28 October 2008 (UTC)
 * You mean -1*-1 doesn't equal -1? Say it ain't so! --Tango (talk) 15:50, 28 October 2008 (UTC)


 * Use the quadratic equation. —Preceding unsigned comment added by Yakeyglee (talk • contribs) 00:01, 29 October 2008 (UTC)