Wikipedia:Reference desk/Archives/Mathematics/2010 June 10

= June 10 =

Dense open set in the reals with finite measure
This is probably a well-known construction; I'm sure I've seen it online before, although I can't find it now.

Let $$a_1, a_2, a_3...\,\!$$ be a bijection between the natural numbers and rational numbers, and let $$\sum_{k=1}^\infty \alpha_k\,\!$$ be a convergent series of positive real numbers. Then the open set $$\bigcup_{k=1}^\infty (a_k-\alpha_k, a_k+\alpha_k)\,\!$$ has a measure bounded by $$2\sum_{k=1}^\infty \alpha_k\,\!$$.

Needless to say, this is pretty heavily counter-intuitive. This set is open, and it is dense in the real numbers (specifically, it includes every rational number), so it would seem to have to include everything, and yet it has finite measure, so almost all real numbers are excluded from the set. So my question is: how can I gain an intuitive grasp for how and why this is possible? How do I extend my world-view to allow for this strange notion? --COVIZAPIBETEFOKY (talk) 00:34, 10 June 2010 (UTC)
 * Well, this is just the difference between measure and category. I'm not sure whether you understand it or (per von Neumann) "just get used to it", but for just getting used to it, it might help to have more natural examples.  The set of normal numbers is almost everything from a measure point of view, but almost nothing (it's a meager set) from the point of view of category.  One way to think about the difference is, suppose you were to generate a real number with a loaded 10-sided die, rather than a fair one?  Almost surely, it would not be normal.  But from a category point of view, it doesn't matter much if the die is loaded. --Trovatore (talk) 00:48, 10 June 2010 (UTC)


 * The motivation behind this construction is subsumed within the simple fact that the set of all rational numbers has measure zero. The open set is constructed so that it is "not too much bigger than the rationals"; specifically, we take the union of small open intervals, each containing a rational number, and let the lengths of these intervals tend to zero rapidly enough that their sum converges (as we recall from analysis, the terms of a series tending to zero is necessary but not sufficient for the series to converge). We still obtain a set that is dense (it contains every rational number), but it is not "that much bigger" than the rational numbers. PS  T  03:41, 10 June 2010 (UTC)


 * Every interval in your $$\bigcup$$ union has measure exactly $$2 \alpha_k$$, so the measure of the union is no bigger than 2S, twice the sum S of the series (provided the series converges to S). If you move those intervals along the real number axis to make them adjacent, you would get a line segment 2S long. This is quite intuitive, right? And as there are (countably) infinitely many intervals in the union, you can shift them so that every rational number gets covered by one of them. And this is also quite intuitive. What is counter-intuitive here is the equipollence of the sets of natural and rational numbers. --CiaPan (talk) 06:11, 10 June 2010 (UTC)
 * Hmm, no, I think I'm going to have to insist that it's the measure-v-category thing. Even if most people don't think they have any intuitions about category to violate.  Turns out they do!
 * Category is very revelatory of what's going on with normal numbers. I recall someone remarking on sci.math about the difficulty of showing that particular numbers are normal, with a comment about how hard it is "to find a piece of hay in a haystack".
 * But actually, from a category point of view, the non-normal numbers are the hay. Which I argue is exactly why it's hard to zero in on specific normal numbers.  The player trying to pick a non-normal number has a winning strategy in the Banach–Mazur game, where the players pick descending open sets and the outcome is determined by the intersection of all of them. --Trovatore (talk) 07:09, 10 June 2010 (UTC)
 * I don't know if this will apply to the level at which you are considering this, but a common naive way to believe that your set is the whole real line is to think of open sets as "uniformly open". That is, we know that if $$x \in A$$ then for some $$\epsilon>0,\ (x-\epsilon,x+\epsilon)\subseteq A$$. If at some level you feel that the same ε works for every x, the conclusion will indeed be $$A=\mathbb{R}$$. I think the more comfortable you make yourself with the fact that it's a different ε for every x, the less you'll find $$A=\mathbb{R}$$ intuitive. -- Meni Rosenfeld (talk) 08:28, 10 June 2010 (UTC)


 * I think Meni's comment is very apropos, and it's the main reason we're tempted to think that the set you constructed is the whole real line. I think I may be able to say two things to elaborate on it.


 * First: you don't need complicated measure theory to show that the set you constructed is not the whole real line. Let $$\infty > a = \sum \alpha_k$$. If you start with a large enough interval [0,x] so that x > 2a, then compactness shows that if the set you constructed were to cover [0,x], finitely many of its open intervals would already have to cover it. That's impossible because the lengths of all the open intervals sum to 2a, so no partial sum of these lengths can exceed 2a, which is less than x. This argument could be done using only elementary real analysis.


 * Once you accept that the set you make does not cover the real line, the second thing is that the complement has infinite measure. The counterintuitive point here is that a set can have positive measure without containing any intervals. This is usually shown by constructing a "fat Cantor set", but your construction works perfectly well, if you take the complement of the set you constructed. The measure of an open set does correspond to the sum of the lengths of its connected components, but for arbitrary sets measure is not like length. &mdash; Carl (CBM · talk) 11:51, 10 June 2010 (UTC)

Thanks, Carl, although I actually already understood both of those things. That certainly explains why this set is possible, although it's not really very satisfying as far as capturing an intuitive notion. I'll probably go ahead and trust von Neumann on this one.

Trovatore, can you explain your measure-vs-category thing more clearly? It doesn't seem to me like you're referring to Category theory, although I may just be missing a subtle connection. Keep in mind I haven't had any formal instruction in any of this; maybe it would be better to wait until I do. I still find it more fun to look up online though. :P --COVIZAPIBETEFOKY (talk) 12:28, 10 June 2010 (UTC)
 * Are you familiar with such notions as "Meagre set", "Baire space" and "Baire category theorem"? Topologists sometimes speak of "a space of the first category" or "a space of the second category". A (topological) space X is said to be of the first category (according to the terminology originally used by R. Baire) if it equals the union of a countable collection of closed sets having empty interiors. Otherwise, it is said to be of the second category. Thus, for example, one can say that a space X is a Baire space if and only if every nonempty open set in X is of the second category.
 * Do you mean that you have not had formal training in (axiomatic) set theory or in measure theory? If the latter, you might like the book Measure Theory by Paul Halmos; it is one of the classic texts in measure theory, and is fairly comprehensive (though possibly somewhat outdated). PS  T  13:20, 10 June 2010 (UTC)


 * I think the above answers are quite exhaustive; however, I would add that in order to visualize the thing, you have to take into account both the order of the rationals given by the enumeration and their natural order in the real line. One should perhaps imagine these rationals with their intervals like stars in the sky, so that when you zoom in (as if you were watching them from a spaceship travelling away from the center of the galaxy), new ones appear here and there, say $$a_1, a_2, a_3...\,\!$$, in the order of the list, but they have smaller and smaller circles of light around them, and the sky appears after all darker and darker.    --pm a  21:27, 10 June 2010 (UTC)

Thanks, guys. I'll see if I can get my hands on that book, PS  T. --COVIZAPIBETEFOKY (talk) 12:34, 12 June 2010 (UTC)


 * On that business about the night sky, Olbers' paradox is that it should be bright. The Big Bang and cosmic microwave background explain it, but one possible explanation could have been a fractal distribution of stars, which could be made to give a distribution of brightness like pma was saying. Dmcq (talk) 15:54, 12 June 2010 (UTC)
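
The finite stages of the construction in the question can be checked by direct computation. The following is only a sketch under stated assumptions: the enumeration of the rationals (the Calkin–Wilf sequence, interleaved with negatives and zero) and the choice $$\alpha_k = 2^{-k}$$ are illustrative choices not made in the thread, and the helper names `rationals` and `union_length` are hypothetical. It merges the first N intervals exactly and confirms that their total length never exceeds $$2\sum \alpha_k$$, which is the bound used in the compactness argument above.

```python
from fractions import Fraction
from itertools import islice


def rationals():
    """Enumerate every rational exactly once (assumed enumeration:
    0, then the Calkin-Wilf sequence interleaved with its negatives)."""
    yield Fraction(0)
    q = Fraction(1)
    while True:
        yield q
        yield -q
        # Newman's recurrence for the next Calkin-Wilf term
        q = 1 / (2 * (q.numerator // q.denominator) - q + 1)


def union_length(intervals):
    """Exact total length of a finite union of intervals (lo, hi)."""
    total = Fraction(0)
    cur_lo = cur_hi = None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:
            # Disjoint from the current block: close it and start a new one
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            # Overlapping: extend the current block
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total


N = 200
centers = list(islice(rationals(), N))
alphas = [Fraction(1, 2 ** k) for k in range(1, N + 1)]  # assumed alpha_k = 2^-k
intervals = [(a - r, a + r) for a, r in zip(centers, alphas)]

covered = union_length(intervals)
bound = 2 * sum(alphas)
print(float(covered), float(bound))
```

With this choice the bound is below 2 no matter how large N is, so the union can never cover an interval such as [0, 3], even though every rational in it is eventually the centre of some interval.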

how to integrate this function?
I am a high school student and I am asking for help: how do I integrate this function, f(x)=sin(x)+1, indefinitly? Please. —Preceding unsigned comment added by 81.25.140.6 (talk) 12:20, 10 June 2010 (UTC)
 * There are two ways of doing it: numerically and analytically. Numerically you compute a huge table of values of f(x) for x = 0.0005, 0.0015, 0.0025, and so on; then the integral $$\int_0^x f(t)dt$$ is approximately equal to the finite sum $$\sum_{n=0}^{1000x-1} f(0.0005+0.001n)\cdot 0.001$$ (for x a multiple of 0.001). Analytically you manipulate the differential expression $$f(x)dx=(\sin(x)+1)dx=\sin(x)dx+dx+dC=d(-\cos(x))+dx+dC=d(-\cos(x)+x+C)$$. Then you use that $$\scriptstyle \int d(F(x))=F(x)$$ (including the constant C). Bo Jacoby (talk) 13:09, 10 June 2010 (UTC).
 * The OP mentions "indefinitly" (misspelled) in his post which suggests that he is looking for an antiderivative of the function f rather than the definite integral of f.
 * OP: Recall that a function F is an antiderivative of f (sometimes referred to as an "indefinite integral" of f), if the derivative of F equals f. You can check that in fact $$F(x)=-\cos(x)+x+C$$ has this property where C is an arbitrary constant. Using the integral symbol, we may write:


 * $$\int (\sin(x)+1)dx=\int \sin(x)dx + \int 1dx=-\cos(x)+x+C$$.


 * PS  T  13:30, 10 June 2010 (UTC)

Note that you can compute the Riemann sum exactly in this case. Count Iblis (talk) 23:10, 10 June 2010 (UTC)
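
The numerical and analytical routes described above can be compared directly. This is a minimal sketch, not anyone's posted method verbatim: the function names `f`, `F` and `midpoint_integral` are made up for illustration, and the step of 0.001 mirrors the table spacing suggested in the first reply. It evaluates the midpoint sum for $$\int_0^x (\sin(t)+1)dt$$ and compares it with $$F(x)-F(0)$$, where $$F(x)=-\cos(x)+x$$ is the antiderivative with C = 0.

```python
import math


def f(x):
    """The integrand from the question."""
    return math.sin(x) + 1.0


def F(x):
    """An antiderivative of f (the constant C is taken to be 0)."""
    return -math.cos(x) + x


def midpoint_integral(func, x, step=0.001):
    """Midpoint Riemann sum approximating the integral of func over [0, x], x > 0."""
    n = max(1, round(x / step))   # number of subintervals
    width = x / n                 # actual width, so the subintervals fill [0, x]
    return sum(func((i + 0.5) * width) for i in range(n)) * width


x = 2.0
numeric = midpoint_integral(f, x)
exact = F(x) - F(0.0)
print(numeric, exact, abs(numeric - exact))
```

The two values should agree to several decimal places, and differentiating F by hand gives back sin(x)+1, which is the check suggested in the second reply.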

Subset Sum Input Size
Why is the input size of subset sum measured as the number of elements in the set and not the size of the numbers in the set? In short, why would {-1, 2, 3} be considered the same size as {-100, 256, 385}? Or does it not really matter either way; namely, does specifying its size by the size of the integers involved not change whether an algorithm solving it is polytime or exptime? 67.163.183.146 (talk) 14:19, 10 June 2010 (UTC)
 * The relevant section of our article is Subset sum problem, but I can't quite follow the use of P (precision of the problem) there. My impression is that it is not related to approximate solutions, but I don't understand what a large N small P problem would look like. -- 58.147.55.151 (talk) 00:40, 11 June 2010 (UTC)
 * A problem with a very large N and small P would have lots of duplicate values. Dmcq (talk) 15:50, 11 June 2010 (UTC)

Combined distribution of Beta distributed variables
Hello,

I have 2 independent binary variables; X & Y.

The $$\theta$$ (prob = true) for each is unknown, and Beta distributed (with known parameters $$\alpha$$ and $$\beta$$).

I am interested in the distribution of the probability that both are true. I think this should be expressible as another Beta distribution, but I am unsure how to do this.

Thanks! Ironick (talk) 14:23, 10 June 2010 (UTC)
 * Let's write:
 * Given $$p_1$$, $$X_1 \sim \mathrm{Bernoulli}(p_1)\;\!$$
 * Given $$p_2$$, $$X_2 \sim \mathrm{Bernoulli}(p_2)\;\!$$
 * $$p_1 \sim \mathrm{Beta}(\alpha_1,\beta_1)\;\!$$
 * $$p_2 \sim \mathrm{Beta}(\alpha_2,\beta_2)\;\!$$
 * $$Z=X_1X_2\;\!$$
 * $$p_1$$ and $$p_2$$ are independent.
 * Given $$p_1$$ and $$p_2$$, $$X_1$$ and $$X_2$$ are independent.
 * So given $$p_1$$ and $$p_2$$, $$Z \sim\mathrm{Bernoulli}(q)\;\!$$ where $$q=p_1p_2$$.
 * Now the question is - what is the distribution of q? You have:
 * $$\mathrm{Pr}(q\le a) = \mathrm{Pr}(p_1p_2\le a) = \frac1{\mathrm{B}(\alpha_1,\beta_1)\mathrm{B}(\alpha_2,\beta_2)}\int_0^1\int_0^1I(p_1p_2\le a)p_1^{\alpha_1-1}(1-p_1)^{\beta_1-1}p_2^{\alpha_2-1}(1-p_2)^{\beta_2-1} \ dp_1\ dp_2$$
 * Figuring out what this is equal to may take a bit of work, but should be fairly straightforward. I suspect that it won't be a Beta distribution. -- Meni Rosenfeld (talk) 14:53, 10 June 2010 (UTC)
 * Actually, the integral might not be very easy. -- Meni Rosenfeld (talk) 15:13, 10 June 2010 (UTC)
 * Meni, you forgot the normalizing constant. Michael Hardy (talk) 17:34, 10 June 2010 (UTC)
 * Thanks, fixed. -- Meni Rosenfeld (talk) 09:10, 11 June 2010 (UTC)
 * The probability that X=1 is not unknown! It is the mean value of the Beta distribution $$\scriptstyle (\alpha,\beta)$$. (Pick first an outcome $$\scriptstyle \theta$$ from the beta distribution, and pick next X=1 with probability $$\scriptstyle \theta$$.) The probability that both are true is
 * $$\frac{\alpha_X }{\alpha_X +\beta_X}\frac{\alpha_Y}{\alpha_Y+\beta_Y}$$
 * Bo Jacoby (talk) 15:17, 10 June 2010 (UTC)


 * Thanks for the responses, (unfortunately) I am interested in the distribution of q rather than its expectation...
 * The integral does seem complex, is it simpler to compose a confidence interval? That would be a huge help. Ironick (talk) 15:34, 10 June 2010 (UTC)
 * Maybe. Since p1 and p2 are independent, the moments of q can be reduced to moments of the beta distribution: $$\mathbb{E}[q^m]=\mathbb{E}[p_1^m]\mathbb{E}[p_2^m]$$. There should be a way to find confidence intervals using these.
 * Depending on your application, it may be sufficient to calculate the integral numerically and use that to find the confidence intervals numerically. -- Meni Rosenfeld (talk) 17:04, 10 June 2010 (UTC)
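
The numerical route suggested above can be sketched with a simple simulation in place of evaluating the double integral. Everything specific here is an assumption for illustration: the parameter values (2, 3) and (4, 2), the sample size, and the helper names `beta_moment`, `simulate_q` and `interval` do not come from the thread. The sketch draws $$q=p_1p_2$$ many times, reads an equal-tailed 95% interval off the sorted samples, and checks the moment identity $$\mathbb{E}[q^m]=\mathbb{E}[p_1^m]\mathbb{E}[p_2^m]$$ using the standard beta moments $$\mathbb{E}[p^m]=\prod_{r=0}^{m-1}\frac{\alpha+r}{\alpha+\beta+r}$$.

```python
import random


def beta_moment(alpha, beta, m):
    """m-th raw moment of a Beta(alpha, beta) variable."""
    result = 1.0
    for r in range(m):
        result *= (alpha + r) / (alpha + beta + r)
    return result


def simulate_q(a1, b1, a2, b2, n=100_000, seed=0):
    """Sorted samples of q = p1 * p2 with independent beta-distributed p1, p2."""
    rng = random.Random(seed)
    return sorted(rng.betavariate(a1, b1) * rng.betavariate(a2, b2)
                  for _ in range(n))


def interval(samples, level=0.95):
    """Equal-tailed interval read off already-sorted samples."""
    n = len(samples)
    tail = (1.0 - level) / 2.0
    return samples[int(tail * n)], samples[int((1.0 - tail) * n) - 1]


# Hypothetical parameters, chosen only for the example
a1, b1, a2, b2 = 2.0, 3.0, 4.0, 2.0
samples = simulate_q(a1, b1, a2, b2)
lo, hi = interval(samples)
mean = sum(samples) / len(samples)
second = sum(s * s for s in samples) / len(samples)

print("95% interval for q:", lo, hi)
print("E[q]   simulated vs product of beta means:", mean,
      beta_moment(a1, b1, 1) * beta_moment(a2, b2, 1))
print("E[q^2] simulated vs product of second moments:", second,
      beta_moment(a1, b1, 2) * beta_moment(a2, b2, 2))
```

A simulation gives an interval only up to sampling error; for tighter answers the integral for $$\mathrm{Pr}(q\le a)$$ would have to be evaluated numerically instead.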

There is no distribution! The probability (using Meni's notation) that $$\scriptstyle X_1=1$$ (or X=true) is simply $$\scriptstyle\frac{\alpha_1 } { \alpha_1 +\beta_1 }$$ where $$\scriptstyle \alpha_1,\beta_1$$ are the known parameters of the beta distribution. Proof: The conditional probability that $$\scriptstyle X_1=1$$ when the parameter $$\scriptstyle \Theta_1$$ assumes the value $$\scriptstyle \theta_1$$ is simply $$\scriptstyle \theta_1$$. The probability that $$\scriptstyle \theta_1<\Theta_1<\theta_1+d\theta_1$$ is $$\scriptstyle f(\theta_1;\alpha_1,\beta_1)d\theta_1$$ where $$\scriptstyle f$$ is the probability density function of the beta distribution. The total probability that $$\scriptstyle X_1=1$$ is $$ \scriptstyle \int_0^1 \theta_1 f(\theta_1)d\theta_1$$ which is by definition the mean value. The value of the integral is listed in the article on beta distribution. Q.E.D. Bo Jacoby (talk) 22:08, 10 June 2010 (UTC).
 * (ec) Observe that Bo has integrated WRT $$p_1$$ and $$p_2$$ to get the answer. The details of this particular integral  are a pleasing consequence of the beta distribution (observe that you can evaluate the integral WRT p1 and p2 as two separate beta functions because of the independence of p1 and p2). Robinh (talk) 07:33, 11 June 2010 (UTC)
 * The OP's original wording was a bit clumsy, but as he later clarified, he did not ask what is the probability that $$X_1=1$$, or that $$X_1,\ X_2 = 1$$. He wanted to know what is the distribution of q, which is a random variable equal to the product of two independent beta-distributed random variables. In fact $$X_1$$ and $$X_2$$ are completely irrelevant to the question, they were used only to explain what the OP was after. -- Meni Rosenfeld (talk) 09:10, 11 June 2010 (UTC)


 * I read the question a bit differently, as implying the probability that X is true and the probability that Y is true are the same, and this probability is drawn once from a Beta distribution, not two separate draws from different Beta distributions as Meni and Bo have assumed. My interpretation gives a Beta-binomial model with n=2. Plugging n=2, k=2 into the expression for the pmf given in that article and simplifying using the properties of the Beta and Gamma functions would give $$\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}$$ for the probability that both are true. --Qwfp (talk) 09:36, 11 June 2010 (UTC)
 * A beta distributed random variable may assume all real values in the interval from 0 to 1. The product of two beta distributed random variables may also assume all real values in the interval from 0 to 1. The beta-binomial model may only assume discrete values. So the product of two beta distributed random variables cannot be described by a beta-binomial model. Bo Jacoby (talk) 11:51, 11 June 2010 (UTC).

Tent Poles
I have a tent with a broken pole, and am trying to order a new one. The pole itself is sufficiently shattered that it cannot be measured easily, though I estimate its length to be approximately 175.5". The pole travels diagonally across a tent with sides of 93" and 82.5" (approximate because it is not an orthogonal structure). Thus it has a base of approximately 124.5". The maximum height of the tent is 51.5". I have decided that the approximate length of the pole (175.5") is consistent with my other measurements, but in figuring this out, I have begun to wonder how to describe the shape made by my pole and how to figure its length in general terms. Is it possible to do so with the base and height, or is the base angle also needed? Furthermore, is this shape a catenary or a parabola? Thanks for indulging me. Incidentally, the tent is a Mountain Hardwear Alcove 3. Tuckerekcut (talk) 23:36, 10 June 2010 (UTC)
 * The exact curve probably isn't a catenary or a parabola; there are variables such as the weight distribution of the tent and how it's initially bent that are difficult to determine. Even if you assume an initially straight pole and ignore the weight of the tent, the math gets pretty hairy (see Elastica theory). My advice is to just go with your initial estimate. According to the picture I found on the net, the tent has two poles, so can't you assume they're the same length?--RDBury (talk) 04:14, 11 June 2010 (UTC)
 * Right, elastic rod curves are a complex subject. I'd say, if the pole slides through a pocket in the tent fabric, the curve may well be circular or elliptical, since that's easiest to design for a cut of fabric, and the cut of the fabric will determine the curvature. ~Amatulić (talk) 04:49, 11 June 2010 (UTC)
 * A good approximation to the curve is x = a&phi;, y = ab sin(&phi;), 0≤&phi;≤&pi; where a = 124.5"/&pi; = 39.6" and b = 51.5"/a = 1.30. This is because the curvature is proportional to the bending moment which is proportional to the height, (assuming that the ends of the pole are pressed together), and the sine curve satisfies y ' ' = &minus; y. The length of this curve is $$S=2a\int_0^{\pi/2} \sqrt{1+b^2\cos^2(\phi)}d\phi$$. The integral is computed by numerical integration. Bo Jacoby (talk) 05:48, 11 June 2010 (UTC).
 * PS. Simplify the expression by using a trigonometric identity and changing integration variable
 * $$S=2a\int_{2\phi=0}^{\pi} \sqrt{1+b^2\frac {1+\cos(2\phi)}2}\frac{d(2\phi)}2$$
 * $$S=a\int_{0}^{\pi} \sqrt {\frac {2+b^2}2\left(1+\frac {b^2}{2+b^2}\cos(\phi)\right)} d\phi$$
 * substitute k=$$\scriptstyle \frac {b^2}{2+b^2}$$=0.458 and m=$$\scriptstyle a \sqrt {\frac {2+b^2}2}$$=53.8".
 * $$S=m\int_0^\pi \sqrt{1+k\cos(\phi)}d\phi$$
 * The square root is expanded into a series using binomial coefficients
 * $$S=m\int_0^\pi \sum_{n=0}^\infty\binom{\frac 1 2}{n}(k\cos(\phi))^n d\phi$$
 * $$S=m \sum_{n=0}^\infty\binom{\frac 1 2}{n}k^n\int_0^\pi\cos^n(\phi)d\phi$$
 * The integral is zero when n is odd. (See list of trigonometric integrals).
 * $$S=m \sum_{n=0}^\infty\binom{\frac 1 2}{2n}k^{2n}\pi\prod_{p=1}^n\frac {2p-1} {2p}$$
 * $$S=m\pi(1-\frac{k^2}{16}-\frac{15k^4}{1024}-\cdots) $$
 * The first term is a good approximation to the length of your tent pole
 * $$S=m\pi=53.8\pi $$=169"
 * You said it should be 175.5". Check my calculation! (See also Elliptic_integral). Bo Jacoby (talk) 00:15, 12 June 2010 (UTC).
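
The calculation above can be checked by evaluating the arc-length integral directly and comparing it with the binomial series. This is a sketch under the same sine-curve approximation used in the reply, not a model of the real pole; the function names and the choices of a midpoint rule and 20 series terms are assumptions made for the example. The base 124.5" and height 51.5" are the figures from the question.

```python
import math


def pole_length_numeric(base, height, steps=10_000):
    """S = a * integral_0^pi sqrt(1 + b^2 cos^2(phi)) dphi, by the midpoint rule."""
    a = base / math.pi
    b = height / a
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h
        total += math.sqrt(1.0 + (b * math.cos(phi)) ** 2)
    return a * total * h


def binom_half(n):
    """Generalized binomial coefficient C(1/2, n)."""
    result = 1.0
    for j in range(n):
        result *= (0.5 - j) / (j + 1)
    return result


def pole_length_series(base, height, terms=20):
    """S = m*pi * sum_n C(1/2, 2n) k^(2n) prod_{p=1}^{n} (2p-1)/(2p)."""
    a = base / math.pi
    b = height / a
    k = b * b / (2.0 + b * b)
    m = a * math.sqrt((2.0 + b * b) / 2.0)
    total = 0.0
    wallis = 1.0  # running product of (2p-1)/(2p) for p = 1..n
    for n in range(terms):
        if n > 0:
            wallis *= (2 * n - 1) / (2 * n)
        total += binom_half(2 * n) * k ** (2 * n) * wallis
    return m * math.pi * total


numeric = pole_length_numeric(124.5, 51.5)
series = pole_length_series(124.5, 51.5)
print(numeric, series)
```

Both routes should give the same length for the sine-curve model, slightly below the leading-term estimate $$m\pi$$ because every correction term in the series is negative. Any remaining gap to the measured 175.5" would then reflect the model rather than the arithmetic.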