Wikipedia:Reference desk/Archives/Mathematics/2013 May 19

= May 19 =

Reprised Confusion over Wikipedia Definitions of Sigma Additivity
Okay; now that I can actually preview the math that I intend to place inside of my posts before I submit them again (if you care, you can see the details surrounding that fiasco here; otherwise, just ignore my griping and keep reading), I can actually get some work done. Unfortunately, that still requires me to ask questions about what I am trying to work on, so here goes: the question I need to ask today has to do with how sigma additivity is defined here on Wikipedia, especially because several of its articles seem to have become quite disagreeable on the subject even though they are linked together. Having said this, how does one reconcile these sources' contents into a single definition that I can use when they all disagree? Please note, before clicking on the links that I have provided to the three sources in question, that I have also quoted the text from these articles that I thought you might find most applicable to my conundrum. As such, my first source, the article on probability axioms, defines the concept under its third heading as follows: "This is the assumption of σ-additivity:
 * Any countable sequence of disjoint (synonymous with mutually exclusive) events $E_1, E_2, \dots$ satisfies
 * $P\left(E_1 \cup E_2 \cup \cdots\right) = \sum_{i=1}^\infty P\left(E_i\right)$.

Some authors consider merely finitely additive probability spaces, in which case one just needs an algebra of sets, rather than a σ-algebra. Quasiprobability distributions in general relax the third axiom."

…whereas my second source, the list of defining characteristics of a measure, states that $σ$-additivity is defined as follows: "* Countable additivity (or $\sigma$-additivity): For all countable collections $\left\{E_i\right\}_{i \in I}$ of pairwise disjoint sets in $\Sigma$:
 * $\mu\Bigl(\bigcup_{i \in I} E_i\Bigr) = \sum_{i \in I} \mu\!\left(E_i\right)$."

…and my third source, the section on $σ$-additivity as a property of set functions in the article on sigma additivity, defines the concept as follows: "Suppose that $\scriptstyle\mathcal{A}$ is a σ-algebra. If for any sequence $A_1, A_2, \dots, A_k, \dots$ of disjoint sets in $\scriptstyle\mathcal{A}$ one has
 * $\mu\left(\bigcup_{n=1}^\infty A_n\right) = \sum_{n=1}^\infty \mu(A_n),$

we say that $μ$ is countably additive or $σ$-additive."

Before anyone answers my question, though, please let me point out that, as I will quote below, I have attempted to ask this question before. Here is the text of the conversation in which I did so, after attempting to merge the definitions of sigma additivity from my first two sources on my own: "Are the following two definitions of sigma additivity equivalent? If so, how?
 * $\mu\left(\bigcup_{i \in I} E_i\right) = \sum_{i \in I} \mu\left(E_i\right)$ for all countable collections $\left\{E_i\right\}_{i \in I}$ of pairwise disjoint sets in any sample space $Ω$&thinsp;–&thinsp; i. e.: this statement $\forall \left\{E_i\right\}_{i \in I} \iff I \subsetneq \mathbb{N} \land \bigcap_{i \in I} = \varnothing$
 * $ \mu\left(\bigcup_{n=1}^\infty A_n\right) = \sum_{n=1}^\infty \mu(A_n)$ for any sequence of disjoint subsets $A_{n}$ in any sigma algebra $\mathcal{A}$&thinsp;–&thinsp; i. e.: this statement $\forall n$

—&thinsp;RandomDSdevel (talk) 20:13, 2 April 2013 (UTC)
 * I'm not totally sure I follow what the statements are, but I'll have a go. I believe these are inequivalent, but both are arguably incorrect definitions. To start with, by definition any countable set I is basically the naturals. This leaves us with two potential differences: 1) (pairwise) disjointness and 2) the sigma-algebra rather than being a general subset of $\Omega$. The disjointness is a problem because the word pairwise is often omitted from "disjoint" - the only other thing it could mean is that the whole intersection is empty, which in most cases I know of is clearly not what's wanted from context - both definitions are the same. For the second, if your function $\mu$ is defined on all subsets of $\Omega$, then this is the same thing with respect to the sigma-algebra "all subsets"; in general, however, it won't be (think of non-measurable sets). Does that help at all? Straightontillmorning (talk) 15:17, 4 April 2013 (UTC)"

In retrospect, I'm pretty sure that what Straightontillmorning meant by his reply was that the first statement, which I compiled from my first two sources, could not become equivalent to the second one, which I basically copied from my third source after attempting to summarize an explanation of the conditions applied to that definition as provided by the said definition's preceding paragraph, unless I made a couple of changes first. One of these, of course, would be to take his advice and make the second statement work 'for any sequence of pairwise disjoint subsets $A_{n}$ in any sigma algebra $$\mathcal{A}$$' instead of just for disjoint subsets. The other modification that I would have to make, obviously, would have to address my acquaintance's concerns over the fact that the second definition is only equivalent to the first when its domain is restricted from all sequences of pairwise disjoint subsets of the $σ$-algebra $$\mathcal{A}$$ (which one could, in this case, set equal to the power set $$\mathcal{P}\left(\Omega\right)$$ of a sample space $Ω$) to only the sequence of pairwise disjoint subsets of this $σ$-algebra $$\mathcal{A}$$&thinsp;–&thinsp;in this case, $Ω$&thinsp;–&thinsp;that contains all of the possible members of such a sequence. I assume that this is correct, but could you make sure that I understand this problem correctly? Thanks in advance, RandomDSdevel (talk) 23:16, 18 May 2013 (UTC)
 * You are correct that all instances of "disjoint" are meant to imply "pairwise disjoint", and so it might be helpful to make that explicit. Other than that, the three definitions are equivalent.  No other change is required.
 * Straightontillmorning was concerned that your sequence might contain nonmeasurable sets, but the definition avoids this by specifying that the sets are drawn from $$\Sigma$$. --80.109.106.49 (talk) 09:23, 19 May 2013 (UTC)
 * The two statements that you posted in the original thread were not using mathematical symbols in any recognizable manner, thus rendering the two definitions that you posted meaningless (to me). The definitions in the articles are all equivalent to one another.   Sławomir Biały  (talk) 12:04, 19 May 2013 (UTC)
 * Sorry about the confusing symbols, Sławomir; I was, because of my rudimentary understanding of the subject, using the symbols of mathematical logic as shorthand for longer statements to explain the conditions that apply to these statements. Anyhow, could you show me how these definitions are mathematically equivalent to each other? And which version would you pick as most readable and, therefore, most easily communicated to other people?
 * Confused,
 * RandomDSdevel (talk) 20:14, 19 May 2013 (UTC)
 * I can report that I was correctly interpreted above. It is obvious that the three definitions are equivalent; the differences are in notation. Specifically: in the first definition, the measure is called P rather than mu; writing a sequence of unions ... is the same thing as writing a union over an index set; and (by the definition of countable) it makes no odds whether we take a countable index set I or the naturals. There is no need to harmonise these definitions any further - in particular, replacing P by mu on the page discussing probability is liable to confuse (P is standard for a probability measure), but using P for mu in the case of a general measure is equally likely to confuse. Does that clarify anything? Straightontillmorning (talk) 21:05, 20 May 2013 (UTC)
 * I understand that the definition of sigma additivity can apply to either a probability measure $P$ or a generic measure $μ$, but I'm still a little confused by the differences in notation between the given equations. Since I'm sort of new to set and measure theories, could you explain to me why and how one could find these differences to be negligible?
 * —&thinsp;RandomDSdevel (talk) 22:26, 20 May 2013 (UTC)
 * Ok. Let's do this in obsessive detail to be on the safe side. To begin with, the definition does not apply to either a probability measure or a "generic" measure - a probability measure is a measure, and the definition is the same. The hypotheses you have listed in the definitions are as follows:
 * Any countable sequence of disjoint (synonymous with mutually exclusive) events $$E_1, E_2, \dots$$
 * For all countable collections $$\left\{E_i\right\}_{i \in I}$$ of pairwise disjoint sets in $$\Sigma$$
 * for any sequence $$A_1, A_2, \dots, A_k, \dots$$ of disjoint sets in $$\scriptstyle\mathcal{A}$$
 * As we said before, disjoint means pairwise disjoint in all three cases (and would be understood as meaning this). If I is a countable set, by definition there is a bijection between I and {1, 2, 3, ...}; that is, we can write the set as I = {i1, i2, i3, ...}. Hence all three sequences are the same (in the first, writing "a countable sequence E_1, E_2, ..." is more or less redundant, since a sequence is countable by definition). It clearly doesn't matter whether we call a set A or E or the sigma-algebra Sigma or script A. It only remains to explain that the sigma-algebra requirement is implicit in the first case by use of the word event, which is linked earlier in the article to event (probability theory).
 * and the three conclusions are:
 * $$P\left(E_1 \cup E_2 \cup \cdots\right) = \sum_{i=1}^\infty P\left(E_i\right)$$.
 * $$\mu\Bigl(\bigcup_{i \in I} E_i\Bigr) = \sum_{i \in I} \mu\!\left(E_i\right)$$.
 * $$\mu\left(\bigcup_{n=1}^\infty A_n\right) = \sum_{n=1}^\infty \mu(A_n)$$
 * Using the same observations (that we can treat I as {1, 2, 3, ...}, and that the difference between A and E is negligible), and noting that the dots in $$E_1 \cup E_2 \cup \cdots$$ just mean "take the union of the whole sequence", i.e. $$\bigcup_{n=1}^\infty E_n$$, these are all obviously the same.
 * I hope that's clear. Straightontillmorning (talk) 21:25, 21 May 2013 (UTC)
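Straightontillmorning's reindexing point (any countable index set can be enumerated, so a union over an index set is the same as the union of a sequence) can be illustrated with a small, purely hypothetical Python sketch; the index set, the sets E_i, and the chosen enumeration below are all invented for illustration:

```python
# Hypothetical countable (here finite) index set I with sets E_i attached
I = {'red', 'green', 'blue'}
E = {'red': frozenset({1, 2}), 'green': frozenset({3}), 'blue': frozenset({4, 5})}

# union over the index set, written as in the second definition ...
union_indexed = frozenset().union(*(E[i] for i in I))

# ... equals the union of the enumerated sequence E_{i_1}, E_{i_2}, ...,
# as in the first and third definitions; any bijection I -> {1, 2, 3} works
enumeration = sorted(I)
union_seq = frozenset()
for i in enumeration:
    union_seq |= E[i]

assert union_indexed == union_seq
```

The choice of enumeration is irrelevant precisely because union is commutative and associative, which is the informal content of "it makes no odds whether we take a countable index set I or the naturals".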

Hey, guys; I'm sorry that I haven't replied to this thread even though, as the person who started it, I know that I should have, but I've sort of been mulling over exactly how to continue this discussion. I've been having a little trouble figuring out exactly what you said, Straightontillmorning, even though it was easy to understand at first glance. I think that the problem is that I couldn't quite bring myself to formulate the questions that I wanted to ask you (they were on the tip of my tongue&thinsp;–&thinsp;or rather, my fingers, I guess….) Regardless of the trouble that I temporarily had in communicating with you, however, I think that I'm ready to discuss your comments. First of all, I comprehend that you're trying to tell me that I phrased my statement that 'the definition of sigma additivity can apply to either a probability measure $P$ or a generic measure $μ$' incorrectly and should instead have made clear that I now understand that the definition of sigma additivity applies to both probability measures $P$ and general measures $&mu;$ because the former are defined in terms of the latter, albeit with the extra condition that a probability measure must output a result between 0 and 1. As such, the definitions are, as you said, almost identical. Second of all, are the 'hypotheses' that you list as bullet points below your first paragraph simply summaries of the conditions applied respectively to each of the definitions that I took from the articles that I referenced? If so, then could you please explain to me what exactly the difference is between a sequence, a collection, a set of pairwise disjoint sets, a multiset (the kind that one must derive from such groups of elements if some of these elements are identical but must not be allowed to collapse into a single, merged element), and an indexed family of sets?
Third, I understand that when you mention a 'countable sequence' you mean that one would use a finite index set to index the sequence in question. Fourth, I get that one may use either any capital letter $$A, B, \dots, Z$$ or any subscript attached to the letter 'E'&thinsp;–&thinsp;as in $$E_1, E_2, E_3, \dots, E_n$$&thinsp;–&thinsp;to denote an event and either the Greek letter Sigma $&Sigma;$ or a calligraphic letter such as $$\mathcal{A}, \mathcal{B}, \mathcal{C},$$ et cetera, to denote a sigma algebra. And, fifth and finally, I find it preposterous that one could equate the arbitrary, but still finite, union used in the first two definitions with the infinite union used in the last definition. I hope that helps you understand why I'm still a little confused.

Sincerely, RandomDSdevel (talk) 19:28, 27 May 2013 (UTC)
 * Ok. Second of all, yes. The difference between a sequence, collection, set, multiset, and indexed family is technical and not really germane to the discussion. I'll run through it briefly. Collection and set mean the same thing except for some technicalities to do with sets not being allowed to be too big (the set of all sets leads to a paradox in short order.) A sequence and an indexed family are the same thing: a sequence is a family indexed by the set of natural numbers. A multiset is exactly what you said it was - sets are defined by their elements, so {2, 2} is the same set as {2}. You introduce the notion of multiset to prevent this, but it isn't relevant to the question of sigma-additivity. (end of digression)
 * The main point seems to be that, as countable says, a countable set need not be finite (the natural numbers are a perfectly permissible subset of the natural numbers.) In particular, the union in the first two definitions is not necessarily finite either. If my notation was confusing about this, I should perhaps make clear that {1, 2, 3, ...} means the set of natural numbers. Is that the point of confusion? Straightontillmorning (talk) 21:35, 27 May 2013 (UTC)
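As a mechanical sanity check on the definitions being discussed, here is a hypothetical Python sketch of the finite special case of σ-additivity (the sample space, measure, and events are all invented; a finite family stands in for a countable sequence whose remaining terms are empty, so the countable sum collapses to a finite one):

```python
from itertools import chain

# Toy sample space Omega = {1, ..., 6} with the uniform probability
# measure P(E) = |E|/6, defined on the power set (which is a sigma-algebra)
omega = frozenset(range(1, 7))

def P(E):
    return len(E) / len(omega)

# a family of pairwise disjoint events
events = [frozenset({1, 2}), frozenset({3}), frozenset({5, 6})]

# confirm they really are pairwise disjoint
assert all(not (A & B) for i, A in enumerate(events) for B in events[i + 1:])

# additivity: P of the union equals the sum of the P's
union = frozenset(chain.from_iterable(events))
assert abs(P(union) - sum(P(E) for E in events)) < 1e-12
```

Of course no finite computation can exercise the genuinely countable case, which is exactly the point the thread settles on: the definitions differ only in notation, not in what they require.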

Parzen density estimate conditional variance
A Parzen estimator for a bivariate probability density given data points $$(x_1,y_1),...,(x_n,y_n)$$ is $$\frac{1}{n}\sum_{k=1}^n \mathcal{N}((x_k,y_k),\sigma^2I)$$.

I believe I can calculate the conditional distribution and mean. If you denote $$f(x,x_k) = (2\pi)^{-1/2}\sigma^{-1}e^{-(x-x_k)^2/2\sigma^2}$$, the estimated probability density becomes

$$p(x,y) = \frac{1}{n}\sum_{k=1}^n f(x,x_k)f(y,y_k)$$

so

$$p(x|y) = \frac{p(x,y)}{p(y)} = \frac{\frac{1}{n}\sum_{k=1}^n f(x,x_k)f(y,y_k)}{\frac{1}{n}\sum_{k=1}^n f(y,y_k)}= \frac{\sum_{k=1}^n f(x,x_k)f(y,y_k)}{\sum_{k=1}^n f(y,y_k)}$$

$$\mathbb{E}[x|y] = \frac{\sum_{k=1}^n x_kf(y,y_k)}{\sum_{k=1}^n f(y,y_k)}$$

Can anyone calculate $$var[x|y]$$? AnalysisAlgebra (talk) 11:17, 19 May 2013 (UTC)
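Since, conditional on y, x is a mixture of Gaussians with means $$x_k$$, common variance $$\sigma^2$$, and weights proportional to $$f(y,y_k)$$, the same reasoning that gives the conditional mean suggests the candidate closed form $$\mathrm{var}[x|y] = \frac{\sum_{k} (\sigma^2 + x_k^2) f(y,y_k)}{\sum_{k} f(y,y_k)} - \mathbb{E}[x|y]^2.$$ Here is a hedged Python sketch that checks both closed forms against direct numerical integration; the data points, bandwidth, and conditioning value are all made up for illustration:

```python
import math

def f(u, uk, sigma):
    """The 1-D Gaussian factor f(u, u_k) from the thread."""
    return math.exp(-(u - uk) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

# made-up data and bandwidth, purely for illustration
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (1.5, 1.4)]
sigma, y = 0.5, 1.0

w = [f(y, yk, sigma) for _, yk in data]       # mixture weights, proportional to f(y, y_k)
W = sum(w)
mean = sum(xk * wk for (xk, _), wk in zip(data, w)) / W
var = sum((sigma ** 2 + xk ** 2) * wk for (xk, _), wk in zip(data, w)) / W - mean ** 2

# cross-check against crude numerical moments of p(x|y)
dx, m0, m1, m2 = 1e-3, 0.0, 0.0, 0.0
x = -10.0
while x < 12.0:
    px = sum(f(x, xk, sigma) * wk for (xk, _), wk in zip(data, w))  # proportional to p(x|y)
    m0 += px * dx
    m1 += x * px * dx
    m2 += x * x * px * dx
    x += dx
mean_num = m1 / m0
var_num = m2 / m0 - mean_num ** 2
assert abs(mean - mean_num) < 1e-4
assert abs(var - var_num) < 1e-4
```

The second moment of a Gaussian component is $$\sigma^2 + x_k^2$$, which is where the variance formula comes from; the numeric check only supports, and does not prove, the closed form.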

Convergent or Divergent ?
Let $$F(n) = \int_0^\infty{e^{-f_n(x)}}\,dx\ ,\quad$$ where $$f_n(x) = \int_0^x{e^{t^n}}\,dt\ .$$

$$\lim_{n \to \infty}\ F(n) =\ ?$$

We know that F(0) = 1/e, and that for n > 0 the graph of F(n) is strictly increasing... but I can't tell whether this ever-decreasing strictly-positive growth is asymptotic, or&thinsp;–&thinsp;on the contrary&thinsp;–&thinsp;whether it is similar to that of $$\scriptstyle \sqrt n$$ and $$\ln(n)$$, which do NOT converge as $$\scriptstyle n \to \infty$$ (despite having an ever-decreasing growth). The only question is: to which of these two similar-looking but completely different categories does our function belong? And if it does fall into the former category, what exactly is its limit, and does this limit have a closed form?

Furthermore, in the case that it is convergent, does the following integral also converge ? And if so, then to what value, and does this value possess a closed form ?
 * $$\int_0^\infty{[\ \lambda - F(n)\ ]}\ dn =\ ?$$

where $$\lambda = \lim_{n \to \infty}F(n)$$ — 79.113.240.146 (talk) 18:01, 19 May 2013 (UTC)


 * Calculate the pointwise limit of $$e^{-f_n(x)}$$. We have
 * $$\lim_{n\to\infty} e^{-f_n(x)}=\begin{cases}e^{-x}&0< x< 1\\ 0&x > 1\end{cases}$$
 * So if interchanging the limit and integral sign were justified, we would get $$\lim F(n) = 1-1/e$$. And interchanging the limits can be justified by dominated convergence using the uniform bound $$e^{-f_n(x)}\le g(x)$$ with
 * $$g(x) = e^{-x}$$, which is integrable on $$(0,\infty)$$; the bound holds for every n because $$e^{t^n}\ge 1$$ gives $$f_n(x)\ge x$$.
 * -- Sławomir Biały (talk) 18:43, 19 May 2013 (UTC)


 * Actually, it's even easier to see that the interchange of limits is justified using Lebesgue's monotone convergence theorem. On $$(0,1)$$, the sequence $$f_n(x)$$ is decreasing, and on $$(1,\infty)$$, it is eventually increasing.   Sławomir Biały  (talk) 19:01, 19 May 2013 (UTC)

→ The value of F(n) around n = 3.4 seems to be slightly more than 0.66, which is itself more than 1 - 1/e. — 79.113.240.146 (talk) 19:36, 19 May 2013 (UTC)
 * I see no reason to think that the sequence is monotone (and 0.66 is not far from 1-1/e). Where n is small, changes in F(n) are dominated by the $$0<x<1$$ part; where n is large, by the $$x>1$$ part. (Differentiate with respect to n to see this.) So it's more reasonable to think that the sequence increases and then decreases.   Sławomir Biały  (talk) 20:04, 19 May 2013 (UTC)
 * You mean $$F'(n) = - \int_0^\infty{\frac{\partial f_n(x)}{\partial n} \ e^{-f_n(x)}}dx\ ,$$ where $$\frac{\partial f_n(x)}{\partial n} = \int_0^x{t^n \ln(t) \ e^{t^n}}dt$$ ? — 79.113.240.146 (talk) 20:55, 19 May 2013 (UTC)
 * Yes. For large n, the contribution from the integral in the $$0<x<1$$ regime is negligible (because of the additional $$t^n$$).   Sławomir Biały  (talk) 21:10, 19 May 2013 (UTC)
 * But this would then imply the existence of a maximum at the point where $$F'(n) = 0$$, whose value I don't know how to calculate. :-( — 79.113.240.146 (talk) 21:29, 19 May 2013 (UTC)
 * Well, I don't think you should expect to be able to calculate it explicitly since the integral is very non-trivial for finite positive n. I was just suggesting a plausible line of reasoning why your sequence is ultimately decreasing rather than increasing.   Sławomir Biały  (talk) 22:21, 19 May 2013 (UTC)
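The behaviour discussed above (F(0) = 1/e, an overshoot past 1 - 1/e for moderate n, convergence back toward 1 - 1/e) can be probed numerically. This is a rough Python sketch, not a rigorous computation: the step size, the cutoff at f < 50, and the overflow guard are ad hoc choices that merely make the crude quadrature terminate.

```python
import math

def F(n, dx=1e-4):
    """Crude estimate of F(n) = integral of exp(-f_n(x)) over (0, inf),
    where f_n(x) = integral of exp(t^n) over (0, x); both integrals are
    accumulated on the same grid."""
    total, f, x = 0.0, 0.0, 0.0
    prev = math.exp(0.0 ** n)         # integrand of f_n at t = 0
    while f < 50.0:                   # past this, exp(-f) is negligible
        x += dx
        e = x ** n
        if e > 700.0:                 # exp would overflow; integrand ~ 0 anyway
            break
        cur = math.exp(e)
        f += 0.5 * (prev + cur) * dx  # trapezoid step for f_n(x)
        total += math.exp(-f) * dx    # Riemann step for F(n)
        prev = cur
    return total

# F(0) should be close to 1/e, F(3) above 1 - 1/e, F(30) near 1 - 1/e
print(F(0), F(3), F(30), 1 - 1 / math.e)
```

This supports the rise-then-fall picture suggested in the replies, but of course a few grid evaluations prove nothing about the limit itself.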

Minimum value of difference of Powers of Integers
The actual question is:
 * What is the minimum value of $$36^m - 5^n$$ where m and n are different? The most likely guess is that $$36^1 - 5^2 = 36 - 25 = 11$$ is the solution. Is it right, or can the expression have a value less than 11? I want to learn the method of solution and am trying to generalize it. Any help appreciated. Solomon7968 (talk) 18:55, 19 May 2013 (UTC)


 * The answer 11 is correct. Modulo 6, $$36^m-5^n$$ is congruent to $$+1$$ or $$-1$$ according as n is odd or even.  This means that the only possibilities less than 11 are $$1,5,7$$.  Clearly 5 is not possible on divisibility grounds.  7 is not possible because the digits of a power of 5 written in base 6 must add to a multiple of 5, but the base 6 digits of $$36^m-7$$ are $$55\dots545$$, which add to 4 modulo 5.  Finally, 1 is not possible since $$36^m-1$$ is divisible by 7 (and so can't be a pure power of 5).   Sławomir Biały  (talk) 19:24, 19 May 2013 (UTC)
 * I'd just like to say that is a sweet solution. Naraht (talk) 04:30, 24 May 2013 (UTC)
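Both the mod-6 observation and the answer 11 can be sanity-checked by brute force; the exponent bounds in this Python sketch are arbitrary, chosen only to give a convincing search range (36^m and 5^n grow so fast that small positive differences can only occur at small exponents):

```python
# all differences 36^m - 5^n over a modest range of exponents
diffs = [36 ** m - 5 ** n for m in range(1, 8) for n in range(1, 15)]

# the mod-6 step of the argument: 36^m = 0 and 5^n = +/-1 (mod 6),
# so every difference is congruent to 1 or 5 (mod 6)
assert {d % 6 for d in diffs} <= {1, 5}

# the smallest positive difference in this range
best = min(d for d in diffs if d > 0)
print(best)  # 11, attained at m = 1, n = 2
```

The brute force only confirms the range searched; ruling out 1, 5, and 7 for all exponents still needs the divisibility and base-6 digit arguments above.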