Wikipedia:Reference desk/Archives/Mathematics/2011 April 18

= April 18 =

Constructing an uncountable ordinal using neither axiom of choice NOR replacement
Using the axioms of Zermelo–Fraenkel set theory, except for choice and replacement (i.e. using specification and pairing as substitutes), is it possible to construct the first uncountable ordinal? The article Hartogs number discusses how to do this with replacement, but is it possible without it? Thank you! JamesMazur22 (talk) 14:36, 18 April 2011 (UTC)
 * I remember reading in Munkres's Topology (first chapter exercises) that Zermelo proved there's an uncountable well-ordered set without AC. I'm pretty sure he didn't use replacement, because Skolem (or Fraenkel or whoever) introduced that axiom way later. Don't have the book with me, but you can check it out. Money is tight (talk) 16:04, 18 April 2011 (UTC)
 * That doesn't mean you necessarily get uncountable ordinals, though. $$V_{\omega+\omega}$$ is a model of Zermelo set theory (Z), so Z doesn't prove the existence of any uncountable von Neumann ordinals. I expect if you actually wanted to work with Z, you'd abandon the von Neumann definition and use Scott's trick to define the ordinal of a well-ordering to be the set of all orderings isomorphic to it of least possible rank. Algebraist 08:13, 19 April 2011 (UTC)
 * Well, to the extent that you need ordinals to be sets, you can do that. It's overkill for some applications, where you don't need ordinals to be official objects of the formal theory.  For example, if you want to prove the wellordering theorem in ZC, you just work with the partial wellorders themselves and show you can make them cohere; you have ordinals in mind but you don't need any fixed set representing the order-types.  (Or at least I don't think you do — it's been many years since I tried to think through the proof in detail.) --Trovatore (talk) 08:24, 19 April 2011 (UTC)
 * Thanks, this gives me enough to think about. JamesMazur22 (talk) 17:13, 20 April 2011 (UTC)

Generating values from a multivariate normal distribution
I'm trying to understand the procedure described in Multivariate normal distribution. It says that if the covariance matrix is positive-definite, you should use a Cholesky decomposition, but if it is only positive-semidefinite, you need to use an eigendecomposition instead. As a trivial example, I am using the covariance matrix $$\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},$$ which is only positive-semidefinite. Ignoring this, using the Cholesky method (by hand, with the first algorithm described in Cholesky decomposition) seems to work fine. The only problem I can see is that you need to take the square root of a value that is zero, which due to round-off error might be slightly negative. It looks like the eigendecomposition method runs into exactly the same problem: you end up needing to take the square roots of the eigenvalues, one of which is zero. Obviously, in both cases, you can avoid this issue by truncating a slightly negative value to zero before taking its square root.

So I don't really see what using the eigendecomposition method in the positive-semidefinite case gets you. Do you run into different problems doing Cholesky with larger non-positive-definite covariance matrices, or are other Cholesky algorithms more problematic? If so, presumably the reason why you use Cholesky in the positive-definite case is because it is more efficient - does anyone know how big the difference is?

Finally, is this the best/only way of generating values from the distribution, or are there completely different methods? I'm surprised by how little information there seems to be on the web about this - though my knowledge of statistics is very limited, so I might be choosing the wrong search terms. 81.98.38.48 (talk) 14:41, 18 April 2011 (UTC)


 * It seems like the Cholesky decomposition requires positive-definiteness. The issue isn't taking the square root of zero, but dividing by zero, as far as I can see.  If the covariance matrix has a zero eigenspace, then the normal distribution will be supported on a lower-dimensional subspace.  With the covariance matrix you give, it's supported on the line $$x=y$$.  I'm not sure what computational method I would recommend.  Naively, without worrying about the numerics, I would find an orthonormal basis for the orthogonal complement of the kernel of the covariance matrix, assemble these as columns of a matrix Q, and then let
 * $$\Sigma'=Q^T \Sigma Q$$
 * Work out the distribution for the new covariance matrix $$\Sigma'$$ using whatever numerically stable algorithms you want since $$\Sigma'$$ is now a decent covariance matrix, and then dump it back into the original space via
 * $$X\to QX.$$
 * For what it's worth... Sławomir Biały  (talk) 15:14, 18 April 2011 (UTC)
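A minimal NumPy sketch of the projection approach described above; the variable names and the `1e-12` threshold for deciding an eigenvalue is "zero" are illustrative assumptions, not part of the thread:

```python
import numpy as np

rng = np.random.default_rng(0)
cov = np.array([[1.0, 1.0], [1.0, 1.0]])   # the rank-1 matrix from the question

w, V = np.linalg.eigh(cov)                 # cov = V diag(w) V^T
Q = V[:, w > 1e-12]                        # orthonormal basis of (ker cov)^perp
cov_reduced = Q.T @ cov @ Q                # Sigma' = Q^T Sigma Q, positive-definite

L = np.linalg.cholesky(cov_reduced)        # now safe to use Cholesky
z = rng.standard_normal((1000, cov_reduced.shape[0]))
samples = z @ L.T @ Q.T                    # map back to the original space: X -> Q X
```

Here `cov_reduced` works out to the 1×1 matrix `[[2.]]`, and every sample lands exactly on the line x = y, matching the support noted above.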


 * Thanks for the help. The division by zero didn't come up in the particular case of a two-by-two matrix of ones, but I suppose it would in other cases. In fact, the Cholesky decomposition article mentions that it can be extended to positive-semidefinite matrices, but then the factorisation is not generally unique - presumably division by zero occurs in the cases where it isn't unique, and the matrix I picked is a special case where it is. Thinking about it, the eigendecomposition should always work (covariance matrices are always symmetric and therefore diagonalizable, and always have nonnegative eigenvalues), so the procedure in the article probably does make sense (though I suppose, to decide whether your matrix is positive-definite, you need to try the Cholesky decomposition anyway - it probably makes sense to go straight to the eigendecomposition if a large proportion of your matrices aren't positive-definite). 81.98.38.48 (talk) 16:54, 18 April 2011 (UTC)
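The eigendecomposition route discussed in this thread (diagonalise, truncate slightly negative round-off eigenvalues to zero, take square roots) can be sketched as follows, assuming NumPy; the function name and seed are illustrative assumptions:

```python
import numpy as np

def sample_mvn_eig(mean, cov, n_samples, rng):
    """Draw from N(mean, cov) for any positive-semidefinite cov."""
    w, V = np.linalg.eigh(cov)        # cov = V diag(w) V^T, eigenvalues ascending
    w = np.clip(w, 0.0, None)         # truncate round-off negatives to zero
    A = V * np.sqrt(w)                # scale each column by sqrt(w): A @ A.T == cov
    z = rng.standard_normal((n_samples, len(mean)))
    return mean + z @ A.T

rng = np.random.default_rng(0)
cov = np.array([[1.0, 1.0], [1.0, 1.0]])              # rank 1, so every draw
samples = sample_mvn_eig(np.zeros(2), cov, 1000, rng)  # lies on the line x = y
```

Unlike a plain Cholesky factorisation, this never divides by a zero pivot, which is one practical reason the article's procedure switches methods in the semidefinite case.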

Parameterization
If I have an implicitly defined function, is there a general way to find a parameterization for it? Also, what are some that I can do by hand with the assistance of a normal graphing calculator (but not something like a computer program or WA)? If this means the functions I can parameterize will be restricted, fine, but don't forget to specify the restrictions! Thanks. 72.128.95.0 (talk) 16:42, 18 April 2011 (UTC)


 * The short answer is almost always no. You have a map $$f : \mathbf{R}^m \to \mathbf{R}^n$$, where n < m. If a point $$y \in \mathbf{R}^n$$ is a regular value of $$f$$, then the level set
 * $$ f^{-1}(y) := \{ x \in \mathbf{R}^m : f(x) = y \} \, $$
 * will be a smooth manifold. By the implicit function theorem, there exists a local parametrisation for $$f^{-1}(y)$$ in a neighbourhood of a (regular) point x of $$f^{-1}(y)$$. There are two problems. First, although we know that such a local parametrisation exists, actually finding one is often very difficult. Secondly, there is no reason for there to be a global (regular) parametrisation. For example, the sphere has no such parametrisation. (Take, for example, the unit circle in the xz-plane, and then rotate it about the z-axis. This covers the sphere, but it's not a regular parametrisation; the North and South Poles are critical values.) The existence of a global parametrisation is related to the topology of the level set $$f^{-1}(y)$$. A general proof involves the Poincaré–Hopf theorem. — Fly by Night  ( talk )  17:59, 18 April 2011 (UTC)
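As a lower-dimensional worked instance of the local statement (a standard textbook example, not from the thread above), take the unit circle as a level set:

```latex
% Unit circle as a level set: f(x, y) = x^2 + y^2, with regular value 1.
\[
  f^{-1}(1) = \{(x, y) \in \mathbf{R}^2 : x^2 + y^2 = 1\}.
\]
% Where \partial f / \partial y = 2y \neq 0, solve for y to get a local chart:
\[
  x \mapsto \bigl(x, \pm\sqrt{1 - x^2}\bigr), \qquad -1 < x < 1,
\]
% covering the upper and lower open arcs. At the points (\pm 1, 0) this
% fails (\partial f / \partial y = 0), and one must instead use
% \partial f / \partial x \neq 0 and solve for x:
\[
  y \mapsto \bigl(\pm\sqrt{1 - y^2}, y\bigr), \qquad -1 < y < 1.
\]
% No single chart of this kind covers the whole circle; several local
% parametrisations must be patched together, exactly as in the general case.
```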

Counting
Okay, so here is a counting question (and yes this is HW but I don't even know how to really begin on this one). How many n-tuples $$(x_1, x_2,..., x_n)$$ are possible if the coordinates are required to be integers satisfying $$0\leq x_i<q$$ and the order of the coordinates doesn't matter? For example, (0,0,1,3) and (1,0,3,0) are considered the same. Any hints? Thanks! 67.40.130.94 (talk) 23:50, 18 April 2011 (UTC)


 * At least try! As the big grey box says: "If your question is homework, show that you have attempted an answer first, and we will try to help you past the stuck point. If you don't show an effort, you probably won't get help. The reference desk will not do your homework for you." — Fly by Night  ( talk )  01:07, 19 April 2011 (UTC)

I know, but I just didn't write everything here. It's part of a bigger problem and I have done all the other variants. If the order does matter, then we just have permutations: it's simply (floor(q))^n. If the bounds for each coordinate $$q_i$$ are different (and the order still matters), then it's just the product of the floors of all the $$q_i$$. But if the order doesn't matter, then I know that instead of permutations, it will just be combinations. I also realize that if any of the elements repeat, then using the binomial coefficients will overcount, so I have to divide by an appropriate amount. My confusion is how to merge these two together to get this problem. A priori I don't know how many elements are repeating, so I don't know what to divide by. And I never asked for just the solution. I asked for a hint... a nudge in the right direction. I don't expect anyone to do my HW for me here. This is not my first time here on Wikipedia. 67.40.130.94 (talk) 02:35, 19 April 2011 (UTC)
 * If order doesn't matter then you're really counting non-decreasing sequences. So try looking at the differences instead of the numbers themselves. Also, you might try making a table of values to discover the pattern. Then it's a matter of proving a given result instead of deriving one.--RDBury (talk) 04:01, 19 April 2011 (UTC)
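For readers of the archive: the "non-decreasing sequences" hint above can be checked by brute force. This sketch (function name illustrative; Python assumed) counts the unordered tuples directly, and the table of values it produces matches the multiset coefficient $$\binom{n+q-1}{n}$$:

```python
from itertools import combinations_with_replacement
from math import comb

# Count n-tuples with integer entries in {0, ..., q-1} where order is
# ignored, i.e. non-decreasing tuples (combinations with replacement).
def count_unordered_tuples(n, q):
    return sum(1 for _ in combinations_with_replacement(range(q), n))

# e.g. n = 2, q = 3: (0,0), (0,1), (0,2), (1,1), (1,2), (2,2)
print(count_unordered_tuples(2, 3))   # 6, which equals comb(2 + 3 - 1, 2)
```

This only verifies the pattern; proving it (e.g. via the differences of a non-decreasing sequence, as suggested above) is still the actual exercise.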