Talk:Cholesky decomposition

Lack of definition of \rho in Monte Carlo example
In the Monte Carlo example application of Cholesky decomposition, \rho is never explained — Preceding unsigned comment added by Matrixalgebrauser (talk • contribs) 04:45, 28 February 2014 (UTC)
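For readers hitting the same question: in that example \rho is the target correlation coefficient between the two simulated variables. A minimal sketch (pure Python, my own variable names, not the article's) of how the Cholesky factor of the correlation matrix [[1, rho], [rho, 1]] is used to induce correlation:

```python
import math
import random

def correlated_normals(rho, n, seed=0):
    """Draw n samples of a pair of standard normals with correlation rho.

    Uses the Cholesky factor of [[1, rho], [rho, 1]], which is
    [[1, 0], [rho, sqrt(1 - rho**2)]], applied to independent normals.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)                                      # x = z1
        ys.append(rho * z1 + math.sqrt(1 - rho**2) * z2)   # y = L applied to (z1, z2)
    return xs, ys

xs, ys = correlated_normals(0.8, 50000)
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
vx = sum((x - mx) ** 2 for x in xs) / len(xs)
vy = sum((y - my) ** 2 for y in ys) / len(ys)
print(cov / math.sqrt(vx * vy))  # sample correlation, close to 0.8
```

The same construction generalizes to any correlation matrix by replacing the hand-written 2x2 factor with a full Cholesky factorization.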

Rank one Update and Downdate
These two terms are not defined anywhere in Wikipedia, and searching on the web turns up few references. This terminology is not common and should be explained. Eigenguy (talk) 01:04, 25 July 2012 (UTC)

Hermitian property
Shouldn't A be symmetric or Hermitian?


 * A is positive definite, and all positive-definite matrices are Hermitian (see positive-definite matrix). I hope this concise explanation suffices. -- Jitse Niesen 14:13, 26 Apr 2005 (UTC)


 * Is the "all PD matrices are Hermitian" fact available in Wikipedia? I don't know who wrote the original question, but to be fair, the Wikipedia definition: positive-definite matrix gives only the properties that a Hermitian matrix (M) must have to be PD, but doesn't explicitly say that a non-symmetric or non-Hermitian matrix can't be PD. Ideally, this would be stated or derived among the 'properties' listed on the positive-definite matrix page, rather than here on a talk page.

Wragge 15:17, 2005 Apr 26 (UTC)
 * (In case there is any confusion, not all Hermitians are PD, so this page, Cholesky decomposition, is fine as it is.)


 * As I feared, my explanation was not well formulated. What I meant to say is that a positive-definite matrix is by definition a Hermitian matrix A such that v*Av > 0 for all nonzero vectors v. So, it is built into the definition that positive-definite matrices are Hermitian. You (Wragge) are right that this is not stated explicitly in the article positive-definite matrix.


 * The question which Wragge probably wants to see answered is: if you know that v*Av > 0 for all nonzero vectors v, does it follow that A is Hermitian? If we are restricting ourselves to the reals, the answer is no: there are real matrices A which are not symmetric, but which still have v^TAv > 0 for all nonzero real vectors v. I need to check what happens for the complex case; one complication is that v*Av is not necessarily real if A is not Hermitian.


 * I'll check some books when I have time to make sure I've got the details right and rewrite positive-definite matrix if necessary. -- Jitse Niesen 16:57, 26 Apr 2005 (UTC)
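To make the real-case point above concrete, here is a small check (my own illustrative example, not from the discussion): the non-symmetric matrix A = [[1, 1], [-1, 1]] satisfies x^T A x > 0 for every nonzero real x, because the quadratic form only sees the symmetric part (A + A^T)/2, which here is the identity.

```python
import random

# A is not symmetric, yet x^T A x > 0 for every nonzero real x:
# the skew-symmetric part of A contributes nothing to the quadratic form.
A = [[1.0, 1.0],
     [-1.0, 1.0]]

def quad_form(A, x):
    """Compute x^T A x for a 2x2 matrix A and vector x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

random.seed(1)
trials = [[random.uniform(-10, 10), random.uniform(-10, 10)] for _ in range(1000)]
assert all(quad_form(A, x) > 0 for x in trials if any(x))
print(quad_form(A, [3.0, -4.0]))  # 9 - 12 + 12 + 16 = 25.0
```

So "positive definite" in the sense of the quadratic form alone does not force symmetry over the reals, which is exactly why the article has to state the Hermitian/symmetric hypothesis separately.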

PD MUST BE HERMITIAN IN DISPUTE
Thanks for the simple explanation Jitse, I wasn't sure whether all matrices, A, where v*Av > 0 were Hermitian or not.

Isn't that the general definition of PD though? Is it specified that A must also be Hermitian to qualify as PD? Based on my hazy education, what I see on Wolfram Research, and Computation Methods in Linear Algebra, I am forced to dispute this definition.

Everything I've seen specifies simply the first condition (the multiplication property) and not the Hermitian property. (Planet Math fudges the issue by saying "If A is a square Hermitian matrix then ... it is a positive definite matrix", without saying whether non-Hermitians with the same property are or are not. I have seen no definitions explicitly specifying this condition.) For instance, the definition Google uses, from the University of Waterloo, simply says:
 * Positive definite matrix (A). x'Ax > 0 for all nonzero x.

Importantly, Computation Methods in Linear Algebra Proposition 6.2.2 clearly implies:
 * 1) Being Hermitian is not a condition for a matrix to be PD (because it states that both conditions must be fulfilled for Cholesky factorization to succeed, they must be logically independent).
 * 2) Therefore, the Cholesky decomposition page must make this condition (symmetry/Hermitian) explicit, as the original questioner requested.

I conclude that the original questioner identified a genuine gap in the definition, but if I have made a mistake, it would be interesting to see where. Can you tell me why you (Jitse Niesen) think that matrices A for which v*Av > 0 for all nonzero vectors v must also be Hermitian to qualify as positive-definite?

I only work with symmetric matrices, and most programs and references refer to "Positive Definite Symmetric" as a subset of "Positive Definite", implying that there is no such thing as "Asymmetric Positive Definite Matrix". If such a thing does exist, then some people (me anyway) would be interested in a topic or heading with that name, and an example. It would be one of the few on the Internet.

Also, I am not a pure mathematician, so please forgive any basic errors in understanding that I may have made. I don't know if this is relevant, but the structure of "Definite bilinear form" has the same "if Hermitian then PD" form, without saying "iff Hermitian then PD". Even if this is right, I think it should be cleared up.

- Wragge 22:19, 2005 Apr 28 (UTC)


 * I pulled some books from the shelf and they disagree on this issue. Most use a formulation like "A Hermitian matrix A is positive definite if v*Av > 0," which is ambiguous. Some imply further on that they consider being Hermitian to be a condition for a matrix to be positive definite and some imply that Hermitian and positive definite are independent properties. So, I think it is safest to stress in this article that we need A to be both Hermitian and positive definite. I wrote a small discussion over the relation between positive definite and Hermitian at Positive-definite matrix. -- Jitse Niesen 21:33, 29 Apr 2005 (UTC)

Thanks, Jitse. The Positive-definite matrix article is a good summary, with a nice example. I think the reason for the ambiguity is that non-Hermitian matrices have so few applications that the literature concentrates on SPD matrices. Anyway, I don't know enough to take this any further, perhaps it's something to watch out for, if you are looking through textbooks - Wragge 21:47, 2005 Apr 29 (UTC)

The semidefinite case
Question:

The page says that only a positive-definite matrix may be factorized as seen. Are positive-SEMIdefinite matrices excluded from this factorization?
 * I added a paragraph to address this question. -- Jitse Niesen (talk) 04:37, 10 May 2006 (UTC)


 * In that case, the Cholesky–Banachiewicz algorithm has a flaw for semidefinite matrices. To compute $$L_{i,j}$$ below the diagonal, it divides by $$L_{j,j}$$, which could be zero in the semidefinite case. I have seen code that checks for this and just sets the cell to zero, but I am not terribly confident that this is the right way to handle it. Anyone know? --Dave.rudolf (talk) 18:32, 24 January 2011 (UTC)
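For what it's worth, here is a sketch of the check-for-zero variant described above (not authoritative; pivoted routines such as LINPACK's xCHDC are the robust approach). In exact arithmetic it is correct for a genuinely positive-semidefinite matrix, because a zero pivot forces the rest of that column of the Schur complement to be zero too; with rounding it is fragile:

```python
import math

def cholesky_psd(A, tol=1e-12):
    """Cholesky-Banachiewicz extended to the positive-semidefinite case.

    When a diagonal pivot is (numerically) zero, the entries below it are
    set to zero instead of dividing.  Correct in exact arithmetic for a
    truly PSD matrix, but numerically fragile without pivoting.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s < -tol:
                    raise ValueError("matrix is not positive semidefinite")
                L[i][i] = math.sqrt(max(s, 0.0))
            else:
                L[i][j] = 0.0 if L[j][j] <= tol else s / L[j][j]
    return L

# Rank-1 PSD example A = v v^T with v = (1, 1, 1): the second pivot is 0.
A = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
L = cholesky_psd(A)
recon = [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(L)      # first column is (1, 1, 1), everything else 0
print(recon)  # reproduces A
```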


 * Quick search found the article "Analysis of the Cholesky Decomposition of a Semi-definite Matrix" analyzing the numerical stability of Cholesky decomposition for symmetric positive-definite matrices with pivoting. I only read the beginning, but it refers to LINPACK routine SCHDC from 1979 which does it, so algorithms for the semidefinite case are not new (the numerical stability analysis article is from 2008, though!). It would be nice if the Wikipedia article treated the semidefinite case as well. -- Coffee2theorems (talk) 10:29, 28 July 2012 (UTC)


 * I believe that the proof given for the positive semi-definite case has a flaw, and I think several simplifications can be made in other areas. First, regarding the part which says "by property of the operator norm", I do not see which property is being used, and in fact I believe that the inequality is the opposite way. Second, to show $$A = LL^*$$, we do not need to go through inner products. Suncup224 (talk) 10:36, 30 December 2013 (UTC)


 * The introduction of the article says positive-definite, but the statement paragraph says positive-semidefinite. We should be consistent with the definition. Raffamaiden (talk) 16:02, 1 May 2013 (UTC)


 * I checked three references and all of them initially define the Cholesky decomposition only for positive-definite matrices, and they're not clear on whether the name also applies to positive-semidefinite ones. I changed the statement paragraph to be closer to the references. -- Jitse Niesen (talk) 15:20, 2 May 2013 (UTC)

The real matrix vs. complex matrix decomposition
I believe the formulas given for the decomposition relate only to the real (real SPD) case. I think they should be generalised to the complex Hermitian positive-definite case. Then they would look like this:

$$l_{ii} = \sqrt{a_{ii} - \sum_{k = 1}^{i - 1}{|l_{ik}|^2}}$$

$$l_{ij} = \frac{1}{l_{jj}} \left( a_{ij} - \sum_{k = 1}^{j - 1}{l_{ik}\bar{l}_{jk}} \right), \qquad\mbox{for } i > j$$

One more question came to my mind... Why do we consider only positive-definite matrices in the complex case? Every complex number has a square root (not unique, however), so we should be able to use the above formulas for any Hermitian matrix (not necessarily PD). Of course the diagonal elements of L would not be positive reals anymore.

What is more, we should be able to use the original formulas for the symmetric real case in the symmetric complex case. So why do we care about positive-definiteness?

--A-lbi 09:49, 22 August 2007 (UTC)
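To illustrate the complex formulas above, here is a sketch of a transcription into Python (my own helper names). Note where positive-definiteness enters: for a Hermitian positive-definite input, the quantity under the square root is guaranteed to be real and positive at every step.

```python
import math

def cholesky_complex(A):
    """Cholesky factorization A = L L* for a Hermitian positive-definite A."""
    n = len(A)
    L = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            if i == j:
                # For Hermitian positive-definite input, s is real and positive here.
                if abs(s.imag) > 1e-12 or s.real <= 0:
                    raise ValueError("matrix is not Hermitian positive definite")
                L[i][i] = complex(math.sqrt(s.real), 0)
            else:
                L[i][j] = s / L[j][j]
    return L

A = [[complex(2, 0), complex(0, -1)],
     [complex(0, 1), complex(2, 0)]]   # Hermitian, eigenvalues 1 and 3
L = cholesky_complex(A)
# verify A = L L*
for i in range(2):
    for j in range(2):
        v = sum(L[i][k] * L[j][k].conjugate() for k in range(2))
        assert abs(v - A[i][j]) < 1e-12
```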

Possible mistake?

 * $$ l_{i,i} = \sqrt{ a_{i,i} - \sum_{k=1}^{i-1} l_{i,k}^2 }. $$

Are we sure $$\sum_{k=1}^{i-1}$$ is right? For the first value of the matrix we have i=1, so:
 * $$ l_{1,1} = \sqrt{ a_{1,1} - \sum_{k=1}^{0} l_{1,k}^2 }. $$

So k goes from 1 to 0? Also, the next value below the diagonal is (i = 2, j = 1):


 * $$ l_{i,j} = \frac{1}{l_{j,j}} \left( a_{i,j} - \sum_{k=1}^{j-1} l_{i,k} l_{j,k} \right), \qquad\mbox{for } i>j. $$


 * $$ l_{2,1} = \frac{1}{l_{1,1}} \left( a_{2,1} - \sum_{k=1}^{0} l_{2,k} l_{1,k} \right)$$

Again the sum seems strange. —Preceding unsigned comment added by 84.223.150.164 (talk) 21:54, 27 March 2008 (UTC)


 * By convention,
 * $$ \sum_{k=1}^{0} l_{1,k}^2 = 0, $$
 * see empty sum. So, for the (1,1) entry, we get
 * $$ l_{1,1} = \sqrt{ a_{1,1} - \sum_{k=1}^{0} l_{1,k}^2 } = \sqrt{a_{1,1}}. $$
 * Indeed, if we look at a 1x1 matrix [a], then its Cholesky decomposition is [a] = [l]* [l], so l is the square root of a. So at first sight everything seems fine. -- Jitse Niesen (talk) 22:23, 27 March 2008 (UTC)
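The empty-sum convention is automatic in most programming languages, where a sum over an empty range is 0. A direct transcription of the Cholesky–Banachiewicz formulas (a sketch for the real SPD case, my own function name):

```python
import math

def cholesky(A):
    """Cholesky-Banachiewicz for a real SPD matrix.  sum() over an empty
    range is 0, so i = j = 1 gives l11 = sqrt(a11) automatically."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))  # empty when j == 0
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

A = [[4.0, 2.0], [2.0, 5.0]]
print(cholesky(A))  # [[2.0, 0.0], [1.0, 2.0]]
```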

Pivot?
As someone who doesn't know much about mathematics, could there be a definition (or a link to a definition) on what is meant by pivot in the sentence "Any square matrix A with non-zero pivots can be written as the product of..."? Please forgive my ignorance but hey - that's why I'm reading the article in the first place ;-) —Preceding unsigned comment added by 158.234.250.71 (talk • contribs) 16:30, 1 July 2008

Bump for this one: it would be useful to have a description of the pivoting procedure. — Preceding unsigned comment added by 128.125.86.205 (talk) 20:35, 5 August 2013 (UTC)

BigO of Cholesky vs LU?
I added a numerical recipes cite in the first paragraph and changed the sentence to read "roughly half" instead of "half [as many operations]". It is still possible that Cholesky decomposition requires literally half as many operations but we should cite a source before saying so. I found several other sources that imply it is but that never say specifically. Jason Quinn (talk) 13:07, 29 January 2009 (UTC)


 * Trefethen and Bau (Numerical Linear Algebra, 1997) say it is (2/3)m^3 flops for LU (p. 152) and (1/3)m^3 flops for Cholesky. PDBailey (talk) 00:24, 30 January 2009 (UTC)


 * They ignore lower order terms, though. I think it would be quite a coincidence if the number of operations were exactly halved. -- Jitse Niesen (talk) 13:36, 30 January 2009 (UTC)


 * You should add the Trefethen reference to the article. Probably a large segment of the readership of this article will be interested in the big-O order. Jason Quinn (talk) 17:20, 30 January 2009 (UTC)


 * How does it make any sense to talk about FLOPS? How many steps an algorithm takes has nothing to do with floating-point operations or with how many are performed per second. Giving the order of the number of steps in Landau notation doesn't really help as a comparison to LU decomposition either. Steve Checkoway (talk) 23:11, 24 January 2011 (UTC)
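One way to see where "roughly half" comes from is to count the multiply–add pairs in straightforward triple-loop implementations (a back-of-the-envelope sketch, ignoring square roots, divisions, and lower-order work): Cholesky does about n^3/6 of them versus about n^3/3 for Gaussian elimination, so the ratio approaches 1/2 but is not exactly 1/2 for any finite n.

```python
def cholesky_mults(n):
    # multiply-add pairs in the inner sum of Cholesky-Banachiewicz: ~ n^3/6
    return sum(j for i in range(n) for j in range(i + 1))

def lu_mults(n):
    # multiply-add pairs in Gaussian elimination without pivoting:
    # for pivot k, each of the (n-k-1) remaining rows needs (n-k-1) updates
    return sum((n - k - 1) ** 2 for k in range(n))  # ~ n^3/3

for n in (10, 100, 1000):
    print(n, cholesky_mults(n) / lu_mults(n))  # approaches 0.5 from above
```

This supports both remarks above: the leading terms are n^3/6 versus n^3/3, but the lower-order terms keep the ratio from being exactly one half.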

The Cholesky-Banachiewicz and Cholesky-Crout algorithms ??
The hyphenated terminology -Banachiewicz and -Crout does not exist in numerical analysis. The use of these phrases is an attempt by someone to create a historical reference that is not appropriate. I sympathize with whoever was the author, because Banachiewicz and Crout did have something to do with solving linear systems and deserve more credit. However, if that history is wanted, a proper history section should be added rather than the made-up names. :-) Jfgrcar (talk) 02:21, 26 September 2009 (UTC)

Hermitian case when not pos-definite
With some web searching, I found many people asking why we need the positive definite restriction in the complex (Hermitian) case, but didn't find anyone attempting an answer. In the real case, the algorithm gives us a way of detecting non-positive-definiteness, but how do you do that in the complex case?

I notice that in the complex case, you end up with:

$$l_{ii} l_{ii}^* = a_{ii} - \sum_{k = 1}^{i - 1}{|l_{ik}|^2}$$

If the right-hand side is not positive real, I don't think you can find $$l_{ii}$$. Notice that it isn't a straight square root, because the diagonal term times its conjugate must be equal to the right-hand side, as opposed to the term times itself.

Do either or both of these conjectures hold (when a is Hermitian)?:
 * if a is not positive definite, then $$a_{ii} - \sum_{k = 1}^{i - 1}{|l_{ik}|^2} \le 0$$ for some i.
 * If $$a_{ii} - \sum_{k = 1}^{i - 1}{|l_{ik}|^2} \le 0$$ for some i, then a is not positive definite. —Preceding unsigned comment added by 67.160.223.41 (talk) 23:02, 13 December 2009 (UTC)
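Regarding the second conjecture, I believe the key observation is that for Hermitian A the quantity $$a_{ii} - \sum_{k=1}^{i-1}{|l_{ik}|^2}$$ is always real (a real diagonal entry minus a sum of squared moduli), so the algorithm can test its sign exactly as in the real case; since the algorithm succeeds for every Hermitian positive-definite matrix in exact arithmetic, hitting a nonpositive value certifies that A is not positive definite. A small sketch (my own function, real symmetric and hence Hermitian indefinite example):

```python
import math

def try_cholesky(A):
    """Attempt Cholesky on a Hermitian matrix.  Returns (L, None) on success,
    or (None, i) when a_ii - sum |l_ik|^2 is <= 0 at row i, which certifies
    that A is not positive definite."""
    n = len(A)
    L = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            if i == j:
                if s.real <= 0:        # s is real when A is Hermitian
                    return None, i
                L[i][i] = complex(math.sqrt(s.real), 0)
            else:
                L[i][j] = s / L[j][j]
    return L, None

# Hermitian but indefinite (eigenvalues 3 and -1): detected at the second pivot.
A = [[complex(1), complex(2)], [complex(2), complex(1)]]
L, bad = try_cholesky(A)
print(bad)  # 1 (0-based row where a_ii - sum |l_ik|^2 = 1 - 4 = -3 <= 0)
```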

Upper triangular
I know that many built-in routines that compute this decomposition use an upper-triangular formulation. Although lower triangular is definitely more classic, isn't the upper-triangular version, i.e. X^TX = U^TU with U upper triangular, also a Cholesky decomposition? MATThematical (talk) 23:18, 14 March 2010 (UTC)
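As far as I can tell, yes: if A = LL^T with L lower triangular, then U = L^T is upper triangular and A = U^T U, so the two conventions carry the same information (LAPACK's `potrf`, for instance, accepts either via its `uplo` argument). A tiny sketch using a hand-computed 2x2 factor:

```python
# For A = [[4, 2], [2, 5]], the lower Cholesky factor is L = [[2, 0], [1, 2]].
L = [[2.0, 0.0], [1.0, 2.0]]
U = [[L[j][i] for j in range(2)] for i in range(2)]   # U = L^T = [[2, 1], [0, 2]]
# U^T U reproduces A, so the upper-triangular convention is the same factorization.
recon = [[sum(U[k][i] * U[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(U)      # [[2.0, 1.0], [0.0, 2.0]]
print(recon)  # [[4.0, 2.0], [2.0, 5.0]]
```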

Intuition?
I understand the intuition of the eigendecomposition of a Hermitian matrix (for a covariance, it's the variances in principal directions; for a coordinate transformation it is similar). What is the intuitive, physical meaning of Cholesky decomposition? There must be some interpretation...? —Ben FrantzDale (talk) 05:05, 19 April 2010 (UTC)

I hope (as usual) that someone can correct me if I've got the wrong idea, but I think that roughly speaking Cholesky decomposition is from a conceptual-geometric viewpoint an obfuscated version of Gram-Schmidt orthonormalization. With vanilla Gram-Schmidt you orthonormalize a non-standard basis with respect to the standard inner product, whereas with Cholesky decomposition you orthonormalize the standard basis with respect to a non-standard inner product. The matrix being decomposed is that non-standard inner product, and the decomposition matrix is the operator transforming between the orthonormalized basis and the original standard basis.

The more general version of Gram-Schmidt orthonormalization that allows both a non-standard basis and a non-standard inner product specializes either to vanilla Gram-Schmidt (by taking the inner product to be standard) or to Cholesky decomposition (by taking the basis to be standard).

(As requested I'm only trying to address the intuitive picture here, rather than the gritty computational details which might motivate the choice of one algorithm over another in practical situations.)

There's a Math Stack Exchange question on "Cholesky Decomposition and Orthogonalization" with answers from Victor Liu that I hope roughly agree with my description here, although I didn't completely decipher their notation. The wikipedia article on Gram-Schmidt also mentions Cholesky decomposition but again I didn't decipher their full explanation. 68.196.182.19 (talk) 16:35, 29 August 2019 (UTC)
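A small numerical illustration of this picture (a sketch in my own notation, not Victor Liu's): Gram–Schmidt the standard basis under the inner product <x, y> = x^T A y. Each resulting vector q_j only involves e_1, ..., e_j, so the matrix Q with columns q_j is upper triangular, and A-orthonormality means Q^T A Q = I; inverting gives A = Q^{-T} Q^{-1}, i.e. L = Q^{-T} is exactly the Cholesky factor.

```python
import math

def a_inner(A, x, y):
    """Non-standard inner product <x, y> = x^T A y."""
    n = len(A)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))

def gram_schmidt_standard_basis(A):
    """Orthonormalize e_1, ..., e_n with respect to <x, y> = x^T A y."""
    n = len(A)
    qs = []
    for j in range(n):
        v = [1.0 if i == j else 0.0 for i in range(n)]   # start from e_j
        for q in qs:
            c = a_inner(A, v, q)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(a_inner(A, v, v))
        qs.append([vi / norm for vi in v])
    return [[qs[j][i] for j in range(n)] for i in range(n)]  # columns are q_j

A = [[4.0, 2.0], [2.0, 5.0]]
Q = gram_schmidt_standard_basis(A)
print(Q)  # [[0.5, -0.25], [0.0, 0.5]] -- upper triangular, Q = (L^T)^{-1}
# check Q^T A Q = I
AQ = [[sum(A[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
I = [[sum(Q[k][i] * AQ[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(I)  # identity
```

Here the lower Cholesky factor of A is [[2, 0], [1, 2]], and Q above is indeed the inverse of its transpose, matching the "orthonormalize the standard basis under a non-standard inner product" description.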

for MC simulation
wouldn't you need the square root matrix of the inverse of the Hessian, not the square root of the Hessian? 151.200.188.223 (talk) 06:42, 26 February 2011 (UTC)

The Cholesky algorithm
In the explanation of the Cholesky algorithm, I think the part that relates A(i) to A(i+1) is impossible to understand without knowing how B(i+1) is related to B(i). It feels like something is missing. — Preceding unsigned comment added by 83.251.122.201 (talk) 21:11, 15 August 2012 (UTC)

I agree with this last comment: unless you already know how the algorithm works, it is hard to understand the role of B. In the text "At step i, the matrix A(i) has the following form:" one should stress the sizes of B and b, and that they represent a partition of the matrix; it could also be said that when i=1 the identity block in the top-left corner and the zeros to its right and below it are empty. 83.223.227.96 (talk) 12:34, 9 August 2019 (UTC)
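For what it's worth, the relation the article is describing can be seen in the recursive (outer-product) form of the algorithm, sketched here with my own variable names: at each step, peel off the first row and column and recurse on the updated trailing submatrix, which is how the next A^(i) arises from the current one.

```python
import math

def cholesky_outer(A):
    """Recursive outer-product Cholesky: write A = [[a, b^T], [b, B]],
    emit the first column of L, then recurse on the Schur complement
    B - b b^T / a (the updated trailing submatrix)."""
    n = len(A)
    if n == 0:
        return []
    a = A[0][0]
    col = [A[i][0] / math.sqrt(a) for i in range(n)]    # first column of L
    schur = [[A[i][j] - A[i][0] * A[j][0] / a           # B - b b^T / a
              for j in range(1, n)] for i in range(1, n)]
    sub = cholesky_outer(schur)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][0] = col[i]
    for i in range(n - 1):
        for j in range(n - 1):
            L[i + 1][j + 1] = sub[i][j]
    return L

A = [[4.0, 2.0], [2.0, 5.0]]
print(cholesky_outer(A))  # [[2.0, 0.0], [1.0, 2.0]]
```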

negative complex number
It is said that "L is a lower triangular matrix with nonnegative diagonal entries". L in general is a matrix with complex numbers. What does it mean that the complex numbers on the diagonal are non-negative? — Preceding unsigned comment added by Raffamaiden (talk • contribs) 16:07, 1 May 2013 (UTC)
 * The diagonal entries of L are real. Here is an argument. A is Hermitian positive definite, so it has real eigenvalues and is unitarily diagonalizable, and thus has a Hermitian positive definite matrix square root; if A = U D U^H, then S = A^{1/2} = U D^{1/2} U^H, where D^{1/2} has (real) positive entries. S in turn has a QR decomposition S = QR, where R is upper triangular and can be chosen with real positive diagonal entries. We have A = S^2 = S^H S = R^H Q^H Q R = LL^H, where L = R^H and thus has real diagonal entries. The Cholesky decomposition is unique, so this construction shows that the diagonal entries of L are real. Khromegnome (talk) 21:07, 1 May 2013 (UTC)

Cholesky decomposition
It states
 * $$\| \mathbf{L}_k \|^2 \le \| \mathbf{L}_k \mathbf{L}_k ^* \|.$$

The operator norm is submultiplicative, which only gives the reverse inequality; why should this be the case? --Erzbischof (talk) 11:14, 9 July 2013 (UTC)


 * The case of $$\mathbf{L}_k \mathbf{L}_k ^*$$ is special. For example, in the induced L2 norm (i.e. the operator norm induced by the L2 norm on the vector space), $$\| A \|^2 = \| AA^* \|$$: $$\| A \|^2$$ is the square of the maximum singular value of A and this is the same as the maximum singular value of $$AA^*$$. I'm not sure off the top of my head about other induced norms, but it seems plausible (which doesn't obviate the need for a reference). Indeed, the whole section lacks references. — Steven G. Johnson (talk) 14:23, 9 July 2013 (UTC)
 * Ah, thank you, so it holds for the spectral norm, and then norm equivalence gives a bound anyway. On the other hand, $$\| A \|^2 = \| AA^* \|$$ looks like a Hilbert space property. --Erzbischof (talk) 21:37, 9 July 2013 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified one external link on Cholesky decomposition. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FAQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20110716060800/http://www.eti.pg.gda.pl/katedry/kams/wwwkams/pdf/Cholesky_fmprg.pdf to http://www.eti.pg.gda.pl/katedry/kams/wwwkams/pdf/Cholesky_fmprg.pdf

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 16:09, 5 August 2017 (UTC)