Talk:Orthogonal Procrustes problem

Why it must be $$A^T B$$ and not $$A^{-1} B$$
Minimizing the Frobenius norm of $$A\Omega-B$$ is equivalent to maximizing the trace of $$\Omega^T A^T B$$, since $$\|C\|_F^2=tr(C^TC)$$. Now, using the SVD $$A^TB=U\Sigma V^T$$, we get $$tr(\Omega^T A^T B)=tr(\Omega^T U\Sigma V^T)=tr(V^T \Omega^T U\Sigma)$$. Since $$V^T \Omega^T U$$ is orthogonal and $$\Sigma$$ is diagonal with non-negative entries, the trace is maximized when $$V^T \Omega^T U$$ is the identity, therefore $$\Omega^T=V U^T$$, i.e. $$\Omega=U V^T$$. No assumption concerning orthogonality has been made on $$A$$ and $$B$$; only the orthogonality of $$U$$, $$V$$ and $$\Omega$$ has been used. (See also Golub/Van Loan or the Schönemann paper.) Ezander (talk) 11:34, 24 March 2010 (UTC)
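The argument above can be checked numerically. Below is a minimal NumPy sketch (the matrix sizes, the random non-orthogonal $$A$$ and $$B$$, and the sampling of candidate orthogonal matrices are illustrative assumptions, not part of the discussion): the maximizer $$\Omega = UV^T$$ attains $$tr(\Omega^T A^T B) = \sum_i \sigma_i$$, and no randomly drawn orthogonal matrix does better.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary (non-orthogonal) A and B, as the argument requires.
A = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3))

# SVD of A^T B; the claimed maximizer of tr(Omega^T A^T B) is Omega = U V^T.
U, S, Vt = np.linalg.svd(A.T @ B)
Omega = U @ Vt

# At the maximizer, the trace equals the sum of the singular values.
best = np.trace(Omega.T @ A.T @ B)
assert np.isclose(best, S.sum())

# No randomly sampled orthogonal matrix should exceed it.
for _ in range(100):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    assert np.trace(Q.T @ A.T @ B) <= best + 1e-9
```

The same `Omega` also minimizes $$\|A\Omega - B\|_F$$, since the two objectives differ only by terms that do not depend on $$\Omega$$.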

A proof candidate
Let $$Q \in \mathbb{R}^{n \times n}$$ be an orthogonal matrix and $$M \in \mathbb{R}^{n \times n}$$ be an arbitrary matrix. Since

$$ \begin{align} \lVert Q - M \rVert^2 & = tr((Q - M)^T (Q - M)) \\ & = tr(Q^T Q - Q^T M - M^T Q + M^T M) \\ & = tr(Q^T Q) - tr(Q^T M) - tr(M^T Q) + tr(M^T M) \\ & = tr(I) + tr(M^T M) - 2tr(Q^T M), \end{align} $$

minimizing the Frobenius norm of the difference is equivalent to maximizing $$tr(Q^T M)$$. Let $$M = U \Sigma V^T$$ be the singular value decomposition of $$M$$, where $$U$$ and $$V$$ are orthogonal and $$\Sigma$$ is diagonal. Now

$$ \begin{align} tr(Q^T M) & = tr(Q^T U \Sigma V^T) \\ & = tr(V^T Q^T U \Sigma) \\ & = tr(W \Sigma), \end{align} $$

where $$W = V^T Q^T U$$ is an arbitrary orthogonal matrix. Furthermore,

$$ tr(W \Sigma) = \sum_{i = 1}^n \sum_{k = 1}^n W_{ik} \Sigma_{ki} = \sum_{i = 1}^n W_{ii} \Sigma_{ii}, $$

since $$\Sigma$$ is diagonal. Now, $$|W_{ii}|^2 \leq \sum_{k = 1}^n |W_{ki}|^2 = 1$$, since the columns of $$W$$ are unit vectors; hence $$|W_{ii}| \leq 1$$. Over arbitrary matrices $$W$$, dropping the orthogonality constraint and keeping only the constraint $$|W_{ii}| \leq 1$$ on the diagonal, the choice $$W_{ii} = 1$$ for all $$i$$ maximizes $$tr(W \Sigma)$$, since the diagonal elements of $$\Sigma$$ are non-negative. Moreover, the identity matrix $$I$$ is an orthogonal matrix with this property. Thus $$V^T Q^T U = I$$, from which it follows

$$Q = U V^T$$.

--Kaba3 (talk) 12:42, 28 October 2011 (UTC)
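The conclusion of the proof above, that $$Q = UV^T$$ is the orthogonal matrix nearest to $$M$$ in the Frobenius norm, can also be sanity-checked numerically. A minimal NumPy sketch (the size of $$M$$ and the random sampling of competing orthogonal matrices are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))

# Candidate nearest orthogonal matrix to M: Q = U V^T from M's SVD.
U, S, Vt = np.linalg.svd(M)
Q = U @ Vt

# Q is indeed orthogonal.
assert np.allclose(Q.T @ Q, np.eye(4))

# No randomly sampled orthogonal matrix should be closer to M.
d_best = np.linalg.norm(Q - M)
for _ in range(200):
    R, _ = np.linalg.qr(rng.standard_normal((4, 4)))
    assert np.linalg.norm(R - M) >= d_best - 1e-9
```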


 * Ahh, I just noticed that you gave an equivalent proof here. I tried to keep the one in the article short and simple for the sake of brevity and illustration, but let me know if I've been too terse. Thanks. Willem (talk) 00:30, 23 April 2015 (UTC)

Errors in page
A and B have the same size. Let's say m x n. Then omega and R must have size m x m, since we are multiplying A on the left by omega.

The "Solution" section then says to take the SVD of A^T * B. That has size n x n. So U, sigma, and V all have size n x n. So V * U^T has size n x n.

Clearly there are errors in here somewhere. I think if the objective were to minimize the norm of A*omega - B it would be correct, because then n x n would be the correct size for omega.

I don't know the markup used by Wikipedia. Could someone fix this? — Preceding unsigned comment added by 130.33.205.56 (talk) 00:06, 17 November 2015 (UTC)
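The dimension bookkeeping in the comment above can be made concrete. In this NumPy sketch (the sizes m = 5, n = 3 and the random matrices are illustrative assumptions), the SVD of A^T B yields an n x n factor, which matches right multiplication A @ Omega; left multiplication R @ A would instead need an m x m matrix, obtainable from the SVD of B A^T:

```python
import numpy as np

m, n = 5, 3
rng = np.random.default_rng(2)
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))

# SVD of A^T B (an n x n matrix) gives an n x n Omega = U V^T ...
U, S, Vt = np.linalg.svd(A.T @ B)
Omega = U @ Vt
assert Omega.shape == (n, n)

# ... which only fits right multiplication: A @ Omega has shape (m, n), like B.
assert (A @ Omega).shape == B.shape

# For the left-multiplied objective ||R A - B||, R must be m x m;
# it comes from the SVD of B A^T (an m x m matrix).
U2, S2, Vt2 = np.linalg.svd(B @ A.T)
R = U2 @ Vt2
assert R.shape == (m, m)
assert (R @ A).shape == B.shape
```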


 * I agree, there's this inconsistency in the page. I was actually trying to fix it but I realised then that the whole proof needs changing so I'll do that when I have more time. -- Roberto  →@me 13:29, 17 May 2016 (UTC)
 * I see this issue has not yet been addressed. The size of A, B, Ω, R should be specified in the article. The simplest way is to make them all square so that none of the issues mentioned here arise. I guess ideally we would have the general problem, solution, and proof. Not sure if we have the time for that, so please let me know if I can edit it at least to be correct for square matrices A, B. Sunbeam44 (talk) 20:28, 15 March 2024 (UTC)