Tensor completion

In mathematics, tensor completion (also known as tensor recovery) is the recovery of a tensor when a subset of its entries is missing or distorted. Tensors are commonly used to represent multidimensional data, and the problem of estimating them from incomplete data is a generalization of matrix completion to higher dimensions. Tensor completion is also closely related to the higher-order singular value decomposition. As in the matrix completion problem, structural assumptions about the tensor are usually made, the most common being low rank. The remainder of this article focuses specifically on low-rank tensor completion.

Tensors
A tensor is a multidimensional array. Formally, let $$ \mathbf{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N} $$ be a tensor of order $$N$$ and size $$ I_1 \times \cdots \times I_N $$. The $$(i_1,\ldots,i_N)^\text{th} $$ entry of $$ \mathbf{X} $$ is denoted $$x_{i_1,\ldots,i_N} $$. As a running example, let $$\mathbf{X} \in \mathbb{R}^{3 \times 4 \times 2} $$ be the order-3 tensor whose frontal slices are
 * $$x_{\cdot,\cdot,1} = \begin{bmatrix} 1 & 4 & 7 & 10 \\ 2 & 5 & 8 & 11 \\ 3 & 6 & 9 & 12 \end{bmatrix}, \;\;\;\; x_{\cdot,\cdot,2} = \begin{bmatrix} 13 & 16 & 19 & 22 \\ 14 & 17 & 20 & 23 \\ 15 & 18 & 21 & 24 \end{bmatrix}.$$

Fibers are the higher-order analogue of rows and columns in matrices, defined by fixing all but one index. The mode-n fibers are all of the fibers obtained by fixing all indices except that of dimension n. For example, the mode-1 fibers and mode-2 fibers of a matrix are respectively its columns and rows. In the running example, there are twelve mode-3 fibers of $$\mathbf{X} $$, each of length two with corresponding elements taken from the two frontal slices.
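The running example and its mode-3 fibers can be reproduced in a few lines of NumPy (a minimal sketch; the variable names are illustrative):

```python
import numpy as np

# The running example: np.arange in Fortran order reproduces exactly the
# frontal slices shown above (x[0,0,0] = 1, x[0,0,1] = 13, ...).
X = np.arange(1, 25).reshape(3, 4, 2, order="F")

# A mode-3 fiber fixes i1 and i2 and lets i3 run; it has length 2 here.
fiber = X[0, 0, :]                     # the fiber x_{1,1,:} = (1, 13)
n_fibers = X.shape[0] * X.shape[1]     # 3 * 4 = 12 mode-3 fibers
```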

The mode-n matricization (also known as unfolding or flattening) of $$\mathbf{X}$$, denoted $$X_{(n)} $$, is the matrix whose columns are the mode-n fibers. The specific ordering of the columns will not matter here. The three mode-n unfoldings of our running example are
 * $$ X_{(1)} = \begin{bmatrix} 1 & 4 & 7 & 10 & 13 & 16 & 19 & 22 \\ 2 & 5 & 8 & 11 & 14 & 17 & 20 & 23 \\ 3 & 6 & 9 & 12 & 15 & 18 & 21 & 24 \end{bmatrix}, $$
 * $$ X_{(2)} = \begin{bmatrix} 1 & 2 & 3 & 13 & 14 & 15 \\ 4 & 5 & 6 & 16 & 17 & 18 \\ 7 & 8 & 9 & 19 & 20 & 21 \\ 10 & 11 & 12 & 22 & 23 & 24 \end{bmatrix}, $$
 * $$ X_{(3)} = \begin{bmatrix} 1 & 2 & 3 & 4 & \cdots & 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 & \cdots & 21 & 22 & 23 & 24 \end{bmatrix}.$$
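Assuming NumPy, the mode-n unfolding can be computed by moving axis n to the front and reshaping in Fortran order (this gives the common Kolda-Bader column ordering; as noted above, any fixed ordering works):

```python
import numpy as np

def unfold(X, n):
    """Mode-n matricization: the columns are the mode-n fibers
    (Kolda-Bader column ordering)."""
    return np.reshape(np.moveaxis(X, n, 0), (X.shape[n], -1), order="F")

# The running example, stored entry-for-entry as above.
X = np.arange(1, 25).reshape(3, 4, 2, order="F")

X1 = unfold(X, 0)   # 3 x 8,  first row: 1 4 7 10 13 16 19 22
X2 = unfold(X, 1)   # 4 x 6,  first row: 1 2 3 13 14 15
X3 = unfold(X, 2)   # 2 x 12, rows: 1..12 and 13..24
```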

The n-rank of $$ \mathbf{X} $$, denoted $$ \text{rank}_n(\mathbf{X}) $$, is defined to be the matrix rank of the mode-n matricization:
 * $$ \text{rank}_n(\mathbf{X}) = \text{rank}(X_{(n)}) $$

This concept of tensor rank is closely related to the Tucker decomposition.
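Since the n-rank is simply the matrix rank of an unfolding, it is straightforward to compute; for the running example all three n-ranks turn out to equal 2 (each column of every unfolding is an affine progression). A sketch, reusing the unfolding convention above:

```python
import numpy as np

def unfold(X, n):
    # Mode-n matricization (columns are the mode-n fibers).
    return np.reshape(np.moveaxis(X, n, 0), (X.shape[n], -1), order="F")

X = np.arange(1, 25).reshape(3, 4, 2, order="F")   # the running example

# rank_n(X) = rank of the mode-n unfolding.
n_ranks = [np.linalg.matrix_rank(unfold(X, n)) for n in range(3)]
```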

Low rank tensor completion
Let $$\mathbf{M} \in \mathbb{R}^{I_1 \times \cdots \times I_N} $$ be the true tensor to be recovered. It is assumed that the n-ranks are small, i.e. $$ \text{rank}_n(\mathbf{M}) $$ is small compared to $$I_n $$ for some or all $$n$$.

Completing missing entries
A standard problem is to observe some entries of $$\mathbf{M} $$ and fill in the remaining entries. Let $$ \Omega $$ be the set of indices observed. Then the goal is to recover the tensor of minimum rank among all tensors that match the data. This is formulated as
 * $$\begin{align} & \operatorname{minimize}_\mathbf{X} & & \sum_{n=1}^N \text{rank}(X_{(n)}) \\ & \operatorname{subject\;to} & & \mathbf{X}_{i_1,\ldots,i_N} = \mathbf{M}_{i_1,\ldots,i_N} \;\; \forall (i_1,\ldots,i_N)\in\Omega. \end{align}$$

Unfortunately, this problem is difficult to solve because the rank function $$ \text{rank}(X) $$ is not convex. In fact, even in the special case where $$ \mathbf{M} $$ is a matrix, the problem is NP-hard.

In matrix completion, a common approach is to replace the rank in the objective function with the Schatten 1-norm (also called the nuclear norm), denoted $$\| \cdot \|_* $$. This gives a convex problem that is tractable and enjoys a number of recovery guarantees. One such guarantee is the celebrated result of Emmanuel Candès and Benjamin Recht: given an $$I \times I$$ matrix of rank $$r$$ satisfying certain incoherence properties, if $$m$$ entries are chosen uniformly at random and observed, then the matrix can be exactly recovered with high probability by nuclear norm minimization as long as $$ m \gg r I^{5/4} \log I $$.
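For intuition, the nuclear norm of a matrix is the sum of its singular values; NumPy exposes it directly as the `'nuc'` matrix norm. A quick check on a random low-rank matrix (a sketch, not tied to any particular completion algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)
# A random 5 x 5 matrix of rank 2, built from 5 x 2 and 2 x 5 factors.
M = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 5))

# The nuclear norm equals the sum of the singular values.
nuc = np.linalg.norm(M, "nuc")
assert np.isclose(nuc, np.linalg.svd(M, compute_uv=False).sum())
```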

The same idea can be extended to the tensor completion problem above, i.e. the following convex problem is solved:
 * $$\begin{align} & \operatorname{minimize}_\mathbf{X} & & \sum_{n=1}^N \|X_{(n)}\|_* \\ & \operatorname{subject\;to} & & \mathbf{X}_{i_1,\ldots,i_N} = \mathbf{M}_{i_1,\ldots,i_N} \;\; \forall (i_1,\ldots,i_N)\in\Omega. \end{align}$$
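The relaxed objective is straightforward to evaluate: it is the sum of the nuclear norms of all unfoldings. A sketch for the running example (the `unfold` helper is illustrative, matching the convention used earlier):

```python
import numpy as np

def unfold(X, n):
    # Mode-n matricization (columns are the mode-n fibers).
    return np.reshape(np.moveaxis(X, n, 0), (X.shape[n], -1), order="F")

X = np.arange(1, 25).reshape(3, 4, 2, order="F")   # the running example

# Sum of the nuclear norms of the three unfoldings.
objective = sum(np.linalg.norm(unfold(X, n), "nuc") for n in range(3))
```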

General tensor recovery
The information about $$\mathbf{M} $$ may be more general than above. Let $$ \mathcal{A}: \mathbb{R}^{I_1 \times \cdots \times I_N} \to \mathbb{R}^M $$ be a linear map, and let $$ b $$ be a possibly noisy observation of $$ \mathcal{A}(\mathbf{M}) $$. The problem is then, for some appropriate choice of $$\epsilon $$:
 * $$\begin{align} & \operatorname{minimize}_\mathbf{X} & & \sum_{n=1}^N \|X_{(n)}\|_* \\ & \operatorname{subject\;to} & & \| \mathcal{A}(\mathbf{X}) - b \|_2 \le \epsilon. \end{align}$$

Recovery guarantees
Finding recovery guarantees for nuclear norm minimization in tensor completion that are analogous to the existing guarantees in matrix completion remains an open problem. One result for the case of noisy observations states that for a balanced order-$$N$$ tensor, i.e. one of size $$I \times \cdots \times I $$ with $$\text{rank}_n = r $$ for all $$n $$, if each entry is distorted by an independent Gaussian variable with mean zero and variance $$\sigma^2 $$, then nuclear norm minimization recovers the original tensor with high probability as long as $$ \sigma^2 \ll r/n $$. This result can be extended to unbalanced tensors.

Algorithms
There is a large body of research into solution techniques for the low-rank tensor recovery formulation above. For convenience, the unconstrained, penalized formulation is usually considered:
 * $$\operatorname{minimize}_\mathbf{X} \;\; \lambda \|\mathcal{A}(\mathbf{X}) - b\|_2 + \sum_{n=1}^N \|X_{(n)}\|_* $$
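A key ingredient for optimizing objectives of this form is the proximal operator of the nuclear norm, which has a closed form via singular value thresholding: shrink each singular value by the threshold and clip at zero, leaving the singular vectors unchanged. A minimal NumPy sketch:

```python
import numpy as np

def svt(Z, tau):
    """Singular value thresholding: the proximal operator of
    tau * ||.||_* evaluated at the matrix Z."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# Thresholding shrinks every singular value by tau and clips at zero.
rng = np.random.default_rng(1)
Z = rng.standard_normal((4, 4))
s = np.linalg.svd(Z, compute_uv=False)
s_new = np.linalg.svd(svt(Z, 0.5), compute_uv=False)
assert np.allclose(s_new, np.maximum(s - 0.5, 0.0))
```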

The problem falls into the class of convex optimization problems, for which a variety of solution methods exist. However, some standard methods are ill-suited to this particular objective, since it is a sum of many terms and the nuclear norm is non-differentiable. Furthermore, tensor data is often very large in practice, making standard methods intractable.

Proximal gradient methods, which are designed for minimizing sums of convex, possibly non-differentiable functions, are a natural candidate. In particular, this problem fits into the framework of proximal gradient methods for learning. Solution approaches that have had computational success include the alternating direction method of multipliers (ADMM) and Douglas-Rachford splitting, though this remains an active area of research.
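To illustrate the general flavor of such splitting schemes (this is a simplified heuristic sketch in the spirit of alternating methods for this problem, not a faithful implementation of any published algorithm), one can alternate singular value thresholding on each unfolding, average the folded results, and re-impose the observed entries:

```python
import numpy as np

def unfold(X, n):
    # Mode-n matricization (columns are the mode-n fibers).
    return np.reshape(np.moveaxis(X, n, 0), (X.shape[n], -1), order="F")

def fold(M, n, shape):
    # Inverse of unfold: reshape back, then move the first axis into place.
    moved = list(shape)
    moved.insert(0, moved.pop(n))
    return np.moveaxis(np.reshape(M, moved, order="F"), 0, n)

def svt(Z, tau):
    # Proximal operator of tau * nuclear norm (singular value thresholding).
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def complete(M_obs, mask, tau=0.1, iters=100):
    """Fill in unobserved entries (mask == True where observed) by averaging
    thresholded unfoldings and re-imposing the observed entries."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(iters):
        X = np.mean([fold(svt(unfold(X, n), tau), n, X.shape)
                     for n in range(X.ndim)], axis=0)
        X[mask] = M_obs[mask]   # keep the observed entries fixed
    return X
```

On small low-rank examples this heuristic keeps the observed entries fixed while pulling each unfolding toward low rank; its convergence behavior depends on the threshold and is not analyzed here.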

Applications
One major application of tensor completion is the recovery of visual data. Digital two-dimensional monochromatic images can be represented as a matrix, where each entry is the intensity of the corresponding pixel, and matrix completion techniques can be used to recover images with missing or distorted entries. Similarly, tensors of order three can be used to represent three-dimensional images, where each entry is the intensity of the corresponding voxel. Three-dimensional images often arise in medical and scientific settings. For example, magnetic resonance imaging produces three-dimensional images, and tensor completion can be used when observations are missing or distorted. There are heuristic methods for applying matrix completion to three-dimensional images, including completing the entries of one of the matricizations, or completing two-dimensional slices of the image separately, but tensor completion has been shown experimentally to outperform these heuristics.

In addition, tensors can represent a wide variety of other visual data. Monochromatic videos can be represented as tensors of order 3, where two-dimensional images are stacked along the time axis. Polychromatic data can also be represented by adding an additional dimension to contain the multiple values needed to represent colors.
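For concreteness, such visual data might be laid out as tensors like this, with a random mask simulating missing pixels (all sizes and names below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# A hypothetical 8-frame monochrome video of 32 x 32 pixels: an order-3
# tensor with the frames stacked along the time axis.
video = rng.random((32, 32, 8))

# Adding a colour axis gives an order-4 tensor (height x width x channel x time).
color_video = rng.random((32, 32, 3, 8))

# Simulate missing data: observe a random 60% of the entries;
# NaN marks the missing pixels.
mask = rng.random(video.shape) < 0.6
observed = np.where(mask, video, np.nan)
```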

There are many other applications of tensors in which low rank completion is used, including
 * Audio processing
 * Computational linguistics
 * Signal processing for radar
 * Latent variable models
 * Multi-task learning