User:MatrixHe/sandbox

MgNet is an abstract, unified mathematical framework that recovers both some ResNet-type convolutional neural networks (CNNs) and multigrid methods for solving discretized partial differential equations (PDEs). As a CNN model, MgNet can be obtained by making minor modifications to a classical geometric multigrid method. Connections between ResNet and classical multigrid methods were already acknowledged in the original ResNet paper, from the viewpoint of how residuals are used in both methods. MgNet makes this connection more direct and explicit: a class of efficient CNN models can be obtained by making minor modifications to a typical multigrid cycle while keeping the algorithmic structure unchanged.

Main structure and connections with ResNet
One core concept in MgNet, motivated by research on algebraic multigrid methods, is the distinction between the so-called data space and feature space, which are dual to each other. Based on this concept, MgNet and subsequent work propose the constrained data-feature mapping model on every grid:

$$A \ast u=f,$$

where $$f$$ belongs to the data space and $$u$$ belongs to the feature space such that

$$u \ge 0.$$

The feature extraction process can then be obtained through an iterative procedure for solving the above system on each grid. For example, applying a single-step residual correction scheme to the above system gives

$$u^{i} = u^{i-1} + \sigma \circ B^{i} \ast \sigma(f - A\ast u^{i-1}), \quad i = 1:\nu,$$

with $$u \approx u^{\nu}$$.
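The iteration above can be sketched in a few lines of plain Python. This is a minimal illustration only, under several assumptions that are not in the source: a single channel, 1D data, zero padding, $$\sigma$$ taken to be ReLU, and made-up kernel values. The helper names `relu`, `conv`, and `extract_features` are hypothetical.

```python
def relu(x):
    """Elementwise ReLU, standing in for the activation sigma (an assumption)."""
    return [max(v, 0.0) for v in x]

def conv(kernel, x):
    """Centered, same-size 1D cross-correlation with zero padding."""
    half = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        s = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n:
                s += w * x[j]
        out.append(s)
    return out

def extract_features(f, A, Bs, u0=None):
    """Apply u <- u + relu(B * relu(f - A * u)) once per kernel B in Bs."""
    u = [0.0] * len(f) if u0 is None else list(u0)
    for B in Bs:
        r = [fi - ai for fi, ai in zip(f, conv(A, u))]   # residual f - A * u
        correction = relu(conv(B, relu(r)))
        u = [ui + ci for ui, ci in zip(u, correction)]
    return u

# Illustrative values only: one shared kernel A, and one kernel B per step (nu = 2).
f = [1.0, -2.0, 0.5, 3.0, -1.0]
A = [-1.0, 2.0, -1.0]
Bs = [[0.0, 0.5, 0.0], [0.1, 0.3, 0.1]]
u = extract_features(f, A, Bs)
```

Because every correction passes through the outer activation, each update adds a nonnegative quantity, so starting from $$u^0 = 0$$ the iterates in this sketch satisfy the constraint $$u \ge 0$$ automatically.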

If the residual $$r^i = f - A\ast u^i$$ of the above iteration is considered instead, the scheme becomes

$$r^{i} = r^{i-1} - A\ast \sigma \circ B^i\ast\sigma(r^{i-1}), \quad i=1:\nu.$$
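In more detail, the residual form follows from the linearity of the convolution $$A\ast$$ alone. Substituting the update for $$u^{i}$$ into $$r^{i} = f - A\ast u^{i}$$ gives

$$\begin{aligned} r^{i} &= f - A\ast\left(u^{i-1} + \sigma\circ B^{i}\ast\sigma(f - A\ast u^{i-1})\right) \\ &= (f - A\ast u^{i-1}) - A\ast\sigma\circ B^{i}\ast\sigma(f - A\ast u^{i-1}) \\ &= r^{i-1} - A\ast\sigma\circ B^{i}\ast\sigma(r^{i-1}). \end{aligned}$$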

This is almost exactly the basic block of Pre-act ResNet, which differs only in using a step-dependent kernel $$A^i$$ in place of the shared $$A$$:

$$r^{i} = r^{i-1} - A^i \ast \sigma \circ B^i\ast\sigma(r^{i-1}), \quad i=1:\nu.$$
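The equivalence between the feature update and the residual update can also be checked numerically. The sketch below makes the same illustrative assumptions as before (single channel, 1D data, zero padding, ReLU as $$\sigma$$, invented kernel values) and uses hypothetical helper names. It runs both recursions side by side and compares $$f - A\ast u^{\nu}$$ with $$r^{\nu}$$.

```python
def relu(x):
    """Elementwise ReLU, standing in for the activation sigma (an assumption)."""
    return [max(v, 0.0) for v in x]

def conv(kernel, x):
    """Centered, same-size 1D cross-correlation with zero padding."""
    half = len(kernel) // 2
    n = len(x)
    return [
        sum(w * x[i + k - half]
            for k, w in enumerate(kernel)
            if 0 <= i + k - half < n)
        for i in range(n)
    ]

def feature_step(u, f, A, B):
    """u <- u + relu(B * relu(f - A * u))."""
    r = [fi - ai for fi, ai in zip(f, conv(A, u))]
    c = relu(conv(B, relu(r)))
    return [ui + ci for ui, ci in zip(u, c)]

def residual_step(r, A, B):
    """r <- r - A * relu(B * relu(r)); passing a different A per step gives the Pre-act ResNet form."""
    c = conv(A, relu(conv(B, relu(r))))
    return [ri - ci for ri, ci in zip(r, c)]

# Illustrative values only.
f = [1.0, -2.0, 0.5, 3.0, -1.0]
A = [-1.0, 2.0, -1.0]
Bs = [[0.0, 0.5, 0.0], [0.1, 0.3, 0.1]]

u = [0.0] * len(f)   # u^0 = 0, hence r^0 = f
r = list(f)
for B in Bs:
    u = feature_step(u, f, A, B)
    r = residual_step(r, A, B)

# Residual recomputed from the features versus the residual recursion.
r_from_u = [fi - ai for fi, ai in zip(f, conv(A, u))]
max_gap = max(abs(a - b) for a, b in zip(r_from_u, r))
```

Up to floating-point rounding, `max_gap` is zero. The check relies on the same kernel `A` being used in every step, which is the point where MgNet and Pre-act ResNet differ.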

The next figure shows the pseudocode of MgNet:

Importantly, the special MgNet in Algorithm 1 is identical to a multigrid cycle if the boxed nonlinear operations are removed from the algorithm.
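As a rough single-grid illustration of this remark, dropping the activations from the residual correction step leaves the linear iteration $$u^{i} = u^{i-1} + B\ast(f - A\ast u^{i-1})$$, which has the form of a classical smoother. The sketch below covers only this smoothing step and omits the restriction and prolongation between grids. It rests on assumptions not in the source: $$A$$ is the 1D Laplacian stencil and $$B$$ is the damped Jacobi choice with weight 2/3. The helper names are hypothetical.

```python
def conv(kernel, x):
    """Centered, same-size 1D cross-correlation with zero padding."""
    half = len(kernel) // 2
    n = len(x)
    return [
        sum(w * x[i + k - half]
            for k, w in enumerate(kernel)
            if 0 <= i + k - half < n)
        for i in range(n)
    ]

def linear_smoothing(f, A, B, steps):
    """Residual correction with the activations removed: u <- u + B * (f - A * u)."""
    u = [0.0] * len(f)
    for _ in range(steps):
        r = [fi - ai for fi, ai in zip(f, conv(A, u))]
        u = [ui + ci for ui, ci in zip(u, conv(B, r))]
    return u

def residual_norm(f, A, u):
    """Euclidean norm of f - A * u."""
    return sum((fi - ai) ** 2 for fi, ai in zip(f, conv(A, u))) ** 0.5

A = [-1.0, 2.0, -1.0]   # 1D Laplacian stencil (assumed)
B = [1.0 / 3.0]         # damped Jacobi: weight 2/3 divided by the diagonal entry 2
f = [1.0, -2.0, 0.5, 3.0, -1.0]

norm_before = residual_norm(f, A, [0.0] * len(f))
u = linear_smoothing(f, A, B, steps=10)
norm_after = residual_norm(f, A, u)
```

With these choices the iteration is a plain damped Jacobi sweep, so `norm_after` is smaller than `norm_before`.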

Summary
By revealing this direct connection between CNNs and multigrid methods, MgNet opens a new door to the design and study of deep learning models from a more mathematical viewpoint. In particular, the rich mathematical techniques developed for multigrid methods can be applied to the study of deep learning.