Moreau envelope

The Moreau envelope (or the Moreau-Yosida regularization) $$M_f$$ of a proper lower semi-continuous convex function $$f$$ is a smoothed version of $$f$$. It was proposed by Jean-Jacques Moreau in 1965.

The Moreau envelope has important applications in mathematical optimization: minimizing $$M_f$$ and minimizing $$f$$ are equivalent problems in the sense that the sets of minimizers of $$f$$ and $$M_f$$ are the same. The advantage of working with $$M_f$$ is that $$f$$ may be non-differentiable while $$M_f$$ is always continuously differentiable, so first-order optimization algorithms can be applied to $$M_f$$ directly. Indeed, many proximal gradient methods can be interpreted as gradient descent on $$M_f$$.

Definition
The Moreau envelope of a proper lower semi-continuous convex function $$f$$ from a Hilbert space $$\mathcal{X}$$ to $$(-\infty,+\infty]$$ is defined as

$$M_{\lambda f}(v) = \inf_{x\in\mathcal{X}} \left(f(x) + \frac{1}{2\lambda} \|x - v\|_2^2\right).$$

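Here $$\lambda > 0$$ is a parameter, and $$M_{\lambda f}$$ is also called the Moreau envelope of $$f$$ with parameter $$\lambda$$. The infimum is attained at a unique point, which is the value $$\mathrm{prox}_{\lambda f}(v)$$ of the proximal operator of $$f$$ with parameter $$\lambda$$.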
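As an illustration of the definition, the following minimal sketch compares two ways of evaluating the envelope of $$f(x) = |x|$$ on the real line. For this choice of $$f$$ the envelope has a known closed form, the Huber function. The function names `moreau_env_abs` and `moreau_env_grid`, the grid and the test points are illustrative choices only, and the brute-force minimization over a grid is just a numerical check, not a recommended way to compute envelopes.

```python
import numpy as np


def moreau_env_abs(v, lam):
    """Closed-form Moreau envelope of f(x) = |x| with parameter lam (Huber function)."""
    v = np.asarray(v, dtype=float)
    quadratic = v**2 / (2.0 * lam)       # branch used where |v| <= lam
    linear = np.abs(v) - lam / 2.0       # branch used where |v| > lam
    return np.where(np.abs(v) <= lam, quadratic, linear)


def moreau_env_grid(f, v, lam, grid):
    """Approximate the infimum in the definition by minimizing over a fine grid."""
    return np.min(f(grid) + (grid - v) ** 2 / (2.0 * lam))


lam = 0.5                                        # assumed parameter value
grid = np.linspace(-5.0, 5.0, 200001)            # assumed search grid, spacing 5e-5
points = np.array([-2.0, -0.3, 0.0, 0.2, 1.5])   # assumed evaluation points

closed_form = moreau_env_abs(points, lam)
brute_force = np.array([moreau_env_grid(np.abs, v, lam, grid) for v in points])

# The two evaluations should agree up to the grid resolution.
print(np.max(np.abs(closed_form - brute_force)))
```

The envelope is quadratic near the kink of $$|x|$$ and linear far from it, which is how the smoothing shows up in this one-dimensional case.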

Properties
 * The Moreau envelope can also be seen as the infimal convolution between $$f$$ and $$\frac{1}{2\lambda}\| \cdot \|^2_2$$.
 * The proximal operator of a function is related to the gradient of the Moreau envelope by the following identity:

$$\nabla M_{\lambda f}(x) = \frac{1}{\lambda} (x - \mathrm{prox}_{\lambda f}(x)).$$

Defining the sequence $$x_{k+1} = \mathrm{prox}_{\lambda f}(x_k)$$ and using the above identity gives $$x_{k+1} = x_k - \lambda \nabla M_{\lambda f}(x_k)$$, so the iteration of the proximal operator can be interpreted as gradient descent on the Moreau envelope with step size $$\lambda$$.
 * Using Fenchel's duality theorem, one can derive the following dual formulation of the Moreau envelope:

$$M_{\lambda f}(v) = \max_{p \in \mathcal X} \left( \langle p, v \rangle - \frac{\lambda}{2} \| p \|^2 - f^*(p)\right),$$

where $$f^*$$ denotes the convex conjugate of $$f$$. Since the subdifferential of a proper, convex, lower semicontinuous function on a Hilbert space is inverse to the subdifferential of its convex conjugate, we can conclude that if $$p_0 \in \mathcal X$$ is the maximizer of the above expression, then $$x_0 := v - \lambda p_0$$ is the minimizer in the primal formulation and vice versa.

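 * By the Hopf–Lax formula, the Moreau envelope is a viscosity solution to a Hamilton–Jacobi equation: the function $$u(x,t) = M_{t f}(x)$$ satisfies $$\partial_t u + \frac{1}{2}\|\nabla_x u\|^2 = 0$$ with initial condition $$u(x,0) = f(x)$$. Stanley Osher and co-authors used this property and the Cole–Hopf transformation to derive an algorithm to compute approximations to the proximal operator of a function.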
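The gradient identity and the dual formulation listed above can be checked numerically in a simple case. The sketch below again assumes $$f(x) = |x|$$ on the real line, whose proximal operator is soft-thresholding and whose convex conjugate is the indicator function of $$[-1, 1]$$. All function names, the parameter value and the test point are hypothetical choices made for illustration.

```python
import numpy as np


def prox_abs(v, lam):
    """Proximal operator of f(x) = |x| with parameter lam (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def moreau_env_abs(v, lam):
    """Envelope value, obtained by plugging the proximal point into the definition."""
    p = prox_abs(v, lam)
    return np.abs(p) + (p - v) ** 2 / (2.0 * lam)


def grad_moreau_env_abs(v, lam):
    """Gradient of the envelope via the identity (v - prox(v)) / lam."""
    return (v - prox_abs(v, lam)) / lam


lam, v, h = 0.5, 1.3, 1e-6  # assumed parameter, test point, finite-difference step

# 1. Gradient identity versus a central finite difference of the envelope.
fd_grad = (moreau_env_abs(v + h, lam) - moreau_env_abs(v - h, lam)) / (2.0 * h)
id_grad = grad_moreau_env_abs(v, lam)

# 2. Iterating the proximal operator versus gradient descent on the envelope
#    with step size lam: both recursions should produce the same iterates.
x_prox = x_gd = 3.0
for _ in range(10):
    x_prox = prox_abs(x_prox, lam)
    x_gd = x_gd - lam * grad_moreau_env_abs(x_gd, lam)

# 3. Dual formulation: the conjugate of |x| is the indicator of [-1, 1], so the
#    maximization is over |p| <= 1 and the maximizer is v / lam clipped to [-1, 1].
p0 = np.clip(v / lam, -1.0, 1.0)
dual_value = p0 * v - 0.5 * lam * p0**2
primal_minimizer = v - lam * p0  # should coincide with prox_abs(v, lam)

print(fd_grad, id_grad)
print(x_prox, x_gd)
print(dual_value, moreau_env_abs(v, lam), primal_minimizer, prox_abs(v, lam))
```

In this example both iterations reach the minimizer $$0$$ of $$|x|$$, consistent with $$f$$ and its envelope having the same minimizers.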