Maximum principle

In the mathematical fields of differential equations and geometric analysis, the maximum principle is one of the most useful and best known tools of study. Solutions of a differential inequality in a domain $D$ satisfy the maximum principle if they achieve their maxima at the boundary of $D$.

The maximum principle enables one to obtain information about solutions of differential equations without any explicit knowledge of the solutions themselves. In particular, the maximum principle is a useful tool in the numerical approximation of solutions of ordinary and partial differential equations and in the determination of bounds for the errors in such approximations.

In a simple two-dimensional case, consider a function of two variables $u(x,y)$ such that
 * $$\frac{\partial^2u}{\partial x^2}+\frac{\partial^2u}{\partial y^2}=0.$$

The weak maximum principle, in this setting, says that for any open precompact subset $M$ of the domain of $u$, the maximum of $u$ on the closure of $M$ is achieved on the boundary of $M$. The strong maximum principle says that, unless $u$ is a constant function, the maximum cannot also be achieved anywhere on $M$ itself.
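As a rough numerical illustration of the weak maximum principle, one can sample a harmonic function on a grid and compare its largest boundary value with its largest interior value. The function $u(x,y)=x^2-y^2$, the square $[-1,1]^2$ and the grid size are choices made here purely for illustration; a grid check of this kind is a sanity test, not a proof.

```python
def u(x, y):
    # x^2 - y^2 is harmonic: u_xx + u_yy = 2 - 2 = 0
    return x * x - y * y


def boundary_and_interior_max(n=50):
    """Sample u on an (n+1) x (n+1) grid over the closed square [-1, 1]^2.

    Returns (largest sampled value on the boundary of the square,
             largest sampled value at strictly interior grid points).
    """
    boundary_max = float("-inf")
    interior_max = float("-inf")
    for i in range(n + 1):
        for j in range(n + 1):
            x = -1 + 2 * i / n
            y = -1 + 2 * j / n
            value = u(x, y)
            # Grid indices 0 and n correspond to the sides of the square.
            if i in (0, n) or j in (0, n):
                boundary_max = max(boundary_max, value)
            else:
                interior_max = max(interior_max, value)
    return boundary_max, interior_max


boundary_max, interior_max = boundary_and_interior_max()
print(boundary_max, interior_max)
```

For this particular choice the sampled boundary maximum strictly exceeds every sampled interior value, as the strong maximum principle predicts for a nonconstant harmonic function.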

Such statements give a striking qualitative picture of solutions of the given differential equation. Such a qualitative picture can be extended to many kinds of differential equations. In many situations, one can also use such maximum principles to draw precise quantitative conclusions about solutions of differential equations, such as control over the size of their gradient. There is no single or most general maximum principle which applies to all situations at once.

In the field of convex optimization, there is an analogous statement which asserts that the maximum of a convex function on a compact convex set is attained on the boundary.

A partial formulation of the strong maximum principle
Here we consider the simplest case, although the same thinking can be extended to more general scenarios. Let $M$ be an open subset of Euclidean space and let $u$ be a $C^{2}$ function on $M$ such that
 * $$\sum_{i=1}^n\sum_{j=1}^n a_{ij}\frac{\partial^2u}{\partial x^i\,\partial x^j}=0$$

where for each $i$ and $j$ between 1 and $n$, $a_{ij}$ is a function on $M$ with $a_{ij} = a_{ji}$.

Fix some choice of $x$ in $M$. According to the spectral theorem of linear algebra, all eigenvalues of the matrix $[a_{ij}(x)]$ are real, and there is an orthonormal basis of $ℝ^{n}$ consisting of eigenvectors. Denote the eigenvalues by $λ_{i}$ and the corresponding eigenvectors by $v_{i}$, for $i$ from 1 to $n$. Then the differential equation, at the point $x$, can be rephrased as
 * $$\sum_{i=1}^n \lambda_i \left. \frac{d^2}{dt^2}\right|_{t=0}\big(u(x+tv_i)\big)=0.$$

The essence of the maximum principle is the simple observation that if each eigenvalue is positive (which amounts to a certain formulation of "ellipticity" of the differential equation) then the above equation imposes a certain balancing of the directional second derivatives of the solution. In particular, if one of the directional second derivatives is negative, then another must be positive. At a hypothetical point where $u$ is maximized, all directional second derivatives are automatically nonpositive, and the "balancing" represented by the above equation then requires all directional second derivatives to be identically zero.
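The balancing described above can be checked numerically in a small case. The following sketch assumes a hypothetical $2\times 2$ positive-definite coefficient matrix and a hypothetical negative-definite Hessian (as would occur at a strict local maximum); neither comes from the text. It verifies that $\sum_{i,j}a_{ij}\,\partial_i\partial_j u$ equals the eigenvalue-weighted sum of directional second derivatives, and that this sum is then strictly negative.

```python
import math


def eigen_2x2_symmetric(a, b, c):
    """Eigenpairs of [[a, b], [b, c]] (with b != 0) via the closed-form 2x2 formula."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    pairs = []
    for lam in (mean - radius, mean + radius):
        vx, vy = b, lam - a          # solves (A - lam I) v = 0 when b != 0
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs


def second_directional_derivative(H, v):
    # v^T H v: second derivative of t -> u(x + t v) at t = 0 when H is the Hessian of u at x
    return sum(H[i][j] * v[i] * v[j] for i in range(2) for j in range(2))


A = [[2.0, 1.0], [1.0, 3.0]]      # hypothetical positive-definite coefficients a_ij(x)
H = [[-1.0, 0.5], [0.5, -2.0]]    # hypothetical negative-definite Hessian of u at x

lhs = sum(A[i][j] * H[i][j] for i in range(2) for j in range(2))
pairs = eigen_2x2_symmetric(A[0][0], A[0][1], A[1][1])
rhs = sum(lam * second_directional_derivative(H, v) for lam, v in pairs)
print(lhs, rhs)
```

Both numbers agree and are negative, so a function with this Hessian could not satisfy the equation at that point.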

This elementary reasoning could be argued to represent an infinitesimal formulation of the strong maximum principle, which states, under some extra assumptions (such as the continuity of the coefficients $a_{ij}$), that $u$ must be constant if there is a point of $M$ where $u$ is maximized.

Note that the above reasoning is unaffected if one considers the more general partial differential equation
 * $$\sum_{i=1}^n\sum_{j=1}^n a_{ij}\frac{\partial^2u}{\partial x^i \, \partial x^j}+\sum_{i=1}^n b_i\frac{\partial u}{\partial x^i}=0,$$

since the added term is automatically zero at any hypothetical maximum point. The reasoning is also unaffected if one considers the more general condition
 * $$\sum_{i=1}^n\sum_{j=1}^n a_{ij}\frac{\partial^2u}{\partial x^i \, \partial x^j}+\sum_{i=1}^n b_i\frac{\partial u}{\partial x^i}\geq 0,$$

in which one can even note the extra phenomenon of having an outright contradiction if there is a strict inequality ($>$ rather than $≥$) in this condition at the hypothetical maximum point. This phenomenon is important in the formal proof of the classical weak maximum principle.

Non-applicability of the strong maximum principle
However, the above reasoning no longer applies if one considers the condition
 * $$\sum_{i=1}^n\sum_{j=1}^n a_{ij}\frac{\partial^2u}{\partial x^i\,\partial x^j}+\sum_{i=1}^n b_i\frac{\partial u}{\partial x^i}\leq 0,$$

since now the "balancing" condition, as evaluated at a hypothetical maximum point of $u$, only says that a weighted average of manifestly nonpositive quantities is nonpositive. This is trivially true, and so one cannot draw any nontrivial conclusion from it. This is reflected by any number of concrete examples, such as the fact that
 * $$\frac{\partial^2}{\partial x^2}\big({-x}^2-y^2\big)+\frac{\partial^2}{\partial y^2}\big({-x}^2-y^2\big)\leq 0,$$

and on any open region containing the origin, the function $−x^{2}−y^{2}$ certainly has a maximum.
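The counterexample can be confirmed with a short finite-difference check. The step size and the sample grid below are arbitrary choices made for illustration.

```python
def v(x, y):
    return -x * x - y * y


def laplacian_fd(f, x, y, h=1e-3):
    # five-point central-difference approximation of f_xx + f_yy
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / (h * h)


# An 11 x 11 grid on the square [-0.5, 0.5]^2, which contains the origin in its interior.
samples = [(-0.5 + 0.1 * i, -0.5 + 0.1 * j) for i in range(11) for j in range(11)]
laplacians = [laplacian_fd(v, x, y) for x, y in samples]
best_point = max(samples, key=lambda p: v(*p))
print(max(laplacians), best_point)
```

The approximate Laplacian is close to $-4$ at every sample point, while the largest sampled value occurs at the origin, an interior point.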

The essential idea
Let $M$ denote an open subset of Euclidean space. If a smooth function $$u:M\to\mathbb{R}$$ is maximized at a point $p$, then one automatically has:
 * $$(du)(p)=0$$
 * $$(\nabla^2 u)(p)\leq 0,$$ as a matrix inequality.

One can view a partial differential equation as the imposition of an algebraic relation between the various derivatives of a function. So, if $u$ is the solution of a partial differential equation, then it is possible that the above conditions on the first and second derivatives of $u$ form a contradiction to this algebraic relation. This is the essence of the maximum principle. Clearly, the applicability of this idea depends strongly on the particular partial differential equation in question.

For instance, if $u$ solves the differential equation
 * $$\Delta u=|du|^2+2,$$

then it is clearly impossible to have $$\Delta u\leq 0$$ and $$du=0$$ at any point of the domain. So, following the above observation, it is impossible for $u$ to take on a maximum value. If, instead, $u$ solved the differential equation $$\Delta u=|du|^2$$ then one would not have such a contradiction, and the analysis given so far does not imply anything interesting. If $u$ solved the differential equation $$\Delta u=|du|^2-2,$$ then the same analysis would show that $u$ cannot take on a minimum value.

The possibility of such analysis is not even limited to partial differential equations. For instance, if $$u:M\to\mathbb{R}$$ is a function such that
 * $$\Delta u-|du|^4=\int_M e^{u(x)}\,dx,$$

which is a sort of "non-local" differential equation, then the automatic strict positivity of the right-hand side shows, by the same analysis as above, that $u$ cannot attain a maximum value.

There are many methods to extend the applicability of this kind of analysis in various ways. For instance, if $u$ is a harmonic function, then the above sort of contradiction does not directly occur, since the existence of a point $p$ where $$\Delta u(p)\leq 0$$ is not in contradiction to the requirement $$\Delta u=0$$ everywhere. However, one could consider, for an arbitrary real number $s$, the function $u_{s}$ defined by
 * $$u_s(x)=u(x)+se^{x_1}.$$

It is straightforward to see that
 * $$\Delta u_s=se^{x_1}.$$

By the above analysis, if $$s>0$$ then $u_{s}$ cannot attain a maximum value. One might wish to consider the limit as $s$ tends to 0 in order to conclude that $u$ also cannot attain a maximum value. However, it is possible for the pointwise limit of a sequence of functions without maxima to have a maximum. Nonetheless, if $M$ has a boundary such that $M$ together with its boundary is compact, then supposing that $u$ can be continuously extended to the boundary, it follows immediately that both $u$ and $u_{s}$ attain a maximum value on $$M\cup\partial M.$$ Since we have shown that $u_{s}$, as a function on $M$, does not have a maximum, it follows that the maximum point of $u_{s}$, for any $s>0$, is on $$\partial M.$$ Letting $s$ tend to 0 and using the sequential compactness of $$\partial M,$$ it follows that the maximum of $u$ is attained on $$\partial M.$$ This is the weak maximum principle for harmonic functions. This does not, by itself, rule out the possibility that the maximum of $u$ is also attained somewhere on $M$. That is the content of the "strong maximum principle," which requires further analysis.
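The perturbation argument can be followed numerically in a simple case. The sketch below assumes the harmonic function $u(x,y)=x^2-y^2$ on the closed square $[-1,1]^2$ and a few sample values of $s$; all of these are illustrative choices. For each $s>0$ it locates the largest sampled value of $u_s(x,y)=u(x,y)+se^{x}$ and reports whether it lies on the boundary.

```python
import math


def u(x, y):
    return x * x - y * y              # harmonic, chosen for illustration


def u_s(x, y, s):
    return u(x, y) + s * math.exp(x)  # Laplacian is s * e^x, positive when s > 0


def argmax_on_grid(f, n=40):
    """Return (max value, (i, j)) of f over an (n+1) x (n+1) grid on [-1, 1]^2."""
    best = (float("-inf"), None)
    for i in range(n + 1):
        for j in range(n + 1):
            value = f(-1 + 2 * i / n, -1 + 2 * j / n)
            if value > best[0]:
                best = (value, (i, j))
    return best


n = 40
for s in (1.0, 0.1, 0.001):
    value, (i, j) = argmax_on_grid(lambda x, y: u_s(x, y, s), n)
    on_boundary = i in (0, n) or j in (0, n)
    print(s, value, on_boundary)
```

In this example the maximizer sits on the boundary for each sampled $s$, and the maximal values decrease toward the boundary maximum of $u$ as $s$ shrinks.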

The use of the specific function $$e^{x_1}$$ above was inessential. All that mattered was to have a function which extends continuously to the boundary and whose Laplacian is strictly positive. So we could have used, for instance,
 * $$u_s(x)=u(x)+s|x|^2$$

with the same effect.
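That both perturbing functions have strictly positive Laplacian can be checked with central differences. The dimension $n=3$, the sample point and the step size below are arbitrary choices for illustration.

```python
import math


def laplacian_fd(f, point, h=1e-3):
    """Central-difference approximation of the Laplacian of f at `point` (a list of floats)."""
    total = 0.0
    for k in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[k] += h
        backward[k] -= h
        total += (f(forward) - 2 * f(point) + f(backward)) / (h * h)
    return total


def exp_x1(x):
    return math.exp(x[0])


def norm_squared(x):
    return sum(c * c for c in x)


p = [0.3, -0.2, 0.5]
print(laplacian_fd(exp_x1, p), math.exp(p[0]))     # both approximate e^{x_1}
print(laplacian_fd(norm_squared, p), 2 * len(p))   # both approximate 2n
```

The Laplacian of $e^{x_1}$ is $e^{x_1}$ and that of $|x|^2$ is $2n$, so either choice adds a strictly positive multiple of $s$ to $\Delta u$.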

Summary of proof
Let $M$ be an open subset of Euclidean space. Let $$u:M\to\mathbb{R}$$ be a twice-differentiable function which attains its maximum value $C$. Suppose that, with summation over the repeated indices $i$ and $j$ from 1 to $n$ understood here and below,
 * $$a_{ij}\frac{\partial^2u}{\partial x^i\,\partial x^j}+b_i\frac{\partial u}{\partial x^i}\geq 0.$$

Suppose that one can find (or prove the existence of):
 * a compact subset $Ω$ of $M$, with nonempty interior, such that $u(x) < C$ for all $x$ in the interior of $Ω$, and such that there exists $x_{0}$ on the boundary of $Ω$ with $u(x_{0}) = C$.
 * a continuous function $$h:\Omega\to\mathbb{R}$$ which is twice-differentiable on the interior of $Ω$ and with
 * $$a_{ij}\frac{\partial^2h}{\partial x^i\,\partial x^j}+b_i\frac{\partial h}{\partial x^i}\geq 0,$$
 * and such that one has $u + h ≤ C$ on the boundary of $Ω$ with $h(x_{0}) = 0$

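Then, writing $L$ for the differential operator on the left-hand side of the two conditions above, one has $L(u + h − C) ≥ 0$ on $Ω$ with $u + h − C ≤ 0$ on the boundary of $Ω$; according to the weak maximum principle, one has $u + h − C ≤ 0$ on $Ω$. Since $u(x_{0}) = C$ and $h(x_{0}) = 0$, this can be reorganized to say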
 * $$-\frac{u(x)-u(x_0)}{|x-x_0|}\geq \frac{h(x)-h(x_0)}{|x-x_0|}$$

for all $x$ in $Ω$. If one can make the choice of $h$ so that the right-hand side has a manifestly positive nature, then this will provide a contradiction to the fact that $x_{0}$ is a maximum point of $u$ on $M$, so that its gradient must vanish.

Proof
The above "program" can be carried out. Choose $Ω$ to be a spherical annulus; one selects its center $x_{c}$ to be a point closer to the closed set $u^{−1}(C)$ than to the closed set $∂M$, and the outer radius $R$ is selected to be the distance from this center to $u^{−1}(C)$; let $x_{0}$ be a point on this latter set which realizes the distance. The inner radius $ρ$ is arbitrary. Define
 * $$h(x)=\varepsilon\Big(e^{-\alpha|x-x_{\text{c}}|^2}-e^{-\alpha R^2}\Big).$$

Now the boundary of $Ω$ consists of two spheres; on the outer sphere, one has $h = 0$; due to the selection of $R$, one has $u ≤ C$ on this sphere, and so $u + h − C ≤ 0$ holds on this part of the boundary, together with the requirement $h(x_{0}) = 0$. On the inner sphere, one has $u < C$. Due to the continuity of $u$ and the compactness of the inner sphere, one can select $δ > 0$ such that $u + δ < C$. Since $h$ is constant on this inner sphere, one can select $ε > 0$ such that $u + h ≤ C$ on the inner sphere, and hence on the entire boundary of $Ω$.

Direct calculation shows
 * $$\sum_{i=1}^n\sum_{j=1}^na_{ij}\frac{\partial^2h}{\partial x^i\,\partial x^j}+\sum_{i=1}^nb_i\frac{\partial h}{\partial x^i}=\varepsilon \alpha e^{-\alpha|x-x_{\text{c}}|^2}\left(4\alpha\sum_{i=1}^n\sum_{j=1}^n a_{ij}(x)\big(x^i-x_{\text{c}}^i\big)\big(x^j-x_{\text{c}}^j\big)-2\sum_{i=1}^n a_{ii}-2 \sum_{i=1}^n b_i\big(x^i-x_{\text{c}}^i\big)\right).$$

There are various conditions under which the right-hand side can be guaranteed to be nonnegative; see the statement of the theorem below.
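The displayed formula can be sanity-checked against finite differences. The sketch below works in two dimensions with hypothetical constant coefficients $a_{ij}$ and $b_i$, the center $x_{\text{c}}$ placed at the origin, and sample values of $α$, $ε$, $R$ and $ρ = 0.5$; none of these come from the text. It also checks that, for this fairly large $α$, the expression is positive at sample points of the annulus.

```python
import math

A = [[2.0, 1.0], [1.0, 3.0]]     # hypothetical constant, symmetric, positive-definite a_ij
B = [1.0, -1.0]                  # hypothetical constant b_i
ALPHA, EPS, R = 20.0, 1.0, 1.0   # hypothetical parameters; the center x_c is the origin


def h(x, y):
    return EPS * (math.exp(-ALPHA * (x * x + y * y)) - math.exp(-ALPHA * R * R))


def Lh_closed_form(x, y):
    # Right-hand side of the displayed formula, specialized to n = 2 and x_c = 0.
    quad = A[0][0] * x * x + 2 * A[0][1] * x * y + A[1][1] * y * y
    trace = A[0][0] + A[1][1]
    drift = B[0] * x + B[1] * y
    return EPS * ALPHA * math.exp(-ALPHA * (x * x + y * y)) * (4 * ALPHA * quad - 2 * trace - 2 * drift)


def Lh_finite_difference(x, y, d=1e-4):
    # Left-hand side, with all derivatives replaced by central differences.
    hxx = (h(x + d, y) - 2 * h(x, y) + h(x - d, y)) / (d * d)
    hyy = (h(x, y + d) - 2 * h(x, y) + h(x, y - d)) / (d * d)
    hxy = (h(x + d, y + d) - h(x + d, y - d) - h(x - d, y + d) + h(x - d, y - d)) / (4 * d * d)
    hx = (h(x + d, y) - h(x - d, y)) / (2 * d)
    hy = (h(x, y + d) - h(x, y - d)) / (2 * d)
    return A[0][0] * hxx + 2 * A[0][1] * hxy + A[1][1] * hyy + B[0] * hx + B[1] * hy


# Sample points on circles of radius 0.5 (inner sphere), 0.75 and 1.0 (outer sphere).
points = [(r * math.cos(k * math.pi / 4), r * math.sin(k * math.pi / 4))
          for r in (0.5, 0.75, 1.0) for k in range(8)]
for x, y in points:
    print(Lh_closed_form(x, y), Lh_finite_difference(x, y))
```

With a smaller $α$ the bracketed factor can become negative near the inner sphere, which is why $α$ has to be chosen large relative to bounds on the coefficients.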

Lastly, note that the directional derivative of $h$ at $x_{0}$ along the inward-pointing radial line of the annulus is strictly positive. As described in the above summary, this will ensure that a directional derivative of $u$ at $x_{0}$ is nonzero, in contradiction to $x_{0}$ being a maximum point of $u$ on the open set $M$.
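The sign of this radial derivative can be confirmed directly. Viewing $h$ as a function of $r=|x-x_{\text{c}}|$, its derivative in the inward direction at $r=R$ is $2αεRe^{-αR^2}$. The sketch below compares that value with a one-sided difference quotient, using hypothetical sample parameters.

```python
import math


def h_radial(r, alpha, eps, R):
    # h as a function of the distance r = |x - x_c| from the center of the annulus
    return eps * (math.exp(-alpha * r * r) - math.exp(-alpha * R * R))


def inward_derivative_at_outer_sphere(alpha, eps, R, d=1e-6):
    # one-sided difference quotient, moving from r = R toward the center
    return (h_radial(R - d, alpha, eps, R) - h_radial(R, alpha, eps, R)) / d


alpha, eps, R = 3.0, 0.5, 1.0    # hypothetical sample parameters
numeric = inward_derivative_at_outer_sphere(alpha, eps, R)
exact = 2 * alpha * eps * R * math.exp(-alpha * R * R)
print(numeric, exact)
```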

Statement of the theorem
The following is the statement of the theorem in the books of Morrey and Smoller, following the original statement of Hopf (1927): "Let $M$ be an open subset of Euclidean space $ℝ^{n}$. For each $i$ and $j$ between 1 and $n$, let $a_{ij}$ and $b_{i}$ be continuous functions on $M$ with $a_{ij} = a_{ji}$. Suppose that for all $x$ in $M$, the symmetric matrix $[a_{ij}]$ is positive-definite. If $u$ is a nonconstant $C^{2}$ function on $M$ such that
 * $$\sum_{i=1}^n\sum_{j=1}^na_{ij}\frac{\partial^2u}{\partial x^i\,\partial x^j}+\sum_{i=1}^nb_i\frac{\partial u}{\partial x^i}\geq 0$$

on $M$, then $u$ does not attain a maximum value on $M$." The point of the continuity assumption is that continuous functions are bounded on compact sets, the relevant compact set here being the spherical annulus appearing in the proof. Furthermore, by the same principle, there is a positive number $λ$ such that for all $x$ in the annulus, the matrix $[a_{ij}(x)]$ has all eigenvalues greater than or equal to $λ$. One then takes $α$, as appearing in the proof, to be large relative to these bounds. Evans's book has a slightly weaker formulation, in which there is assumed to be a positive number $λ$ which is a lower bound of the eigenvalues of $[a_{ij}]$ for all $x$ in $M$.

These continuity assumptions are clearly not the most general possible in order for the proof to work. For instance, the following is Gilbarg and Trudinger's statement of the theorem, following the same proof: "Let $M$ be an open subset of Euclidean space $ℝ^{n}$. For each $i$ and $j$ between 1 and $n$, let $a_{ij}$ and $b_{i}$ be functions on $M$ with $a_{ij} = a_{ji}$. Suppose that for all $x$ in $M$, the symmetric matrix $[a_{ij}]$ is positive-definite, and let $λ(x)$ denote its smallest eigenvalue. Suppose that $\textstyle\frac{a_{ii}}{\lambda}$ and $\textstyle\frac{|b_i|}{\lambda}$ are bounded functions on $M$ for each $i$ between 1 and $n$. If $u$ is a nonconstant $C^{2}$ function on $M$ such that
 * $$\sum_{i=1}^n\sum_{j=1}^na_{ij}\frac{\partial^2u}{\partial x^i\,\partial x^j}+\sum_{i=1}^nb_i\frac{\partial u}{\partial x^i}\geq 0$$

on $M$, then $u$ does not attain a maximum value on $M$."

One cannot naively extend these statements to the general second-order linear elliptic equation, as already seen in the one-dimensional case. For instance, the ordinary differential equation $y'' + 2y = 0$ has sinusoidal solutions, which certainly have interior maxima. This extends to the higher-dimensional case, where one often has solutions to "eigenfunction" equations $Δu + cu = 0$ which have interior maxima. The sign of $c$ is relevant, as also seen in the one-dimensional case; for instance the solutions to $y'' - 2y = 0$ are exponentials, and the character of the maxima of such functions is quite different from that of sinusoidal functions.
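The contrast between the two one-dimensional equations can be seen by sampling one explicit solution of each. The solutions $\sin(\sqrt{2}\,x)$ of $y''+2y=0$ and $\cosh(\sqrt{2}\,x)$ of $y''-2y=0$, together with the intervals used below, are illustrative choices only.

```python
import math

SQRT2 = math.sqrt(2.0)


def sinusoid(x):
    return math.sin(SQRT2 * x)    # solves y'' + 2y = 0


def hyperbolic(x):
    return math.cosh(SQRT2 * x)   # solves y'' - 2y = 0


def sample_max(f, left, right, n=1000):
    """Return (max value, index) of f over n+1 equally spaced points of [left, right]."""
    values = [f(left + (right - left) * k / n) for k in range(n + 1)]
    index = max(range(n + 1), key=values.__getitem__)
    return values[index], index


n = 1000
sin_max, sin_index = sample_max(sinusoid, 0.0, math.pi / SQRT2, n)
cosh_max, cosh_index = sample_max(hyperbolic, -1.0, 1.0, n)
print(sin_max, 0 < sin_index < n)       # sinusoid: maximum at an interior sample point
print(cosh_max, cosh_index in (0, n))   # hyperbolic cosine: maximum at an endpoint
```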

Research articles

 * Calabi, E. An extension of E. Hopf's maximum principle with an application to Riemannian geometry. Duke Math. J. 25 (1958), 45–56.
 * Cheng, S.Y.; Yau, S.T. Differential equations on Riemannian manifolds and their geometric applications. Comm. Pure Appl. Math. 28 (1975), no. 3, 333–354.
 * Gidas, B.; Ni, Wei Ming; Nirenberg, L. Symmetry and related properties via the maximum principle. Comm. Math. Phys. 68 (1979), no. 3, 209–243.
 * Gidas, B.; Ni, Wei Ming; Nirenberg, L. Symmetry of positive solutions of nonlinear elliptic equations in $R^{n}$. Mathematical analysis and applications, Part A, pp. 369–402, Adv. in Math. Suppl. Stud., 7a, Academic Press, New York-London, 1981.
 * Hamilton, Richard S. Four-manifolds with positive curvature operator. J. Differential Geom. 24 (1986), no. 2, 153–179.
 * Hopf, E. Elementare Bemerkungen über die Lösungen partieller Differentialgleichungen zweiter Ordnung vom elliptischen Typus. Sitber. Preuss. Akad. Wiss. Berlin 19 (1927), 147–152.
 * Hopf, Eberhard. A remark on linear elliptic differential equations of second order. Proc. Amer. Math. Soc. 3 (1952), 791–793.
 * Nirenberg, Louis. A strong maximum principle for parabolic equations. Comm. Pure Appl. Math. 6 (1953), 167–177.
 * Omori, Hideki. Isometric immersions of Riemannian manifolds. J. Math. Soc. Jpn. 19 (1967), 205–214.
 * Yau, Shing Tung. Harmonic functions on complete Riemannian manifolds. Comm. Pure Appl. Math. 28 (1975), 201–228.
 * Kreyberg, H. J. A. On the maximum principle of optimal control in economic processes, 1969 (Trondheim, NTH, Sosialøkonomisk institutt https://www.worldcat.org/title/on-the-maximum-principle-of-optimal-control-in-economic-processes/oclc/23714026)

Textbooks

 * Evans, Lawrence C. Partial differential equations. Second edition. Graduate Studies in Mathematics, 19. American Mathematical Society, Providence, RI, 2010. xxii+749 pp. ISBN 978-0-8218-4974-3
 * Friedman, Avner. Partial differential equations of parabolic type. Prentice-Hall, Inc., Englewood Cliffs, N.J. 1964 xiv+347 pp.
 * Gilbarg, David; Trudinger, Neil S. Elliptic partial differential equations of second order. Reprint of the 1998 edition. Classics in Mathematics. Springer-Verlag, Berlin, 2001. xiv+517 pp. ISBN 3-540-41160-7
 * Ladyženskaja, O. A.; Solonnikov, V. A.; Uralʹceva, N. N. Linear and quasilinear equations of parabolic type. Translated from the Russian by S. Smith. Translations of Mathematical Monographs, Vol. 23 American Mathematical Society, Providence, R.I. 1968 xi+648 pp.
 * Ladyzhenskaya, Olga A.; Ural'tseva, Nina N. Linear and quasilinear elliptic equations. Translated from the Russian by Scripta Technica, Inc. Translation editor: Leon Ehrenpreis. Academic Press, New York-London 1968 xviii+495 pp.
 * Lieberman, Gary M. Second order parabolic differential equations. World Scientific Publishing Co., Inc., River Edge, NJ, 1996. xii+439 pp. ISBN 981-02-2883-X
 * Morrey, Charles B., Jr. Multiple integrals in the calculus of variations. Reprint of the 1966 edition. Classics in Mathematics. Springer-Verlag, Berlin, 2008. x+506 pp. ISBN 978-3-540-69915-6
 * Protter, Murray H.; Weinberger, Hans F. Maximum principles in differential equations. Corrected reprint of the 1967 original. Springer-Verlag, New York, 1984. x+261 pp. ISBN 0-387-96068-6
 * Smoller, Joel. Shock waves and reaction-diffusion equations. Second edition. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 258. Springer-Verlag, New York, 1994. xxiv+632 pp. ISBN 0-387-94259-9