Sign function



In mathematics, the sign function or signum function (from signum, Latin for "sign") is a function that has the value $+1$, $-1$ or $0$ according to whether a given real number is positive, negative or zero. In mathematical notation the sign function is often represented as $$\sgn x$$ or $$\sgn (x)$$.

Definition
The signum function of a real number $$x$$ is a piecewise function which is defined as follows: $$ \sgn x :=\begin{cases} -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}$$
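The piecewise definition translates directly into code. A minimal Python sketch (the function name `sgn` is our own choice for illustration):

```python
def sgn(x: float) -> int:
    """Return -1, 0, or +1 according to the sign of x (piecewise definition)."""
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0

# Examples matching the definition:
assert sgn(2) == 1
assert sgn(-8) == -1
assert sgn(0) == 0
```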

The law of trichotomy states that every real number must be positive, negative or zero. The signum function denotes which unique category a number falls into by mapping it to one of the values $-1$, $+1$ or $0$, which can then be used in mathematical expressions or further calculations.

For example: $$\begin{array}{lcr} \sgn(2) &=& +1\,, \\ \sgn(\pi) &=& +1\,, \\ \sgn(-8) &=& -1\,, \\ \sgn(-\frac{1}{2}) &=& -1\,, \\ \sgn(0) &=& 0\,. \end{array}$$

Basic properties
Any real number can be expressed as the product of its absolute value and its sign function: $$ x = |x| \sgn x\,.$$

It follows that whenever $$x$$ is not equal to 0 we have $$ \sgn x = \frac{x}{|x|} = \frac{|x|}{x}\,.$$

Similarly, for any real number $$x$$, $$ |x| = x\sgn x\,. $$ We can also be certain that: $$\sgn (xy)=(\sgn x)(\sgn y)\,,$$ and so $$\sgn (x^n)=(\sgn x)^n\,.$$
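These identities are easy to check numerically. A quick Python sanity check, using the common `(x > 0) - (x < 0)` idiom for the sign:

```python
def sgn(x):
    # Booleans subtract as 0/1, giving -1, 0, or +1.
    return (x > 0) - (x < 0)

for x in [-3.5, -1, 0, 2, 7.25]:
    assert x == abs(x) * sgn(x)        # x = |x| sgn x
    assert abs(x) == x * sgn(x)        # |x| = x sgn x
    for y in [-2, 0, 4]:
        assert sgn(x * y) == sgn(x) * sgn(y)   # multiplicativity
```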

Some algebraic identities
The signum can also be written using the Iverson bracket notation: $$\sgn x = -[x < 0] + [x > 0] \,.$$

The signum can also be written using the floor and the absolute value functions: $$\sgn x = \Biggl\lfloor \frac{x}{|x|+1} \Biggr\rfloor - \Biggl\lfloor \frac{-x}{|x|+1} \Biggr\rfloor \,.$$ If $$0^0$$ is accepted to be equal to 1, the signum can also be written for all real numbers as $$\sgn x = 0^{\left(-x + \left\vert x \right\vert\right)} - 0^{\left(x + \left\vert x \right\vert\right)} \,.$$
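Both identities can be verified numerically. A small Python sketch, where booleans play the role of Iverson brackets and `math.floor` supplies the floor function:

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

def sgn_iverson(x):
    # -[x < 0] + [x > 0]: Python booleans act as 0/1 Iverson brackets.
    return -(x < 0) + (x > 0)

def sgn_floor(x):
    # floor(x/(|x|+1)) - floor(-x/(|x|+1))
    return math.floor(x / (abs(x) + 1)) - math.floor(-x / (abs(x) + 1))

for x in [-5.0, -0.3, 0.0, 0.3, 5.0]:
    assert sgn_iverson(x) == sgn(x)
    assert sgn_floor(x) == sgn(x)
```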

Discontinuity at zero


Although the sign function takes the value $-1$ when $$x$$ is negative, the ringed point $(0, -1)$ in the plot of $$\sgn x$$ indicates that this is not the case when $$x=0$$. Instead, the value jumps abruptly to the solid point at $(0, 0)$ where $$\sgn(0)=0$$. There is then a similar jump to $$\sgn(x)=+1$$ when $$x$$ is positive. Either jump demonstrates visually that the sign function $$\sgn x$$ is discontinuous at zero, even though it is continuous at any point where $$x$$ is either positive or negative.

These observations are confirmed by any of the various equivalent formal definitions of continuity in mathematical analysis. A function $$f(x)$$, such as $$\sgn(x),$$ is continuous at a point $$x=a$$ if the value $$f(a)$$ can be approximated arbitrarily closely by the sequence of values $$f(a_1),f(a_2),f(a_3),\dots,$$ where the $$a_n$$ make up any infinite sequence which becomes arbitrarily close to $$a$$ as $$n$$ becomes sufficiently large. In the notation of mathematical limits, continuity of $$f$$ at $$a$$ requires that $$f(a_n) \to f(a)$$ as $$n \to \infty$$ for any sequence $$\left(a_n\right)_{n=1}^\infty$$ for which $$a_n \to a.$$ The arrow symbol can be read to mean approaches, or tends to, and it applies to the sequence as a whole.

This criterion fails for the sign function at $$a=0$$. For example, we can choose $$a_n$$ to be the sequence $$1,\tfrac{1}{2},\tfrac{1}{3},\tfrac{1}{4},\dots,$$ which tends towards zero as $$n$$ increases towards infinity. In this case, $$a_n \to a$$ as required, but $$\sgn(a)=0$$ and $$\sgn(a_n)=+1$$ for each $$n,$$ so that $$\sgn(a_n) \to 1 \neq \sgn(a)$$. This counterexample confirms more formally the discontinuity of $$\sgn x$$ at zero that is visible in the plot.
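The counterexample can be made concrete in a few lines of Python (the sequence name `a` is ours):

```python
def sgn(x):
    return (x > 0) - (x < 0)

# The sequence a_n = 1/n tends to 0, but sgn(a_n) = 1 for every n,
# while sgn(0) = 0 -- so the sequential criterion for continuity fails.
a = [1 / n for n in range(1, 1001)]
assert all(sgn(an) == 1 for an in a)
assert sgn(0) == 0
```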

Despite the sign function having a very simple form, the step change at zero causes difficulties for traditional calculus techniques, which are quite stringent in their requirements. Continuity is a frequent constraint. One solution can be to approximate the sign function by a smooth continuous function; others might involve less stringent approaches that build on classical methods to accommodate larger classes of function.

Smooth approximations and limits
The signum function coincides with the limits $$\sgn x=\lim_{n\to\infty}\frac{1-2^{-nx}}{1+2^{-nx}}\,,$$ and $$\sgn x = \lim_{n\to\infty}\frac{2}{\pi}\arctan(nx) = \lim_{n\to\infty}\frac{2}{\pi}\tan^{-1}(nx)\,,$$ as well as

$$\sgn x=\lim_{n\to\infty}\tanh(nx)\,.$$ Here, $$\tanh(x)$$ is the hyperbolic tangent, and the superscript $-1$ is shorthand notation for the inverse of the trigonometric tangent function (the arctangent).

For $$k>1$$, a smooth approximation of the sign function is $$\sgn x \approx \tanh kx \,,$$ which becomes sharper as $$k$$ increases. Another approximation is $$\sgn x \approx \frac{x}{\sqrt{x^2 + \varepsilon^2}} \,,$$ which becomes sharper as $$\varepsilon\to 0$$; note that it is the derivative of $$\sqrt{x^2+\varepsilon ^2}$$. This approximation is motivated by the fact that it equals $$\sgn x$$ exactly for all nonzero $$x$$ when $$\varepsilon=0$$, and it has the advantage of simple generalization to higher-dimensional analogues of the sign function (for example, the partial derivatives of $$\sqrt{x^2+y^2}$$).
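A rough Python comparison of the two approximations, with illustrative values $$k = 50$$ and $$\varepsilon = 10^{-3}$$ chosen by us:

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

def tanh_approx(x, k=50.0):
    # sgn x ≈ tanh(kx); sharper as k grows.
    return math.tanh(k * x)

def sqrt_approx(x, eps=1e-3):
    # sgn x ≈ x / sqrt(x^2 + eps^2); sharper as eps -> 0.
    return x / math.sqrt(x * x + eps * eps)

for x in [-2.0, -0.5, 0.5, 2.0]:
    assert abs(tanh_approx(x) - sgn(x)) < 1e-6
    assert abs(sqrt_approx(x) - sgn(x)) < 1e-5

# Both approximations pass through 0 at x = 0, like sgn itself.
assert tanh_approx(0.0) == 0.0 and sqrt_approx(0.0) == 0.0
```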


Differentiation and integration
The signum function $$\sgn x$$ is differentiable everywhere except when $$x=0.$$ Its derivative is zero when $$x$$ is non-zero: $$ \frac{\text{d}\, (\sgn x)}{\text{d}x} = 0 \qquad \text{for } x \ne 0\,.$$

This follows from the differentiability of any constant function, for which the derivative is always zero on its domain of definition. The signum $$\sgn x$$ acts as a constant function when it is restricted to the negative open region $$x<0,$$ where it equals $-1$. It can similarly be regarded as a constant function within the positive open region $$x>0,$$ where the corresponding constant is $+1.$ Although these are two different constant functions, their derivative is equal to zero in each case.

It is not possible to define a classical derivative at $$x=0$$, because there is a discontinuity there. Nevertheless, the signum function has a definite integral between any pair of finite values $a$ and $b$, even when the interval of integration includes zero. The resulting integral for $a$ and $b$ is then equal to the difference between their absolute values: $$ \int_a^b (\sgn x) \, \text{d}x = |b| - |a| \,.$$
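This integral identity can be checked with simple numerical quadrature. A midpoint-rule sketch (the step count `n` is an arbitrary choice of ours); the result agrees with $$|b|-|a|$$ up to roughly the width of one cell:

```python
def sgn(x):
    return (x > 0) - (x < 0)

def integral_sgn(a, b, n=200_000):
    # Midpoint-rule approximation of the definite integral of sgn over [a, b].
    h = (b - a) / n
    return sum(sgn(a + (i + 0.5) * h) for i in range(n)) * h

for a, b in [(-3.0, 2.0), (1.0, 4.0), (-5.0, -1.0)]:
    assert abs(integral_sgn(a, b) - (abs(b) - abs(a))) < 1e-3
```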

Conversely, the signum function is the derivative of the absolute value function, except at zero, where the gradient of the absolute value changes abruptly: $$ \frac{\text{d} |x|}{\text{d}x} = \sgn x \qquad \text{for } x \ne 0\,.$$

We can understand this as before by considering the definition of the absolute value $$|x|$$ on the separate regions $$x<0$$ and $$x>0.$$ For example, the absolute value function is identical to $$x$$ in the region $$x>0,$$ whose derivative is the constant value $+1$, which equals the value of $$\sgn x$$ there.

Because the absolute value is a convex function, there is at least one subderivative at every point, including at the origin. Everywhere except zero, the resulting subdifferential consists of a single value, equal to the value of the sign function. In contrast, there are many subderivatives at zero, with just one of them taking the value $$\sgn(0) = 0$$. A subderivative value $0$ occurs here because the absolute value function is at a minimum. The full family of valid subderivatives at zero constitutes the subdifferential interval $$[-1,1]$$, which might be thought of informally as "filling in" the graph of the sign function with a vertical line through the origin, making it continuous as a two-dimensional curve.

In integration theory, the signum function is a weak derivative of the absolute value function. Weak derivatives are equivalent if they are equal almost everywhere, making them impervious to isolated anomalies at a single point. This includes the change in gradient of the absolute value function at zero, which prohibits there being a classical derivative.

Although it is not differentiable at $$x=0$$ in the ordinary sense, under the generalized notion of differentiation in distribution theory, the derivative of the signum function is two times the Dirac delta function. This can be demonstrated using the identity $$ \sgn x = 2 H(x) - 1 \,,$$ where $$H(x)$$ is the Heaviside step function using the standard $$H(0)=\frac{1}{2}$$ formalism. Using this identity, it is easy to derive the distributional derivative: $$ \frac{\text{d}\sgn x}{\text{d}x} = 2 \frac{\text{d} H(x)}{\text{d}x} = 2\delta(x) \,.$$

Fourier transform
The Fourier transform of the signum function is $$\int_{-\infty}^\infty (\sgn x)\, e^{-ikx}\,\text{d}x = \operatorname{PV}\frac{2}{ik}\,,$$ where $$\operatorname{PV}$$ means taking the Cauchy principal value.

Complex signum
The signum function can be generalized to complex numbers as: $$\sgn z = \frac{z}{|z|} $$ for any complex number $$z$$ except $$z=0$$. The signum of a given complex number $$z$$ is the point on the unit circle of the complex plane that is nearest to $$z$$. Then, for $$z\ne 0$$, $$\sgn z = e^{i\arg z}\,,$$ where $$\arg$$ is the complex argument function.

For reasons of symmetry, and to keep this a proper generalization of the signum function on the reals, one usually also defines, in the complex domain, $$\sgn(0+0i)=0\,.$$
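A Python sketch of the complex signum using this $$\sgn(0)=0$$ convention (the name `csign` is ours, to avoid clashing with the real version):

```python
import cmath

def csign(z: complex) -> complex:
    # Complex signum: z / |z| for z != 0, and 0 at the origin,
    # matching the convention sgn(0 + 0i) = 0.
    return z / abs(z) if z != 0 else 0

z = 3 + 4j
assert csign(z) == z / 5                       # |3 + 4i| = 5
assert abs(abs(csign(1 - 1j)) - 1) < 1e-12     # lies on the unit circle
assert cmath.isclose(csign(z), cmath.exp(1j * cmath.phase(z)))  # e^{i arg z}
assert csign(0) == 0
```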

Another generalization of the sign function for real and complex expressions is $$\text{csgn}$$, which is defined as: $$ \operatorname{csgn} z= \begin{cases} 1 & \text{if } \mathrm{Re}(z) > 0, \\ -1 & \text{if } \mathrm{Re}(z) < 0, \\ \sgn \mathrm{Im}(z) & \text{if } \mathrm{Re}(z) = 0 \end{cases} $$ where $$\text{Re}(z)$$ is the real part of $$z$$ and $$\text{Im}(z)$$ is the imaginary part of $$z$$.

We then have (for $$z\ne 0$$): $$\operatorname{csgn} z = \frac{z}{\sqrt{z^2}} = \frac{\sqrt{z^2}}{z}. $$
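A Python sketch of $$\operatorname{csgn}$$, together with a numerical check of the $$z/\sqrt{z^2}$$ identity via `cmath.sqrt`, which returns the principal square root:

```python
import cmath

def csgn(z: complex) -> int:
    # csgn: sign of the real part, with ties broken by the imaginary part.
    if z.real > 0:
        return 1
    if z.real < 0:
        return -1
    return (z.imag > 0) - (z.imag < 0)

assert csgn(2 - 5j) == 1
assert csgn(-1 + 1j) == -1
assert csgn(3j) == 1 and csgn(-3j) == -1 and csgn(0) == 0

# For z off the imaginary axis, csgn z = z / sqrt(z^2) with the principal root.
# (On the axis itself the check is numerically fragile due to signed zeros.)
for z in [2 - 5j, -1 + 1j, 4.0 + 0j, -4.0 + 0j]:
    assert cmath.isclose(z / cmath.sqrt(z * z), csgn(z))
```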

Polar decomposition of matrices
By the polar decomposition theorem, a matrix $$\boldsymbol A\in\mathbb K^{n\times n}$$ ($$n\in\mathbb N$$ and $$\mathbb K\in\{\mathbb R,\mathbb C\}$$) can be decomposed as a product $$\boldsymbol Q\boldsymbol P$$ where $$\boldsymbol Q$$ is a unitary matrix and $$\boldsymbol P$$ is a self-adjoint, or Hermitian, positive definite matrix, both in $$\mathbb K^{n\times n}$$. If $$\boldsymbol A$$ is invertible then such a decomposition is unique and $$\boldsymbol Q$$ plays the role of $$\boldsymbol A$$'s signum. A dual construction is given by the decomposition $$\boldsymbol A=\boldsymbol S\boldsymbol R$$, where $$\boldsymbol S$$ is self-adjoint positive definite and $$\boldsymbol R$$ is unitary, but generally different from $$\boldsymbol Q$$. This leads to each invertible matrix having a unique left-signum $$\boldsymbol Q$$ and right-signum $$\boldsymbol R$$.

In the special case where $$\mathbb K=\mathbb R,\ n=2,$$ and the (invertible) matrix $$\boldsymbol A = \left[\begin{array}{rr}a&-b\\b&a\end{array}\right]$$ identifies with the (nonzero) complex number $$a+\mathrm i b=c$$, the signum matrices satisfy $$\boldsymbol Q=\boldsymbol R=\left[\begin{array}{rr}a&-b\\b&a\end{array}\right]/|c|$$ and identify with the complex signum of $$c$$, $$\sgn c = c/|c|$$. In this sense, polar decomposition generalizes to matrices the signum-modulus decomposition of complex numbers.
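For this 2×2 case the polar factors can be written down explicitly. A Python sketch using only the standard library (all variable names are ours); since $$\boldsymbol A^*\boldsymbol A = |c|^2\boldsymbol I$$ here, the positive definite factor is $$|c|\boldsymbol I$$ and the unitary factor is $$\boldsymbol A/|c|$$:

```python
import math

# A = [[a, -b], [b, a]] identifies with c = a + bi; here c = 3 + 4i, |c| = 5.
a, b = 3.0, 4.0
r = math.hypot(a, b)                      # |c|
A = [[a, -b], [b, a]]
Q = [[a / r, -b / r], [b / r, a / r]]     # unitary (rotation) factor, A/|c|
P = [[r, 0.0], [0.0, r]]                  # positive definite factor, |c| I

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Check the polar decomposition A = QP.
QP = matmul(Q, P)
assert all(abs(QP[i][j] - A[i][j]) < 1e-12 for i in range(2) for j in range(2))

# Check that Q is orthogonal: Q^T Q = I.
Qt = [[Q[j][i] for j in range(2)] for i in range(2)]
I = matmul(Qt, Q)
assert abs(I[0][0] - 1) < 1e-12 and abs(I[0][1]) < 1e-12
```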

Signum as a generalized function
At real values of $$x$$, it is possible to define a generalized-function version of the signum function, $$\varepsilon (x)$$, such that $$\varepsilon (x)^2=1$$ everywhere, including at the point $$x=0$$, unlike $$\sgn$$, for which $$(\sgn 0)^2=0$$. This generalized signum allows the construction of an algebra of generalized functions, but the price of such generalization is the loss of commutativity. In particular, the generalized signum anticommutes with the Dirac delta function: $$\varepsilon (x) \delta(x)+\delta(x) \varepsilon(x) = 0 \, ;$$ in addition, $$\varepsilon (x)$$ cannot be evaluated at $$x=0$$, and the special name $$\varepsilon$$ is necessary to distinguish it from the function $$\sgn$$. ($$\varepsilon (0)$$ is not defined, but $$\sgn 0=0$$.)