User:V.Vvudhivanich/sandbox

Proof that NSE cannot exceed R2

The objective of this article is to prove that the Nash-Sutcliffe Efficiency (NSE) cannot exceed the Coefficient of Determination (R2).

Model Error

Y = Y'+ε                                                                                 (1)

Where ε = the error vector, assumed to be independent and randomly distributed with zero mean and variance of Var(ε)

Y = the observed vector with mean of E(Y) and variance of Var(Y)

Y' = the simulated vector with mean of E(Y') and variance of Var(Y')

Statistical derivation of the model performance indices: R2 and NSE

Taking the expected value (E) of Equation (1), we obtain

E(Y) = E(Y' +ε) = E(Y') + E(ε)                                                     (2)

Taking the variance (Var) of Equation (1), we obtain

Var(Y) = Var(Y'+ε) = Var(Y') + Var(ε) + 2Cov(Y',ε)                         (3)

Where Var and Cov are the variance and covariance operators.

Dividing Equation (3) by Var(Y) gives

1 = Var(Y')/Var(Y) + Var(ε)/Var(Y) + 2Cov(Y',ε)/Var(Y)

which can be rearranged as

Var(Y')/Var(Y) = 1 - Var(ε)/Var(Y) - 2Cov(Y',ε)/Var(Y)                              (4)
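Because Equation (3) follows from Y = Y' + ε alone, the same decomposition holds exactly for sample variances and covariances. The following is a minimal Python sketch of that check; the data values and the helper names mean and cov are illustrative assumptions, not part of the derivation.

```python
# Hypothetical observed (y) and simulated (yp) series; values are made up.
y = [2.0, 3.5, 5.1, 4.2, 6.8, 7.4]
yp = [2.4, 3.1, 4.6, 5.0, 6.1, 8.0]
e = [a - b for a, b in zip(y, yp)]  # sample error, e = y - y'


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance with the (n-1) divisor; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


# Left-hand and right-hand sides of Equation (3), in sample form.
var_y = cov(y, y)
rhs = cov(yp, yp) + cov(e, e) + 2 * cov(yp, e)
print(var_y, rhs)
```

The two printed numbers should agree up to floating-point rounding, whatever values are used for y and y'.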

One of the indices most commonly used to express model efficiency in hydrologic modelling is the coefficient of determination (R2), which is the square of the correlation coefficient (r). Since r is the ratio of the covariance of Y and Y' to the square root of Var(Y)*Var(Y'), R2 is given by

R2 = Cov²(Y,Y')/(Var(Y)*Var(Y'))                                                    (5)

From Equation (1),

Cov(Y,Y') = Cov(Y'+ε, Y') = Var(Y') + Cov(Y',ε)

and squaring both sides gives

Cov²(Y,Y') = Var²(Y') + Cov²(Y',ε) + 2Var(Y')*Cov(Y',ε)                      (6)

By substituting Cov²(Y,Y') from Equation (6) into Equation (5), we obtain

R2 = Var(Y')/Var(Y) + Cov²(Y',ε)/(Var(Y)*Var(Y')) + 2Cov(Y',ε)/Var(Y)  (7)

Substituting Var(Y')/Var(Y) from Equation (4) into Equation (7), we obtain

R2 = 1 - Var(ε)/Var(Y) + Cov²(Y',ε)/(Var(Y)*Var(Y'))                           (8)

The sample estimate of R2 is

R2 = 1 - se²/sy² + (sy',e)²/(sy²*sy'²)                                                      (9)

where

se² = sample variance of the error

sy² = sample variance of y

sy'² = sample variance of y'

sy',e = sample covariance of y' and e

e = y - y' = sample error
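Equation (9) can be checked numerically by comparing it with R2 computed directly as the squared sample correlation, as in Equation (5). The Python sketch below does this; the data and the helper names mean and cov are hypothetical and serve only as an illustration.

```python
# Hypothetical observed (y) and simulated (yp) series; values are made up.
y = [2.0, 3.5, 5.1, 4.2, 6.8, 7.4]
yp = [2.4, 3.1, 4.6, 5.0, 6.1, 8.0]
e = [a - b for a, b in zip(y, yp)]  # sample error, e = y - y'


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance with the (n-1) divisor; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


s2_y, s2_yp, s2_e = cov(y, y), cov(yp, yp), cov(e, e)

# R2 as the squared correlation between y and y' (sample form of Equation 5).
r2_direct = cov(y, yp) ** 2 / (s2_y * s2_yp)

# R2 from the decomposition in Equation (9).
r2_eq9 = 1 - s2_e / s2_y + cov(yp, e) ** 2 / (s2_y * s2_yp)
print(r2_direct, r2_eq9)
```

Both routes should give the same value up to rounding, since Equation (9) is an algebraic rearrangement of Equation (5).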

Besides R2, another commonly used model performance index is the Nash-Sutcliffe Efficiency (NSE).

The sample estimate of NSE is defined as:

NSE = 1 - Σ(y-y')²/Σ(y-ȳ)² = 1 - {Σ(e-ē+ē)²/(n-1)}/{Σ(y-ȳ)²/(n-1)}

Since Σ(e-ē) = 0, the numerator expands as Σ(e-ē+ē)² = Σ(e-ē)² + n*ē², so that

NSE = 1 - se²/sy² - {n/(n-1)}*(ē²/sy²)                                              (10)

Where y and y' are the sample observed values and the corresponding model-predicted values, respectively

ȳ is the sample mean of the observed values

ē is the sample mean of the errors

n is the sample size
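As a numerical check of Equation (10), the Python sketch below computes NSE directly from its sum-of-squares definition and again from the variance and mean-error terms. The data and the helper name mean are illustrative assumptions.

```python
# Hypothetical observed (y) and simulated (yp) series; values are made up.
y = [2.0, 3.5, 5.1, 4.2, 6.8, 7.4]
yp = [2.4, 3.1, 4.6, 5.0, 6.1, 8.0]
e = [a - b for a, b in zip(y, yp)]  # sample error, e = y - y'
n = len(y)


def mean(v):
    return sum(v) / len(v)


y_bar, e_bar = mean(y), mean(e)

# Sample variances with the (n-1) divisor.
s2_y = sum((v - y_bar) ** 2 for v in y) / (n - 1)
s2_e = sum((v - e_bar) ** 2 for v in e) / (n - 1)

# NSE from its definition: 1 - sum of squared errors / sum of squares about the mean.
nse_direct = 1 - sum(v ** 2 for v in e) / sum((v - y_bar) ** 2 for v in y)

# NSE from Equation (10): error-variance term plus mean-error (bias) term.
nse_eq10 = 1 - s2_e / s2_y - (n / (n - 1)) * e_bar ** 2 / s2_y
print(nse_direct, nse_eq10)
```

The last term of Equation (10) is the only place where the mean error enters, so a constant bias in the simulation lowers NSE without changing se².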

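Subtracting Equation (10) from Equation (9) gives

R2 - NSE = (sy',e)²/(sy²*sy'²) + {n/(n-1)}*(ē²/sy²)

Both terms on the right-hand side are squares divided by positive quantities and are therefore non-negative. Hence

NSE ≤ R2                                                                                      (11)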
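The inequality in Equation (11) can be illustrated with a few simulated series of differing quality. The Python sketch below is only an illustration under assumed data: the three cases, the function name nse_and_r2 and the helpers mean and cov are all hypothetical.

```python
def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance with the (n-1) divisor; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


def nse_and_r2(y, yp):
    """Return (NSE, R2) for observed y and simulated yp."""
    y_bar = mean(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, yp))
    nse = 1 - sse / sum((a - y_bar) ** 2 for a in y)
    r2 = cov(y, yp) ** 2 / (cov(y, y) * cov(yp, yp))
    return nse, r2


# Hypothetical observations and three made-up simulations.
y = [2.0, 3.5, 5.1, 4.2, 6.8, 7.4]
cases = {
    "close fit": [2.4, 3.1, 4.6, 5.0, 6.1, 8.0],
    "constant bias (+2)": [v + 2.0 for v in y],
    "scaled (x1.5)": [1.5 * v for v in y],
}

results = {name: nse_and_r2(y, yp) for name, yp in cases.items()}
for name, (nse, r2) in results.items():
    print(f"{name}: NSE = {nse:.4f}, R2 = {r2:.4f}")
```

The biased and scaled cases are perfectly correlated with the observations, so R2 stays at 1 while NSE drops; this is the situation in which reporting R2 alone overstates model performance.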

As stated under Equation (1), the error (ε) is assumed to be independent and randomly distributed with zero mean and variance of Var(ε).

Thus, if ē and sy',e are both equal to zero, then sy² = sy'² + se² and

NSE = R2 = 1 - se²/sy² = sy'²/sy²  = model variance/observed variance  (12)
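One setting in which both conditions of Equation (12) hold exactly is a least-squares straight-line fit with an intercept, because its residuals have zero mean and zero covariance with the fitted values. The Python sketch below uses such a fit purely as an illustration; the predictor x, the data and all helper names are assumptions and are not part of the derivation above.

```python
# Hypothetical predictor (x) and observations (y); values are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 3.5, 5.1, 4.2, 6.8, 7.4]


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance with the (n-1) divisor; cov(a, a) is the sample variance."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)


# Least-squares line with intercept: y' = intercept + slope * x.
slope = cov(x, y) / cov(x, x)
intercept = mean(y) - slope * mean(x)
yp = [intercept + slope * xi for xi in x]
e = [yi - ypi for yi, ypi in zip(y, yp)]  # residuals, e = y - y'

e_bar = mean(e)     # mean error, expected to be ~0
s_ype = cov(yp, e)  # covariance of y' and e, expected to be ~0

y_bar = mean(y)
nse = 1 - sum(v ** 2 for v in e) / sum((v - y_bar) ** 2 for v in y)
r2 = cov(y, yp) ** 2 / (cov(y, y) * cov(yp, yp))
var_ratio = cov(yp, yp) / cov(y, y)  # model variance / observed variance
print(e_bar, s_ype)
print(nse, r2, var_ratio)
```

Up to rounding, the first line should print two zeros and the second line three equal values, matching Equation (12).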