
Risk management is widely used in most larger projects, as there is almost always risk threatening the optimal completion of the project. Once the relevant risks have been identified, some are mitigated away, for example through insurance or added controls; the remaining risks are called residual or unhedgeable risks.

In general, a risk is characterised by a probability distribution over severity, where severity is a measure of impact (for example dollars or time lost if the risk materialises). This characterisation allows for quantitative risk management, that is, the management of risks through stochastic modelling that predicts likely outcomes of the project by aggregating all relevant risks. Typically the risks are not all independent, so the summation of dependent random variables is a problem that almost always arises in quantitative risk management.
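
As a concrete illustration of such aggregation (not part of the original text), the following sketch sums three correlated risk severities by Monte Carlo simulation; the multivariate normal model and all parameter values are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: three correlated risk severities (in dollars, say),
# modelled as a multivariate normal with an assumed covariance matrix.
mean = np.array([10.0, 5.0, 2.0])
cov = np.array([[4.0, 1.5, 0.5],
                [1.5, 2.0, 0.3],
                [0.5, 0.3, 1.0]])

samples = rng.multivariate_normal(mean, cov, size=100_000)
total = samples.sum(axis=1)               # total risk X = X_1 + X_2 + X_3

print(f"E(X)   = {total.mean():.3f}")     # close to mean.sum() = 17
print(f"Var(X) = {total.var():.3f}")      # close to cov.sum() = 11.6, not just the sum of variances (7.0)
print(f"95% quantile = {np.quantile(total, 0.95):.3f}")
```

The variance of the total involves all entries of the covariance matrix, not only the diagonal; this is precisely where the dependence between the risks enters.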

The problem of the sum of dependent variables
Define the total risk, $$X$$, as the sum of $$N$$ relevant risks characterised by the random variables $$X_i$$:

$$ X = \sum_{i=1}^N X_i $$  (equation 1)

What we would like to know is the probability distribution $$P(X)$$; the question is then what we need to know about each of the probability distributions $$P_i(X_i)$$, and how to carry out the summation in practice.

Note that, without loss of generality, we can assume that the expected values $$ E(X_i)=0 $$, as any non-zero means can be subtracted from, and later added back to, both sides of equation (1) at any time.
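
A minimal numerical illustration of this centring step (with invented data): subtracting the means, summing the centred variables, and shifting back by the total mean reproduces the original total exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data: 100 000 samples of three risks with non-zero means.
X = rng.normal(loc=[10.0, 5.0, 2.0], scale=1.0, size=(100_000, 3))

mu = X.mean(axis=0)                 # estimated means E(X_i)
Xc = X - mu                         # centred variables with E(X_i) = 0

total = X.sum(axis=1)
total_from_centred = Xc.sum(axis=1) + mu.sum()   # shift back by the total mean
print(np.allclose(total, total_from_centred))    # True
```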

A method of estimating the sum
In the simple case of independent risks, two methods are well known (see the sketch after this list):

 * Summation of moments

 * Convolution of probability distributions
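
For independent risks the density of the sum is the convolution of the individual densities, while moments (more precisely, cumulants) simply add. A minimal sketch on a discretised grid, assuming two independent normal risks with invented parameters:

```python
import numpy as np
from scipy.stats import norm

# Discretise two independent risk densities on a common grid.
dx = 0.01
x = np.arange(-10, 10, dx)
p1 = norm.pdf(x, loc=0.0, scale=1.0)
p2 = norm.pdf(x, loc=0.0, scale=2.0)

# Convolution of the densities (times dx for the grid) gives the density of the sum.
p_sum = np.convolve(p1, p2, mode="full") * dx
x_sum = 2 * x[0] + dx * np.arange(len(p_sum))

# Summation of moments: for independent risks the variances add, 1^2 + 2^2 = 5,
# so the sum is N(0, 5); compare the convolution with the exact density.
exact = norm.pdf(x_sum, loc=0.0, scale=np.sqrt(5.0))
print(np.max(np.abs(p_sum - exact)))   # small discretisation error
```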

For dependent risks, in the case where $$P(X)$$ is unimodal, we can use the following method of estimating the sum.

By the definition of the characteristic function $$CF(X)$$ and the Fourier inversion theorem, we get


 * $$ P(X) = \mathcal{F}(CF(X)) = \mathcal{F}(E(e^{itX}))$$,

where $$E$$ denotes the expected value. Using the Taylor expansion of the exponential function and exchanging expectation with summation term by term, we get


 * $$ \mathcal{F}(E(e^{itX})) = \mathcal{F}\left(E\left(\sum_{j=0}^\infty \frac{(itX)^j}{j!}\right)\right) = \mathcal{F}\left(\sum_{j=0}^\infty \frac{(it)^j E\left(X^j\right)}{j!}\right)$$.
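
As a sanity check (not in the original text), the following sketch compares this truncated moment series with the exact characteristic function of a standard normal, whose moments are $$0$$ for odd $$j$$ and $$(j-1)!!$$ for even $$j$$:

```python
import numpy as np
from math import factorial

def double_factorial(n):
    # (j-1)!! with the convention (-1)!! = 1
    return 1 if n <= 0 else n * double_factorial(n - 2)

t = np.linspace(-2, 2, 9)
J = 20                                    # truncation order of the series
series = np.zeros_like(t, dtype=complex)
for j in range(J + 1):
    moment = 0.0 if j % 2 else double_factorial(j - 1)   # E(X^j) for N(0, 1)
    series += (1j * t) ** j * moment / factorial(j)

exact = np.exp(-t ** 2 / 2)               # characteristic function of N(0, 1)
print(np.max(np.abs(series - exact)))     # small truncation error
```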

Recalling that $$ X = \sum_{i=1}^N X_i $$, we see that


 * $$E\left(X^j\right) = E\left(\left(\sum_{i=1}^N X_i\right)^j\right) = E\left(\sum_{S_j \in \Omega_j}\left(\prod_{i \in S_j} X_i\right)\right)= \sum_{S_j \in \Omega_j}\left(E\left(\prod_{i \in S_j} X_i\right)\right)$$,

where $$\Omega_j$$ is the set of all ordered tuples of exactly $$j$$ indices from $$\{1,...,N\}$$ (repetitions allowed), so that the expansion needs no multinomial coefficients.
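
A minimal numerical check of this expansion (with invented data), enumerating all $$N^j$$ ordered tuples for small $$N$$ and $$j$$:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)

# Hypothetical data: M samples of N = 3 correlated, centred risks.
N, M, j = 3, 200_000, 3
A = rng.normal(size=(N, N))
X = rng.normal(size=(M, N)) @ A.T          # correlated columns
X -= X.mean(axis=0)                        # enforce E(X_i) = 0 in-sample

# Left-hand side: E(X^j) for the total X = X_1 + ... + X_N.
lhs = (X.sum(axis=1) ** j).mean()

# Right-hand side: sum over all N^j ordered index tuples S_j in Omega_j
# of the mixed moments E(prod_{i in S_j} X_i).
rhs = sum(np.prod(X[:, list(S)], axis=1).mean()
          for S in product(range(N), repeat=j))

print(lhs, rhs)   # equal up to floating-point rounding
```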

Since we have assumed $$E(X_i) = 0$$, each term $$E\left(\prod_{i \in S_j} X_i\right)$$ is a central mixed moment, which for $$j=2$$ is exactly a covariance; we denote it $$cov_{S_j} = E\left(\prod_{i \in S_j} X_i\right)$$.

We use our data set to estimate the mixed moments $$cov_{S_j}$$. Averaging these over all elements of $$\Omega_j$$ gives the mean mixed moment, denoted $$E(cov_{S_j})$$. Since there are $$N^j$$ elements of $$\Omega_j$$, this gives us


 * $$E\left(X^j\right) = \sum_{S_j \in \Omega_j} cov_{S_j} = N^jE(cov_{S_j}) $$.
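
For $$j = 2$$ this reduces to the familiar identity that the variance of the total is the sum of all $$N^2$$ entries of the covariance matrix, i.e. $$\mathrm{Var}(X) = N^2 E(cov_{S_2})$$; a minimal check with invented data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: samples of N = 4 correlated, centred risks.
N = 4
A = rng.normal(size=(N, N))
X = rng.normal(size=(100_000, N)) @ A.T
X -= X.mean(axis=0)

C = np.cov(X, rowvar=False, bias=True)   # all pairwise covariances cov_{S_2}
mean_cov = C.mean()                      # average over the N^2 ordered pairs

print(np.var(X.sum(axis=1)))             # Var(X) computed directly
print(N ** 2 * mean_cov)                 # N^j * E(cov_{S_j}) with j = 2
```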

So, given sampled data, we can estimate $$P(X)$$ in the following way:


 * $$P(X) = \mathcal{F}\left(\sum_{j=0}^\infty \frac{(itN)^jE(cov_{S_j})}{j!}\right)$$.
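
The following sketch assembles the whole pipeline under a strong simplifying assumption, namely that the risks are jointly normal, so that all mixed moments are determined by the covariances (Isserlis' theorem) and the series can be truncated safely; for heavier-tailed risks the higher-order $$cov_{S_j}$$ would have to be estimated directly, and the truncation handled with more care. All data and parameter values are invented.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(4)

# Hypothetical data: M samples of N = 3 correlated, centred risks.
N, M = 3, 100_000
A = rng.normal(size=(N, N)) / N
X = rng.normal(size=(M, N)) @ A.T
X -= X.mean(axis=0)

# Step 1: estimate the covariances; the variance of the total is their sum.
C = np.cov(X, rowvar=False, bias=True)
v = C.sum()                                # Var(X) = N^2 * E(cov_{S_2})

# Step 2: moments of the total under the joint-normality assumption
# (Isserlis): E(X^j) = 0 for odd j and (j-1)!! * v^(j/2) for even j.
def moment(j):
    if j % 2:
        return 0.0
    return np.prod(np.arange(j - 1, 0, -2, dtype=float)) * v ** (j // 2)

# Step 3: truncated characteristic function sum_j (it)^j E(X^j) / j!.
J = 60                                     # truncation order
T = 4.0 / np.sqrt(v)                       # CF is negligible beyond ~4 std devs
t = np.linspace(-T, T, 2001)
phi = sum((1j * t) ** j * moment(j) / factorial(j) for j in range(J + 1))

# Step 4: numerical Fourier inversion p(x) = (1/2pi) int e^{-itx} phi(t) dt.
dt = t[1] - t[0]
x = np.linspace(-4 * np.sqrt(v), 4 * np.sqrt(v), 201)
p = (np.exp(-1j * np.outer(x, t)) * phi).sum(axis=1).real * dt / (2 * np.pi)

# Compare with the exact normal density of the total.
exact = np.exp(-x ** 2 / (2 * v)) / np.sqrt(2 * np.pi * v)
print(np.max(np.abs(p - exact)))           # small numerical error
```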

When the number of variables increases, it is convenient to have a method of estimating the sum that requires less computing power.