Polynomial chaos

Polynomial chaos (PC), also called polynomial chaos expansion (PCE) and Wiener chaos expansion, is a method for representing a random variable in terms of a polynomial function of other random variables. The polynomials are chosen to be orthogonal with respect to the joint probability distribution of these random variables. Note that despite its name, PCE has no immediate connections to chaos theory. The word "chaos" here should be understood as "random".

PCE was first introduced in 1938 by Norbert Wiener using Hermite polynomials to model stochastic processes with Gaussian random variables. It was introduced to the physics and engineering community by R. Ghanem and P. D. Spanos in 1991 and generalized to other orthogonal polynomial families by D. Xiu and G. E. Karniadakis in 2002. Mathematically rigorous proofs of existence and convergence of generalized PCE were given by O. G. Ernst and coworkers in 2011.

PCE has found widespread use in engineering and the applied sciences because it makes it possible to deal with probabilistic uncertainty in the parameters of a system. In particular, PCE has been used as a surrogate model to facilitate uncertainty quantification analyses. PCE has also been widely used in stochastic finite element analysis and to determine the evolution of uncertainty in a dynamical system when there is probabilistic uncertainty in the system parameters.

Main principles
Polynomial chaos expansion (PCE) provides a way to represent a random variable $$Y$$ with finite variance (i.e., $$\operatorname{Var}(Y)<\infty$$) as a function of an $$M$$-dimensional random vector $$\mathbf{X}$$, using a polynomial basis that is orthogonal with respect to the distribution of this random vector. The prototypical PCE can be written as:


 * $$Y = \sum_{i\in\mathbb{N}}c_{i}\Psi_{i}(\mathbf{X}).$$

In this expression, $$c_{i}$$ is a coefficient and $$\Psi_{i}$$ denotes a polynomial basis function. Depending on the distribution of $$\mathbf{X}$$, different PCE types are distinguished.

Hermite polynomial chaos
The original PCE formulation used by Norbert Wiener was limited to the case where $$\mathbf{X}$$ is a random vector with a Gaussian distribution. Considering only the one-dimensional case (i.e., $$M=1$$ and $$\mathbf{X}=X$$), the polynomial basis functions orthogonal with respect to the Gaussian distribution are the Hermite polynomials $$H_i$$ of degree $$i$$. The PCE of $$Y$$ can then be written as:


 * $$Y = \sum_{i\in\mathbb{N}}c_{i}H_{i}(X)$$.
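By orthogonality, each coefficient is a projection, $$c_i = \operatorname{E}[Y\,H_i(X)]/\operatorname{E}[H_i(X)^2]$$, which can be evaluated numerically with Gauss–Hermite quadrature. The sketch below is a minimal illustration (not from the cited works) using NumPy's probabilists' Hermite polynomials $$He_i$$, for which $$\operatorname{E}[He_i(X)^2]=i!$$; it expands $$Y=\exp(X)$$, whose coefficients are known in closed form to be $$e^{1/2}/i!$$:

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He  # probabilists' Hermite polynomials

def hermite_pce_coeffs(f, order, quad_deg=40):
    """Project f(X), X ~ N(0,1), onto He_0..He_order: c_i = E[f(X) He_i(X)] / i!."""
    x, w = He.hermegauss(quad_deg)        # nodes/weights for weight exp(-x^2/2)
    w = w / np.sqrt(2.0 * np.pi)          # normalize to the standard normal density
    fx = f(x)
    c = np.empty(order + 1)
    for i in range(order + 1):
        psi = He.hermeval(x, np.eye(order + 1)[i])       # He_i at the nodes
        c[i] = np.sum(w * fx * psi) / math.factorial(i)  # E[He_i(X)^2] = i!
    return c

c = hermite_pce_coeffs(np.exp, 4)
# Closed form: exp(x) = e^{1/2} * sum_i He_i(x)/i!, so c_i = e^{1/2}/i!
```

The quadrature degree 40 is far more than needed for this smooth integrand; in practice it is chosen relative to the truncation order of the expansion.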

Generalized polynomial chaos
Xiu (in his PhD under Karniadakis at Brown University) generalized the result of Cameron–Martin to various continuous and discrete distributions using orthogonal polynomials from the so-called Askey scheme and demonstrated $$L_2$$ convergence in the corresponding Hilbert functional space. This is popularly known as the generalized polynomial chaos (gPC) framework. The gPC framework has been applied in areas including stochastic fluid dynamics, stochastic finite elements, solid mechanics, nonlinear estimation, the evaluation of finite word-length effects in non-linear fixed-point digital systems, and probabilistic robust control. It has been demonstrated that gPC-based methods are computationally superior to Monte Carlo-based methods in a number of applications. However, the method has a notable limitation: for large numbers of random variables, polynomial chaos becomes very computationally expensive, and Monte Carlo methods are typically more feasible.
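To illustrate the gPC pairing of input distributions with Askey-scheme polynomial families, consider a uniform input $$X \sim U(-1,1)$$, for which the orthogonal family is the Legendre polynomials $$P_i$$ with $$\operatorname{E}[P_i(X)^2]=1/(2i+1)$$. The sketch below is a minimal, hypothetical example (not from the cited works) that projects $$Y=X^2$$, whose exact expansion is $$X^2 = \tfrac{1}{3}P_0 + \tfrac{2}{3}P_2$$:

```python
import numpy as np
from numpy.polynomial import legendre as Leg

def legendre_pce_coeffs(f, order, quad_deg=20):
    """Project f(X), X ~ U(-1,1), onto P_0..P_order."""
    x, w = Leg.leggauss(quad_deg)        # Gauss-Legendre nodes/weights on [-1,1]
    fx = f(x)
    c = np.empty(order + 1)
    for i in range(order + 1):
        psi = Leg.legval(x, np.eye(order + 1)[i])   # P_i at the nodes
        # c_i = E[f(X) P_i(X)] / E[P_i(X)^2], with density 1/2 on [-1,1]
        c[i] = 0.5 * np.sum(w * fx * psi) * (2 * i + 1)
    return c

c = legendre_pce_coeffs(np.square, 3)
# Exact expansion: x^2 = (1/3) P_0 + (2/3) P_2
```

Swapping the quadrature rule and polynomial family (e.g. Laguerre for gamma-distributed inputs, Charlier for Poisson) yields the other members of the gPC family.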

Arbitrary polynomial chaos
More recently, chaos expansion was generalized to the arbitrary polynomial chaos expansion (aPC), a so-called data-driven generalization of PC. Like all polynomial chaos expansion techniques, aPC approximates the dependence of simulation model output on model parameters by expansion in an orthogonal polynomial basis. The aPC generalizes chaos expansion techniques to arbitrary distributions with arbitrary probability measures, which can be discrete, continuous, or discretized continuous, and which can be specified either analytically (as probability density/cumulative distribution functions), numerically as a histogram, or as raw data sets. The aPC at finite expansion order only demands the existence of a finite number of moments and does not require complete knowledge, or even existence, of a probability density function. This avoids the need to assign parametric probability distributions that are not sufficiently supported by the limited available data. Alternatively, it allows modellers to choose the shapes of their statistical assumptions freely, without technical constraints. Investigations indicate that the aPC shows an exponential convergence rate and converges faster than classical polynomial chaos expansion techniques. These techniques are still maturing, but their impact on computational fluid dynamics (CFD) models is already considerable.
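A minimal sketch of the data-driven construction: given only raw samples, a polynomial basis orthonormal with respect to the empirical measure can be built by Gram–Schmidt orthogonalization of the monomials, which numerically amounts to a QR decomposition of the monomial Vandermonde matrix. This illustration is not the moment-matrix algorithm of the aPC literature, but it produces an equivalent basis under the same finite-moment assumption:

```python
import numpy as np

def apc_basis(data, order):
    """Monomial-basis coefficients of polynomials orthonormal w.r.t. the
    empirical measure of `data` (a sketch of a moment-based construction)."""
    n = len(data)
    V = np.vander(data, order + 1, increasing=True)  # columns 1, x, x^2, ...
    # V/sqrt(n) = Q R with Q orthonormal under the sample-mean inner product,
    # so column j of inv(R) holds the monomial coefficients of psi_j.
    _, R = np.linalg.qr(V / np.sqrt(n))
    return np.linalg.inv(R)

rng = np.random.default_rng(0)
data = rng.standard_normal(100_000)   # raw data set; no pdf is assumed known
C = apc_basis(data, 3)

# The empirical Gram matrix of the constructed basis is the identity
V = np.vander(data, 4, increasing=True)
G = (V @ C).T @ (V @ C) / len(data)
```

Only the first $$2 \times \text{order}$$ sample moments enter this construction, which is why aPC requires neither a density function nor an Askey-scheme distribution.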

Polynomial chaos and incomplete statistical information
In many practical situations, only incomplete and inaccurate statistical knowledge of uncertain input parameters is available. Fortunately, to construct a finite-order expansion, only partial information on the probability measure is required, which can be simply represented by a finite number of statistical moments. Any order of expansion is only justified if accompanied by reliable statistical information on the input data. Thus, incomplete statistical information limits the utility of high-order polynomial chaos expansions.

Polynomial chaos and non-linear prediction
Polynomial chaos can be utilized in the prediction of non-linear functionals of Gaussian stationary increment processes conditioned on their past realizations. Specifically, such a prediction is obtained by deriving the chaos expansion of the functional with respect to a special basis for the Gaussian Hilbert space generated by the process, with the property that each basis element is either measurable with respect to the given samples or independent of them. For example, this approach leads to an easy prediction formula for fractional Brownian motion.

Bayesian polynomial chaos
In a non-intrusive setting, the estimation of the expansion coefficients $$c_{i}$$ for a given set of basis functions $$\Psi_{i}$$ can be considered as a Bayesian regression problem by constructing a surrogate model. This approach has benefits in that analytical expressions for the data evidence (in the sense of Bayesian inference) as well as the uncertainty of the expansion coefficients are available. The evidence then can be used as a measure for the selection of expansion terms and pruning of the series (see also Bayesian model comparison). The uncertainty of the expansion coefficients can be used to assess the quality and trustworthiness of the PCE, and furthermore the impact of this assessment on the actual quantity of interest $$Y$$.

Let $$D= \{\mathbf{X}^{(j)}, Y^{(j)}\}$$ be a set of $$j = 1,...,N_s$$ pairs of input-output data that is used to estimate the expansion coefficients $$c_{i}$$. Let $$M$$ be the data matrix with elements $$[M]_{ji} = \Psi_i(\mathbf{X}^{(j)})$$, let $$\vec Y = (Y^{(1)},..., Y^{(j)},...,Y^{(N_s)})^T$$ be the $$N_s$$ output data written in vector form, and let $$\vec c = (c_1,...,c_i,...,c_{N_p})^T$$ be the expansion coefficients in vector form. Under the assumption that the uncertainty of the PCE is of Gaussian type with unknown variance and a scale-invariant prior, the expectation value $$\langle \cdot \rangle$$ of the expansion coefficients is

$$\langle \vec c \rangle = (M^T \;M)^{-1}\; M^T\; \vec Y$$

With $$H = (M^T M)^{-1}$$, the covariance of the coefficients is

$$\text{Cov}(c_m, c_n) = \frac{\chi_{\text{min}}^2}{N_s-N_p-2} H_{m,n}$$

where $$\chi_{\text{min}}^2= \vec Y^T( \mathrm{I}-M\, H\, M^T) \;\vec Y$$ is the minimal misfit and $$\mathrm{I}$$ is the identity matrix. The uncertainty of the estimate for the coefficient $$c_m$$ is then given by $$\text{Var}(c_m) = \text{Cov}(c_m, c_m)$$. Thus the uncertainty of the estimates for the expansion coefficients can be obtained with simple vector-matrix multiplications. For a given input probability density function $$p(\mathbf{X})$$, it can be shown that the second moment of the quantity of interest is then simply

$$\langle Y^2 \rangle = \underbrace{\sum_{m,m'} \int\Psi_m (\mathbf{X}) \Psi_{m'} (\mathbf{X} ) \langle c_m\rangle \langle c_{m'} \rangle p(\mathbf{X})\;  dV_{\mathbf{X}} } _{=I_1} + \underbrace{ \sum_{m,m'} \int\Psi_m (\mathbf{X}) \Psi_{m'} (\mathbf{X} )\; \text{Cov}(c_m, c_{m'})\; p(\mathbf{X})\; dV_{\mathbf{X}}} _{=I_2} $$

This equation amounts to the matrix-vector multiplications above plus a marginalization with respect to $$\mathbf{X}$$. The first term $$I_1$$ determines the primary uncertainty of the quantity of interest $$Y$$, as obtained from the PCE used as a surrogate. The second term $$I_2$$ constitutes an additional inferential uncertainty (often of mixed aleatoric-epistemic type) in $$Y$$ that is due to the finite uncertainty of the PCE itself. If enough data are available, in terms of both quality and quantity, it can be shown that $$\text{Var}(c_m)$$ becomes negligibly small, and so does $$I_2$$. This can be judged by forming the ratio $$\frac{I_1}{I_1+I_2}$$, which lies in the interval $$[0,1]$$ and quantifies how much of the total uncertainty stems from the surrogate itself. For example, if $$\frac{I_1}{I_1+I_2} \approx 0.5$$, then half of the uncertainty stems from the PCE itself, and actions can be taken to improve the PCE or to gather more data. If $$\frac{I_1}{I_1+I_2} \approx 1$$, then the PCE's own uncertainty is low and the PCE may be deemed trustworthy.

In Bayesian surrogate model selection, the probability of a particular surrogate model, i.e. a particular set $$S$$ of expansion coefficients $$c_{i}$$ and basis functions $$\Psi_{i}$$, is given by the evidence of the data, $$Z_S$$,

$$Z_S = \Omega_{N_p} \mid H \mid^{-1/2} (\chi^2_{\text{min}})^{-\frac{N_s-N_p}{2}} \frac{\Gamma\big(\frac{N_p}{2}\big) \Gamma \big( \frac{N_s-N_p}{2}\big)}{\Gamma \big(\frac{N_s}{2}\big)}$$

where $$\Gamma$$ is the Gamma function, $$\mid H \mid$$ is the determinant of $$H$$, $$N_s$$ is the number of data points, and $$\Omega_{N_p}$$ is the solid angle in $$N_p$$ dimensions, where $$N_p$$ is the number of terms in the PCE.

Analogous findings can be transferred to the computation of PCE-based sensitivity indices. Similar results can be obtained for Kriging.