Functional additive models

In statistics, functional additive models (FAM) can be viewed as extensions of generalized functional linear models where the linearity assumption between the response (scalar or functional) and the functional linear predictor is replaced by an additivity assumption.

Functional Additive Model
In these models, functional predictors ($$ X $$) are paired with responses ($$ Y $$) that can be either scalar or functional. The response can follow a continuous or discrete distribution, and this distribution may belong to the exponential family. In the latter case, a canonical link connects predictors and responses. Functional predictors (or responses) can be viewed as random trajectories generated by a square-integrable stochastic process. Using functional principal component analysis and the Karhunen-Loève expansion, these processes can be equivalently expressed as a countable sequence of their functional principal component scores (FPCs) and eigenfunctions. In the FAM, the responses (scalar or functional), conditional on the predictor functions, are modeled as functions of the functional principal component scores of the predictor function in an additive structure. This model can be categorized as a Frequency Additive Model since it is additive in the predictor FPC scores.
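The reduction of trajectories to FPC scores can be illustrated with a minimal sketch. It assumes curves observed without noise on a common, equally spaced grid, which is a simplification; the function name `fpca_scores` and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np

def fpca_scores(X, t, n_components):
    """Discretized FPCA for curves on a common, equally spaced grid.

    X : (n, m) array, row i holds X_i(t_1), ..., X_i(t_m)
    t : (m,) equally spaced grid
    Returns the mean function, eigenvalues, eigenfunctions (m, K) and scores (n, K).
    """
    dt = t[1] - t[0]
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = Xc.T @ Xc / X.shape[0]               # sample covariance surface on the grid
    evals, evecs = np.linalg.eigh(cov * dt)    # discretized covariance operator
    order = np.argsort(evals)[::-1][:n_components]
    lam = evals[order]
    phi = evecs[:, order] / np.sqrt(dt)        # normalise so that sum(phi_k**2) * dt = 1
    xi = Xc @ phi * dt                         # xi_ik approximates int (X_i - mu) phi_k dt
    return mu, lam, phi, xi

# Simulated example (assumed data): two sine components with decreasing variance
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
true_scores = rng.normal(size=(200, 2)) * np.array([2.0, 1.0])
basis = np.sqrt(2.0) * np.vstack([np.sin(np.pi * t), np.sin(2 * np.pi * t)])
X = true_scores @ basis
mu, lam, phi, xi = fpca_scores(X, t, n_components=2)
```

By construction, the sample covariance matrix of the returned scores is diagonal with the estimated eigenvalues on the diagonal, which mirrors the uncorrelatedness of FPC scores.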

Continuously Additive Model
The Continuously Additive Model (CAM) assumes additivity in the time domain. The functional predictors are assumed to be smooth across the time domain. Since the times contained in an interval domain form an uncountable set, an unrestricted time-additive model is not feasible. This motivates approximating sums of additive functions by integrals, so that the traditional vector additive model is replaced by a smooth additive surface. CAM can handle generalized responses paired with multiple functional predictors.

Functional Generalized Additive Model
The Functional Generalized Additive Model (FGAM) is an extension of the generalized additive model with a scalar response and a functional predictor. This model can also deal with multiple functional predictors. The CAM and the FGAM are essentially equivalent apart from implementation details and can therefore be covered under one description. They can be categorized as Time-Additive Models.

Model
The Functional Additive Models for scalar and functional responses, respectively, are given by
 * $$ E(Y\mid X) = \mu_Y + \sum_{k=1}^\infty f_k(\xi_k) $$
 * $$ E(Y(t)\mid X) = \mu_Y(t) + \sum_{k=1}^\infty \sum_{m=1}^\infty f_{km}(\xi_k)\psi_m(t),$$

where $$ \xi_k $$ and $$ \zeta_m $$ are FPC scores of the processes $$ X $$ and $$ Y $$ respectively, $$ \phi_k $$ and $$ \psi_m $$ are the eigenfunctions of processes $$ X $$ and $$ Y $$ respectively, and $$ f_k$$ and $$ f_{km} $$ are arbitrary smooth functions.

To ensure identifiability, one may require $$ Ef_k(\xi_k) = 0,\quad k=1,2,\ldots, $$ and $$ Ef_{km}(\xi_k) = 0,\quad k=1,2,\ldots,\quad m=1,2,\ldots $$

Implementation
The above model is considered under the assumption that the true FPC scores $$ \xi_k $$ for the predictor processes are known. In general, estimation in the generalized additive model requires a backfitting algorithm or smooth backfitting to account for the dependencies between predictors. FPCs, however, are always uncorrelated, and if the predictor processes are assumed to be Gaussian, then the FPCs are independent. Then
 * $$ E(Y-\mu_Y|\xi_k)=E\{E(Y-\mu_Y|X)|\xi_k\}=E\{\sum_{j=1}^{\infty}f_j(\xi_j)|\xi_k\}=f_k(\xi_k),$$

similarly for functional responses
 * $$ E(\zeta_m|\xi_k)=f_{km}(\xi_k). $$

This simplifies the estimation: only one-dimensional smoothing of responses against individual predictor scores is required, and this yields consistent estimates of $$ f_k. $$ In data analysis, one needs to estimate $$ \xi_k $$ before proceeding to infer the functions $$ f_k $$ and $$ f_{km} $$, so there are errors in the predictors. Functional principal component analysis generates estimates $$ \hat{\xi}_k $$ of $$ \xi_k $$ for individual predictor trajectories, along with estimates of the eigenfunctions, eigenvalues, mean functions and covariance functions. Different smoothing methods can be applied to the data $$ \{\hat{\xi}_{ik},Y_i\}_{i=1,...,n} $$ and $$ \{\hat{\xi}_{ik},\hat{\zeta}_{im}\}_{i=1,...,n} $$ to estimate $$ f_k $$ and $$ f_{km} $$, respectively.

The fitted Functional Additive Model for a scalar response is given by
 * $$ \hat{E}(Y|X)=\bar{Y}+\sum_{k=1}^{K}\hat{f}_k(\xi_k), $$

and the fitted Functional Additive Model for functional responses is given by
 * $$ \hat{E}(Y(t)|X)=\hat{\mu}_Y(t)+\sum_{m=1}^{M}\sum_{k=1}^{K}\hat{f}_{km}(\xi_k)\hat{\psi}_m(t),\quad t\in T. $$
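The one-dimensional smoothing step for a scalar response can be sketched as follows. The sketch uses a Nadaraya-Watson smoother with a Gaussian kernel and a fixed bandwidth, which is only one of many possible smoothing methods; the names `nw_smooth` and `fit_fam`, the bandwidth value and the simulated scores are assumptions made for illustration.

```python
import numpy as np

def nw_smooth(x, y, x_eval, bandwidth):
    """Nadaraya-Watson estimate of E(y | x) at the points x_eval (Gaussian kernel)."""
    u = (np.asarray(x_eval)[:, None] - np.asarray(x)[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    return (w @ y) / w.sum(axis=1)

def fit_fam(scores, y, bandwidth):
    """Return a predictor for the scalar-response FAM.

    scores : (n, K) estimated FPC scores, y : (n,) scalar responses.
    Each f_k is estimated by smoothing Y - mean(Y) against the k-th score alone.
    """
    y_bar = y.mean()
    resid = y - y_bar

    def predict(new_scores):
        new_scores = np.atleast_2d(new_scores)
        out = np.full(new_scores.shape[0], y_bar)
        for k in range(scores.shape[1]):
            out += nw_smooth(scores[:, k], resid, new_scores[:, k], bandwidth)
        return out

    return predict

# Simulated example (assumed data): independent Gaussian scores, additive truth
rng = np.random.default_rng(0)
scores = rng.normal(size=(500, 2))
truth = scores[:, 0] ** 2 - 1.0 + np.sin(scores[:, 1])
y = truth + 0.1 * rng.normal(size=500)
predict = fit_fam(scores, y, bandwidth=0.3)
fitted = predict(scores)
```

Smoothing against one score at a time is justified here only because the simulated scores are independent; with dependent predictors a backfitting-type procedure would be needed.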

Note: The truncation points $$ K $$ and $$ M $$ need to be chosen data-adaptively. Possible methods include pseudo-AIC, fraction of variance explained, minimization of prediction error, and cross-validation.
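As an illustration of the fraction-of-variance-explained criterion, the following sketch picks the smallest truncation point whose leading eigenvalues reach a given share of the total variance. The function name `choose_truncation`, the threshold and the eigenvalues are hypothetical, and the eigenvalues are assumed to be sorted in decreasing order.

```python
import numpy as np

def choose_truncation(eigenvalues, threshold=0.95):
    """Smallest K such that the first K eigenvalues explain at least `threshold`
    of the total variance. Eigenvalues are assumed sorted in decreasing order."""
    lam = np.clip(np.asarray(eigenvalues, dtype=float), 0.0, None)
    fve = np.cumsum(lam) / lam.sum()              # cumulative fraction of variance explained
    k = int(np.searchsorted(fve, threshold)) + 1  # first index with fve >= threshold
    return min(k, lam.size), fve

# Hypothetical eigenvalues: cumulative fractions are 0.5, 0.75, 0.875, 0.9375, 1.0
K, fve = choose_truncation([4.0, 2.0, 1.0, 0.5, 0.5], threshold=0.9)
```

With these illustrative eigenvalues, the 90% threshold is first reached by four components.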

Extensions
For the case of multiple functional predictors with a scalar response, the Functional Additive Model can be extended by fitting a functional regression that is additive in the FPCs of each of the predictor processes $$ X_j,j=1,...,d $$. The model considered here is the Additive Functional Score Model (AFSM), given by
 * $$ E(Y|X_1,X_2,...,X_d)=\sum_{j=1}^{d}\sum_{k}f_{jk}(\xi_{jk}) $$

In the case of multiple predictors, the FPCs of different predictors are in general correlated, and a smooth backfitting technique has been developed to obtain consistent estimates of the component functions $$ f_{jk} $$ when the predictors are observed with errors having an unknown distribution.
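To show why correlated scores call for an iterative fit, the following sketch applies classical backfitting with a Nadaraya-Watson smoother. This is a simpler, swapped-in technique, not the smooth backfitting procedure referred to above, and it ignores errors in the predictors. The names `nw_fit` and `backfit`, the bandwidth, the iteration count and the simulated data are all assumptions.

```python
import numpy as np

def nw_fit(x, y, bandwidth):
    """Nadaraya-Watson fit of y on x, evaluated at the observed x (Gaussian kernel)."""
    u = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    return (w @ y) / w.sum(axis=1)

def backfit(scores, y, bandwidth=0.3, n_iter=20):
    """Classical backfitting for E(Y | scores) = alpha + sum_j f_j(scores[:, j])."""
    n, p = scores.shape
    alpha = y.mean()
    f = np.zeros((n, p))                       # f[:, j] holds f_j at the observed scores
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove the intercept and all components except the j-th
            partial = y - alpha - f.sum(axis=1) + f[:, j]
            f[:, j] = nw_fit(scores[:, j], partial, bandwidth)
            f[:, j] -= f[:, j].mean()          # centring for identifiability
    return alpha, f

# Simulated example (assumed data): two scores with correlation 0.5
rng = np.random.default_rng(1)
n = 300
z = rng.normal(size=(n, 2))
scores = np.column_stack([z[:, 0], 0.5 * z[:, 0] + np.sqrt(0.75) * z[:, 1]])
y = scores[:, 0] ** 2 + np.sin(2 * scores[:, 1]) + 0.1 * rng.normal(size=n)
alpha, f = backfit(scores, y)
fitted = alpha + f.sum(axis=1)
```

Each pass smooths the partial residuals against one score while holding the other components fixed, which is how the dependence between predictors enters the fit.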

Model
Since the number of time points on an interval domain is uncountable, an unrestricted time-additive model $$ E(Y|X)=\sum_{\{t\in[0,T]\}}f_t(X(t))$$ is not feasible. Thus a sequence of time-additive models is considered on an increasingly dense finite time grid $$ t_1,t_2,...,t_m $$ in $$ T $$ leading to
 * $$ E(Y|X(t_1),...,X(t_m))= E(Y)+\sum_{j=1}^m f_j\{X(t_j)\}, $$

where $$ f_j(\cdot)=g(t_j,\cdot) $$ for a smooth bivariate function $$ g $$ with $$ E[g\{t_j,X(t_j)\}]=0 $$ (to ensure identifiability). In the limit $$ m\rightarrow\infty $$ this becomes the continuously additive model
 * $$ E(Y|X)=E(Y)+\lim_{m\to\infty}\frac{1}{m}\sum_{j=1}^mg\{t_j,X(t_j)\}=E(Y)+\int_{T}g\{t,X(t)\}dt. $$
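The passage from the averaged sum to the integral can be checked numerically. The sketch below assumes the domain $$ T=[0,1] $$, a midpoint grid, and hypothetical choices of the surface `g` and the trajectory `x_func`; the identifiability constraint is ignored because only the convergence of the sum is of interest.

```python
import numpy as np

def g(t, x):
    """Hypothetical smooth additive surface g(t, x)."""
    return np.exp(t) * x

def x_func(t):
    """Hypothetical smooth predictor trajectory X(t)."""
    return t ** 2

def cam_riemann(m):
    """(1/m) * sum_j g(t_j, X(t_j)) on a midpoint grid of [0, 1]."""
    t = (np.arange(m) + 0.5) / m
    return np.mean(g(t, x_func(t)))

# Exact value of the integral of t^2 * exp(t) over [0, 1] is e - 2
exact = np.e - 2.0
errors = [abs(cam_riemann(m) - exact) for m in (5, 50, 500)]
```

The approximation error shrinks as the grid becomes denser, in line with the limit stated above.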

Generalized Functional Linear Model
For $$ g\{t,X(t)\}=\beta(t)\{X(t)-EX(t)\} $$, the model reduces to a generalized functional linear model:
 * $$ E(Y|X)=E(Y)+\int_{T}\beta(t)\{X(t)-EX(t)\}dt. $$

Functional Transformation Model
For a non-Gaussian predictor process, the choice $$ g\{t,X(t)\}=\beta(t)[\zeta\{X(t)\}-E\zeta\{X(t)\}], $$ where $$ \zeta $$ is a smooth transformation of $$ X(t) $$, reduces the CAM to a Functional Transformation Model.

Extensions
This model has also been introduced with a different notation under the name Functional Generalized Additive Model (FGAM). Adding a link function $$ h $$ to the mean response and applying a probability transformation $$ G_t $$ to $$ X(t) $$ yields the FGAM, given by
 * $$ h\{E(Y|X)\}=\theta_0+\int_{T}F[G_t\{X(t)\},t]dt, $$

where $$ \theta_0 $$ is the intercept.
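One way to picture the probability transformation is the following sketch. It assumes that $$ G_t $$ is estimated by the empirical distribution function of $$ X(t) $$ across subjects at each grid point, which is an assumption of this example and not a statement about any particular implementation; the function name and the simulated curves are hypothetical, and ties are assumed absent.

```python
import numpy as np

def empirical_probability_transform(X):
    """Column-wise empirical CDF transform.

    X : (n, m) array of curves on a common grid. Entry (i, j) of the result is
    the fraction of curves l with X_l(t_j) <= X_i(t_j), assuming no ties.
    """
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0) + 1   # ranks 1..n within each column
    return ranks / n

# Simulated example (assumed data): 50 rough curves on 20 grid points
rng = np.random.default_rng(2)
X = rng.normal(size=(50, 20)).cumsum(axis=1)
U = empirical_probability_transform(X)
```

After the transformation, every value lies in (0, 1], so the surface $$ F $$ only has to be estimated over a bounded rectangle.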

Note: For estimation and implementation see