
Functional nonlinear regression models
Direct nonlinear extensions of the classical functional linear model (FLM) still involve a linear predictor, but combine it with a nonlinear link function, analogous to the generalized linear model as an extension of the conventional linear model. Developments towards fully nonparametric regression models for functional data run into problems such as the curse of dimensionality. To bypass the "curse" and the metric selection problem, one is led to consider nonlinear functional regression models that are subject to structural constraints but remain reasonably flexible. The goal is models that retain polynomial rates of convergence while being more flexible than, say, the functional linear model. Such models are particularly useful when diagnostics for the functional linear model indicate lack of fit, which is often encountered in practice. Functional nonlinear regression models such as generalized functional linear models and their extensions to single and multiple index models provide not only enhanced flexibility but also structural stability, with model fits converging at polynomial rates. In particular, functional polynomial models, functional single and multiple index models, and functional additive models are three special cases of functional nonlinear regression models.

Functional polynomial regression models
Functional polynomial regression models may be viewed as a natural extension of functional linear models (FLMs) with scalar responses, analogous to the extension of the linear regression model to the polynomial regression model. For a scalar response $$Y$$ and a functional covariate $$X(\cdot)$$ with domain $$\mathcal{T}$$ and centered predictor process $$X^c$$, the simplest and most prominent member of this family is the quadratic functional regression model,

$$\mathbb{E}(Y|X) = \alpha + \int_\mathcal{T}\beta(t)X^c(t)\,dt + \int_\mathcal{T} \int_\mathcal{T} \gamma(s,t) X^c(s)X^c(t) \,ds\,dt $$

where $$X^c(\cdot) = X(\cdot) - \mathbb{E}(X(\cdot))$$ is the centered functional covariate, $$\alpha$$ is a scalar coefficient, and $$\beta(\cdot)$$ and $$\gamma(\cdot,\cdot)$$ are coefficient functions with domains $$\mathcal{T}$$ and $$\mathcal{T}\times\mathcal{T}$$, respectively. In addition to the parameter function $$\beta$$ that the functional quadratic regression model shares with the functional linear model, it also features a parameter surface $$\gamma$$. By analogy with FLMs with scalar responses, functional polynomial models can be estimated by expanding both the centered covariate $$X^c$$ and the coefficient functions $$\beta$$ and $$\gamma$$ in an orthonormal basis.

The extension to higher-order polynomials is straightforward. These models can be equivalently represented as polynomials in the corresponding functional principal components (FPCs). A natural question is then whether the linear model suffices or needs to be extended to include a quadratic term; a corresponding test is available to answer this question.
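As an illustration, the quadratic functional regression model can be fitted by truncating the FPC expansion and running a polynomial least-squares regression on the scores. The sketch below uses simulated curves built from two hypothetical basis functions; the grid size, truncation level and quadratic coefficients are illustrative choices, not part of any reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 300, 50                      # n curves observed on an m-point grid
t = np.linspace(0, 1, m)
w = 1.0 / m                         # quadrature weight for integrals on [0, 1]

# Simulate centered functional covariates from two hypothetical basis functions.
phi1 = np.sqrt(2) * np.sin(np.pi * t)
phi2 = np.sqrt(2) * np.sin(2 * np.pi * t)
xi = rng.normal(size=(n, 2)) * np.array([1.0, 0.5])   # true FPC scores
X = xi @ np.vstack([phi1, phi2])

# True quadratic model: E(Y|X) = 1 + 2*xi1 - xi2 + 0.8*xi1^2
Y = 1 + 2 * xi[:, 0] - xi[:, 1] + 0.8 * xi[:, 0] ** 2 \
    + rng.normal(scale=0.1, size=n)

# Estimate the FPCs from the sample covariance evaluated on the grid.
Xc = X - X.mean(axis=0)
vals, vecs = np.linalg.eigh(Xc.T @ Xc / n * w)
phis = vecs[:, np.argsort(vals)[::-1][:2]].T / np.sqrt(w)   # orthonormal in L2
scores = Xc @ phis.T * w                                    # estimated scores

# Quadratic regression in the scores: intercept, linear and pairwise terms.
Z = np.column_stack([np.ones(n), scores, scores[:, 0] ** 2,
                     scores[:, 0] * scores[:, 1], scores[:, 1] ** 2])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
resid = Y - Z @ coef
print("R^2:", 1 - resid.var() / Y.var())
```

Since the design matrix contains all linear and quadratic terms in the scores, sign flips of the estimated eigenfunctions are absorbed by the coefficients.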

Functional single and multiple index models
A functional multiple index model, with symbols as defined above, is given by

$$ \mathbb{E}(Y|X) = g\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt, \ldots, \int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right) $$

Here $$g$$ denotes an unknown smooth function defined on a $$p$$-dimensional domain. The case $$p=1$$ yields a functional single index model, while multiple index models correspond to $$p>1$$. For $$p>1$$, however, this model suffers from the curse of dimensionality: with relatively small sample sizes, the resulting estimator often has large variance.

To avoid the curse of dimensionality incurred by the $$p$$-dimensional smoother, one may resort to an alternative $$p$$-component functional multiple index model, $$\mathbb{E}(Y|X) = g_1\left(\int_{\mathcal{T}} X^c(t) \beta_1(t)\,dt\right)+ \cdots+ g_p\left(\int_{\mathcal{T}} X^c(t) \beta_p(t)\,dt \right).$$
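To see why an index structure tames the curse of dimensionality, the sketch below generates data from a functional single index model ($$p=1$$) and recovers the link $$g$$ with a one-dimensional kernel smoother. The index function $$\beta$$ and the link $$g=\sin$$ are treated as known purely to generate data; in practice both must be estimated from the sample, and all tuning constants here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 400, 60
t = np.linspace(0, 1, m)
w = 1.0 / m

# Hypothetical index function beta; curves are built from three basis functions.
beta = np.sqrt(2) * np.cos(np.pi * t)
basis = np.vstack([np.sqrt(2) * np.sin(np.pi * t),
                   np.sqrt(2) * np.cos(np.pi * t),
                   np.sqrt(2) * np.sin(2 * np.pi * t)])
X = rng.normal(size=(n, 3)) @ basis
Xc = X - X.mean(axis=0)
index = Xc @ beta * w               # the single index \int X^c(t) beta(t) dt
Y = np.sin(index) + rng.normal(scale=0.05, size=n)

# Given the index, estimating g is a one-dimensional smoothing problem,
# which is what lets the model escape the curse of dimensionality.
def g_hat(u, h=0.15):
    """Nadaraya-Watson kernel estimate of g(u) = E(Y | index = u)."""
    K = np.exp(-0.5 * ((u - index) / h) ** 2)
    return (K @ Y) / K.sum()

print(g_hat(0.5), np.sin(0.5))
```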

Functional additive models (FAMs)
Let $$X(t) = \sum_{k=1}^\infty x_k \phi_k(t)$$ denote an expansion of a functional covariate $$X$$ with domain $$\mathcal{T}$$ in an orthonormal basis $$\{\phi_k\}_{k=1}^\infty$$ of the function space.

A functional linear model with scalar responses can thus be written as follows,

$$\mathbb{E}(Y|X)=\mathbb{E}(Y) + \sum_{k=1}^\infty \beta_k x_k.$$

One form of FAM is obtained by replacing the linear function of $$x_k$$ in the above expression, i.e., $$\beta_k x_k$$, by a general smooth function $$f_k$$, analogous to the extension of multiple linear regression to additive models.

The resulting functional additive model is thus given by $$\mathbb{E}(Y|X)=\mathbb{E}(Y) + \sum_{k=1}^\infty f_k(x_k),$$ where the $$f_k$$ satisfy $$\mathbb{E}(f_k(x_k))=0$$ for all $$k\in\mathbb{N}$$. This constraint on the smooth functions $$f_k$$ ensures identifiability, in the sense that the estimates of the additive component functions do not interfere with that of the intercept term $$\mathbb{E}(Y)$$.
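A minimal sketch of this form of FAM, under the simplifying assumption of independent Gaussian FPC scores: each component $$f_k$$ can then be estimated by a marginal one-dimensional kernel smoother of $$Y$$ on $$x_k$$, centered by subtracting the sample mean of $$Y$$. The component functions and all numerical choices below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Independent FPC scores; for Gaussian processes the scores are independent,
# which allows each f_k to be estimated by a marginal 1-D smoother.
xi = rng.normal(size=(n, 2)) * np.array([1.0, 0.7])
f1 = lambda x: np.sin(x)            # true additive components (centered)
f2 = lambda x: x ** 2 - 0.49        # E(xi2^2) = 0.49, so E(f2(xi2)) = 0
Y = 2.0 + f1(xi[:, 0]) + f2(xi[:, 1]) + rng.normal(scale=0.1, size=n)

def component(k, u, h=0.2):
    """Marginal kernel estimate of f_k(u) = E(Y | xi_k = u) - E(Y)."""
    K = np.exp(-0.5 * ((u - xi[:, k]) / h) ** 2)
    return (K @ Y) / K.sum() - Y.mean()

print(component(0, 1.0), f1(1.0))
```

Centering by the mean of $$Y$$ mirrors the identifiability constraint $$\mathbb{E}(f_k(x_k))=0$$ in the model.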

Another form of FAM consists of a sequence of time-additive models on increasingly dense finite grids of size $$p$$, under the assumption that the functions $$f_j$$ vary smoothly in $$t$$. Such models may be written as $$\mathbb{E}(Y|X(t_1),\ldots,X(t_p))=\sum_{j=1}^p f_j(X(t_j)),$$ where $$\{t_1,\ldots,t_p\}$$ is a dense grid on $$\mathcal{T}$$ with increasing size $$p\in\mathbb{N}$$, and $$f_j(x) = g(t_j,x)$$, $$j=1,\ldots,p$$, for a smooth bivariate function $$g$$. In the limit $$p\to\infty$$, this FAM becomes the continuously additive model.

Extensions
An obvious and direct extension of FLMs with scalar responses is to add a link function, creating a generalized functional linear model (GFLM) by analogy with the extension of the linear model to the generalized linear model (GLM). Its three components are:

 * 1) Linear predictor $$\eta = \beta_0 + \int_{\mathcal{T}} X^c(t)\beta(t)\,dt$$;  [systematic component]
 * 2) Variance function $$\text{Var}(Y|X) = V(\mu)$$, where $$\mu = \mathbb{E}(Y|X)$$ is the conditional mean;  [random component]
 * 3) Link function $$g$$ connecting the conditional mean $$\mu$$ and the linear predictor $$\eta$$ through $$\mu=g(\eta)$$.  [systematic component]
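These three components can be illustrated with a small simulation: a functional logistic regression (logit link, Bernoulli variance function) fitted by reducing the curves to their leading FPC scores and running Newton iterations (iteratively reweighted least squares). The basis functions, truncation level and sample size below are hypothetical choices for the sketch, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 500, 40
t = np.linspace(0, 1, m)
w = 1.0 / m

# Simulated curves built from two hypothetical basis functions.
phi = np.vstack([np.sqrt(2) * np.sin(np.pi * t), np.sqrt(2) * np.cos(np.pi * t)])
xi = rng.normal(size=(n, 2)) * np.array([1.0, 0.6])
X = xi @ phi
Xc = X - X.mean(axis=0)

# Binary response generated through a logit link on the linear predictor eta.
beta_true = 1.5 * phi[0] - 1.0 * phi[1]
eta = 0.3 + Xc @ beta_true * w
Y = rng.binomial(1, 1 / (1 + np.exp(-eta)))

# Dimension reduction: regress on the leading FPC scores, then fit the
# logistic GLM by Newton iterations (iteratively reweighted least squares).
vals, vecs = np.linalg.eigh(Xc.T @ Xc / n * w)
phis = vecs[:, np.argsort(vals)[::-1][:2]].T / np.sqrt(w)
Z = np.column_stack([np.ones(n), Xc @ phis.T * w])

b = np.zeros(Z.shape[1])
for _ in range(25):
    p = 1 / (1 + np.exp(-Z @ b))    # conditional mean mu = g(eta)
    Wt = p * (1 - p)                # Bernoulli variance function V(mu)
    b += np.linalg.solve((Z.T * Wt) @ Z, Z.T @ (Y - p))

# Map the score coefficients back to a coefficient function on the grid.
beta_hat = b[1:] @ phis
print("intercept:", b[0])
```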

Classification of functional data
Functional clustering (discussed in the previous section) aims at finding clusters either (i) by minimizing an objective function [such as (8) in the functional clustering section above] or, more generally, (ii) by maximizing the conditional cluster membership probability $$P(c|X_i)$$ given the predictor $$X_i$$. Functional classification, on the other hand, assigns a group membership to a new data object on the basis of a discriminant function or classifier. Classification for functional data is based either on functional regression or on functional discriminant analysis. Most popular approaches for functional data classification are based on functional regression models that feature class labels as responses and the observed functional data and other covariates as predictors. This leads to regression-based functional data classification methods, such as functional generalized linear regression models and functional multiclass logit models. As with functional data clustering, most functional data classification methods apply a dimension reduction technique using a truncated expansion in a pre-specified function basis or in the data-adaptive eigenbasis.

Functional regression for classification
As already stated, functional data classification methods based on functional regression models use class labels as responses and the observed functional data and other covariates as predictors. For regression-based functional classification, functional generalized linear models, or more specifically functional binary regression such as functional logistic regression for a binary response, are commonly used approaches. For multiclass classification problems, a functional multiclass logit model is available, as follows.

Consider a random sample $$\{(Z_i,X_i): i=1,\ldots,n\}$$, where $$Z_i \in \{1,\ldots,L\}$$ denotes the class label of the $$i$$-th unit among the $$L$$ classes under consideration and $$X_i$$ is the associated functional observation for the same individual.

A classification model for an observation $$X_0$$ based on functional multiclass logit regression is thus given by,

$$\log \frac{\Pr(Z=k|X_0)}{\Pr(Z=L|X_0)} = \gamma_{0k} + \int_{\mathcal{T}} X_0(t) \gamma_{1k}(t)\,dt,$$

where $$k = 1,\ldots,L-1$$. Here $$\gamma_{0k}$$ is an intercept term and $$\gamma_{1k}(t)$$ denotes the coefficient function corresponding to the $$k$$-th class. The class label $$L$$ serves as the baseline category, so that

$$\Pr(Z=L|X_0) = 1 - \sum_{k=1}^{L-1} \Pr(Z=k|X_0).$$

This is thus a functional extension of the baseline odds model in multinomial regression.

Given a new observation $$X_0$$, the model-based Bayes classification rule chooses the class label with maximal posterior probability among $$\Pr(Z=k|X_0)$$, $$k = 1,\ldots,L$$. More generally, the generalized functional linear regression model based on the FPCA approach is used. When the logit link is used, it becomes the functional logistic regression model, several variants of which have been studied in the literature.
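A hedged sketch of this classification scheme: simulated curves from three classes are reduced to FPC scores, the baseline-category logit model is fitted by gradient ascent on the multinomial log-likelihood, and the Bayes rule picks the class with maximal estimated posterior probability. The mean functions, learning rate and truncation level are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, L = 300, 30, 3
t = np.linspace(0, 1, m)
w = 1.0 / m

# Toy data: three classes of curves distinguished by their mean functions.
means = np.vstack([np.sin(np.pi * t), np.cos(np.pi * t), np.zeros(m)])
Z_lab = rng.integers(0, L, size=n)
X = means[Z_lab] + rng.normal(scale=0.3, size=(n, m))

# Dimension reduction to the leading FPC scores (plus an intercept column).
Xc = X - X.mean(axis=0)
vals, vecs = np.linalg.eigh(Xc.T @ Xc / n * w)
phis = vecs[:, np.argsort(vals)[::-1][:3]].T / np.sqrt(w)
S = np.column_stack([np.ones(n), Xc @ phis.T * w])

# Baseline-category logit: column L-1 of B is pinned at 0 (class L baseline);
# fit by gradient ascent on the multinomial log-likelihood.
B = np.zeros((S.shape[1], L))
onehot = np.eye(L)[Z_lab]
for _ in range(2000):
    P = np.exp(S @ B)
    P /= P.sum(axis=1, keepdims=True)
    grad = S.T @ (onehot - P) / n
    grad[:, -1] = 0                 # keep the baseline column fixed
    B += 0.5 * grad

# Bayes rule: assign the class with maximal posterior probability.
P = np.exp(S @ B)
P /= P.sum(axis=1, keepdims=True)
pred = np.argmax(P, axis=1)
print("training accuracy:", (pred == Z_lab).mean())
```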

Functional discriminant analysis for classification
Functional linear discriminant analysis (FLDA) has also been considered as a classification method for functional data. Developments along these lines include a functional linear discriminant analysis approach to classifying curves, a functional data-analytic approach to signal discrimination using the FPCA method for dimension reduction, and kernel functional classification rules for nonparametric curve discrimination. Another important development in functional data classification involves the use of density ratios of projections, described below.

Note first that Bayes classifiers for functional data pose several challenges. A major difficulty is that probability density functions do not exist for functional data, so the classical Bayes classifier based on density quotients must be modified. The resulting classifiers can be viewed as an extension to functional data of some of the earliest nonparametric Bayes classifiers, which were based on simple density ratios in the one-dimensional case. By factorizing the density quotients, the curse of dimensionality that would otherwise severely affect Bayes classifiers for functional data can be avoided.

A study of the asymptotic behaviour of these classifiers in the large-sample limit shows that, under certain conditions, the misclassification rate converges to zero, a phenomenon referred to as "perfect classification". The classifiers also perform favourably in finite samples, as demonstrated by comparisons with other functional classifiers in simulations and in various data applications, including spectral data, functional magnetic resonance imaging data from attention deficit hyperactivity disorder patients, and yeast gene expression data. Theoretical support and a similar notion of "perfect classification", meaning asymptotically vanishing misclassification probabilities, have also been discussed separately for linear and quadratic functional classification in other works cited in the literature.
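The factorized density-ratio idea can be sketched on toy projection scores: when the components are independent, the joint density quotient is a product of one-dimensional ratios, each estimable by a univariate kernel density estimate. The two-class Gaussian scores below are simulated for illustration and are not drawn from any of the cited applications; the bandwidth is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400

# Toy projection scores for two classes: two components, Gaussian and
# independent, so the joint density quotient factorizes over components.
mu = {0: np.array([0.0, 0.0]), 1: np.array([1.0, -0.8])}
S0 = rng.normal(size=(n, 2)) + mu[0]   # training scores, class 0
S1 = rng.normal(size=(n, 2)) + mu[1]   # training scores, class 1

def kde(u, sample, h=0.3):
    """One-dimensional Gaussian kernel density estimate at u."""
    return np.exp(-0.5 * ((u - sample) / h) ** 2).mean() / (h * np.sqrt(2 * np.pi))

def classify(s):
    """Class 1 iff the product of per-component density ratios exceeds 1."""
    ratio = 1.0
    for k in range(2):
        ratio *= kde(s[k], S1[:, k]) / kde(s[k], S0[:, k])
    return int(ratio > 1.0)

test0 = rng.normal(size=(200, 2)) + mu[0]
test1 = rng.normal(size=(200, 2)) + mu[1]
acc = (np.mean([classify(s) == 0 for s in test0]) +
       np.mean([classify(s) == 1 for s in test1])) / 2
print("balanced accuracy:", acc)
```

Each factor involves only a one-dimensional density estimate, which is how the construction sidesteps the curse of dimensionality described above.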