User:Elise-Jonsson-uu/sandbox

= Sparse identification of nonlinear dynamics =

Sparse identification of nonlinear dynamics (SINDy) is a data-driven algorithm for obtaining dynamical systems from data. Given a series of snapshots of a dynamical system and their corresponding time derivatives, SINDy performs a sparsity-promoting regression (such as LASSO) of the time derivatives on a library of nonlinear candidate functions of the snapshots to find the governing equations. This procedure relies on the assumption that most physical systems have only a few dominant terms that dictate the dynamics, given an appropriately selected coordinate system and good-quality data.

== Mathematical Overview ==
First, consider a dynamical system of the form

$$\dot{\textbf{x}}=\frac{d}{dt}\textbf{x}(t)=\textbf{f}(\textbf{x}(t)),$$

where $$\textbf{x}(t)\in\mathbb{R}^n$$ is a state vector (snapshot) of the system at time $$t$$ and the function $$\textbf{f}(\textbf{x}(t))$$ defines the equations of motion and constraints of the system. The time derivative may be either prescribed or numerically approximated from the snapshots.

With $$\textbf{x}$$ and $$\dot{\textbf{x}}$$ sampled at $$m$$ equidistant points in time ($$t_1,t_2,\cdots,t_m$$), these can be arranged into matrices of the form

$$\bf{X}=\begin{bmatrix} \bf{x}^T(t_1) \\ \bf{x}^T(t_2) \\ \vdots \\ \bf{x}^T(t_m) \end{bmatrix} = \begin{bmatrix}x_1(t_1)&x_2(t_1)&\cdots&x_n(t_1)\\ x_1(t_2)&x_2(t_2)&\cdots&x_n(t_2)\\ \vdots&\vdots&\ddots&\vdots \\ x_1(t_m)&x_2(t_m)&\cdots&x_n(t_m) \end{bmatrix},$$

and similarly for $$\dot{\textbf{X}}$$.
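As a minimal sketch of this step, the snippet below stacks snapshots row-wise into $$\textbf{X}$$ and approximates $$\dot{\textbf{X}}$$ by finite differences for the case where the derivative is not prescribed. The helper name `snapshot_matrices` and the example trajectory are illustrative assumptions, and finite differencing is only one of several possible ways to estimate the derivative.

```python
import numpy as np

def snapshot_matrices(states, dt):
    """Return (X, X_dot) for snapshots sampled every dt time units.

    states: array-like of shape (m, n), one snapshot x(t_i)^T per row.
    """
    X = np.asarray(states, dtype=float)
    # Central differences in the interior, one-sided differences at both ends.
    X_dot = np.gradient(X, dt, axis=0)
    return X, X_dot

# Hypothetical example trajectory: x1(t) = sin(t), x2(t) = cos(t).
t = np.linspace(0.0, 2.0 * np.pi, 201)
dt = t[1] - t[0]
X, X_dot = snapshot_matrices(np.column_stack([np.sin(t), np.cos(t)]), dt)
```

Finite differences amplify measurement noise, so with noisy snapshots some form of smoothing before differentiation is usually worth considering.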

Next, a library $$\bf{\Theta}(\textbf{X})$$ of nonlinear candidate functions of the columns of $$\textbf{X}$$ is constructed. The candidates may be constant, polynomial, or more exotic functions (such as trigonometric or rational terms):

$$\ \ \ \bf{\Theta}(\bf{X})=\begin{bmatrix} \vline&\vline&\vline&\vline& &\vline&\vline& \\ 1&\bf{X}&\bf{X}^2&\bf{X}^3&\cdots & \sin(\bf{X})&\cos(\bf{X})&\cdots\\ \vline&\vline&\vline&\vline& &\vline&\vline& \end{bmatrix}$$
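A candidate library of this kind could be assembled along the following lines. This is a sketch under assumptions: the function name `candidate_library` is hypothetical, and the polynomial blocks are taken to include all cross terms (for example $$x_1x_2$$ in the quadratic block), which the notation $$\textbf{X}^2$$ above does not spell out.

```python
import numpy as np
from itertools import combinations_with_replacement

def candidate_library(X, degree=2, trig=False):
    """Return Theta(X) and a label for each of its columns."""
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    columns, names = [np.ones(m)], ["1"]  # constant term
    # Polynomial terms: every monomial of total degree 1..degree,
    # including cross terms such as x1*x2 (an assumption of this sketch).
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n), d):
            columns.append(np.prod(X[:, list(idx)], axis=1))
            names.append("*".join(f"x{i + 1}" for i in idx))
    # Optional trigonometric terms applied to each state variable.
    if trig:
        for fn in (np.sin, np.cos):
            for i in range(n):
                columns.append(fn(X[:, i]))
                names.append(f"{fn.__name__}(x{i + 1})")
    return np.column_stack(columns), names

X_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
Theta_demo, names_demo = candidate_library(X_demo, degree=2)
```

Each row of the returned matrix corresponds to one snapshot and each column to one candidate function, so the matrix has $$m$$ rows regardless of how many candidates are included.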

The number of possible model structures from this library is combinatorially large. $$\textbf{f}(\textbf{x}(t))$$ is then substituted by $$\bf{\Theta}(\textbf{X})$$ and a matrix of coefficients $$\bf{\Xi}=\left[\bf{\xi}_1 \bf{\xi}_2 \cdots \bf{\xi}_n \right]$$, whose column vectors $$\bf{\xi}_k$$ determine the active terms in the $$k$$-th component of $$\textbf{f}(\textbf{x}(t))$$:

$$\dot{\textbf{X}}=\bf{\Theta}(\textbf{X})\bf{\Xi}$$

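Because only a few terms are expected to be active at each point in time, it is assumed that $$\textbf{f}(\textbf{x}(t))$$ admits a sparse representation in $$\bf{\Theta}(\textbf{X})$$. This then becomes an optimization problem of finding a sparse $$\bf{\Xi}$$ that optimally embeds $$\dot{\textbf{X}}$$. In other words, a parsimonious model is obtained by performing least-squares regression on the system $$\dot{\textbf{X}}=\bf{\Theta}(\textbf{X})\bf{\Xi}$$ with sparsity-promoting ($$L_1$$) regularization:

$$\bf{\xi}_k=\underset{\bf{\xi}'_k}{\arg\min}\left\|\dot{\textbf{X}}_k-\bf{\Theta}(\textbf{X})\bf{\xi}'_k\right\|_2^2+\lambda\left\|\bf{\xi}'_k\right\|_1,$$

where $$\dot{\textbf{X}}_k$$ is the $$k$$-th column of $$\dot{\textbf{X}}$$ and $$\lambda$$ is a regularization parameter that controls the sparsity of the solution.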
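One way to carry out such an $$L_1$$-regularized least-squares fit is iterative soft-thresholding (ISTA), sketched below with NumPy only. The solver choice, the function names, the objective scaling (squared residual norm plus $$\lambda$$ times the $$L_1$$ norm), and the synthetic test system are all assumptions made for illustration. Other sparse regression methods could be substituted.

```python
import numpy as np

def soft_threshold(Z, tau):
    """Proximal operator of tau * ||.||_1, applied element-wise."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def lasso_ista(Theta, X_dot, lam, n_iter=2000):
    """Minimize ||X_dot - Theta @ Xi||_2^2 + lam * ||Xi||_1 column by column.

    All columns of Xi are updated at once because the problems decouple.
    """
    # Lipschitz constant of the gradient of the squared-residual term.
    lipschitz = 2.0 * np.linalg.norm(Theta, 2) ** 2
    step = 1.0 / lipschitz
    Xi = np.zeros((Theta.shape[1], X_dot.shape[1]))
    for _ in range(n_iter):
        grad = 2.0 * Theta.T @ (Theta @ Xi - X_dot)
        Xi = soft_threshold(Xi - step * grad, lam * step)
    return Xi

# Synthetic, noise-free example (hypothetical system, chosen for illustration):
#   x1_dot = -2 x1 + 1.5 x1 x2,   x2_dot = x2 - x1 x2
rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1.0, 1.0, size=(2, 200))
Theta = np.column_stack([np.ones(200), x1, x2, x1**2, x1 * x2, x2**2])
Xi_true = np.zeros((6, 2))
Xi_true[[1, 4], 0] = [-2.0, 1.5]
Xi_true[[2, 4], 1] = [1.0, -1.0]
X_dot = Theta @ Xi_true

Xi = lasso_ista(Theta, X_dot, lam=0.1)
```

The $$L_1$$ penalty shrinks the recovered coefficients slightly toward zero, so in practice the active terms are often refitted by ordinary least squares once the support has been identified.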

Finally, the sparse set of $$\bf{\xi}_k$$ can be used to reconstruct the dynamical system:

$$\dot{x}_k=\bf{\Theta}(\bf{x})\bf{\xi}_k$$
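To illustrate the reconstruction, the sketch below evaluates $$\bf{\Theta}(\bf{x})\bf{\Xi}$$ at a single state and integrates the resulting system with a classical fourth-order Runge-Kutta scheme. The three-term library, the coefficient matrix (a harmonic oscillator), and the function names are assumptions chosen so that the example is easy to check by hand.

```python
import numpy as np

def theta_row(x):
    """Candidate functions at one state x (assumed library: 1, x1, x2)."""
    return np.array([1.0, x[0], x[1]])

def make_rhs(Xi):
    """Build f(x) = Theta(x) @ Xi from a coefficient matrix Xi."""
    return lambda x: theta_row(x) @ Xi

def rk4_integrate(f, x0, dt, n_steps):
    """Advance x_dot = f(x) from x0 by n_steps steps of size dt (classical RK4)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x

# Assumed sparse coefficients: x1_dot = x2, x2_dot = -x1.
Xi = np.array([[0.0, 0.0],
               [0.0, -1.0],
               [1.0, 0.0]])
f = make_rhs(Xi)

# Integrate over one full period (2*pi) starting from (1, 0).
x_end = rk4_integrate(f, [1.0, 0.0], 2.0 * np.pi / 1000, 1000)
```

Comparing trajectories simulated this way against held-out measurements is a common check on whether the identified terms capture the dynamics.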