Dynamic mode decomposition

Dynamic mode decomposition (DMD) is a dimensionality reduction algorithm developed by Peter J. Schmid and Joern Sesterhenn in 2008. Given a time series of data, DMD computes a set of modes each of which is associated with a fixed oscillation frequency and decay/growth rate. For linear systems in particular, these modes and frequencies are analogous to the normal modes of the system, but more generally, they are approximations of the modes and eigenvalues of the composition operator (also called the Koopman operator). Due to the intrinsic temporal behaviors associated with each mode, DMD differs from dimensionality reduction methods such as principal component analysis, which computes orthogonal modes that lack predetermined temporal behaviors. Because its modes are not orthogonal, DMD-based representations can be less parsimonious than those generated by PCA. However, they can also be more physically meaningful because each mode is associated with a damped (or driven) sinusoidal behavior in time.

Overview
Dynamic mode decomposition was first introduced by Schmid as a numerical procedure for extracting dynamical features from flow data.

The data takes the form of a snapshot sequence
 * $$ V_1^N = \{v_1, v_2, \dots, v_N\}, $$

where $$ v_i\in \mathbb{R}^M$$ is the $$i$$-th snapshot of the flow field, and $$V_1^N\in\mathbb{R}^{M\times N}$$ is a data matrix whose columns are the individual snapshots. These snapshots are assumed to be related via a linear mapping that defines a linear dynamical system
 * $$ v_{i+1} = A v_i, $$

that remains approximately the same over the duration of the sampling period. Written in matrix form, this implies that
 * $$ V_{2}^N = A V_{1}^{N-1} + re_{N-1}^T,$$

where $$r$$ is the vector of residuals that accounts for behaviors that cannot be described completely by $$A$$, $$e_{N-1}=\{0,0,\ldots,1\}\in\mathbb{R}^{N-1}$$, $$V_1^{N-1}=\{v_1, v_2, \dots, v_{N-1}\}$$, and $$V_2^{N}=\{v_2, v_3, \dots, v_{N}\}$$. Regardless of the approach, the output of DMD is the eigenvalues and eigenvectors of $$A$$, which are referred to as the DMD eigenvalues and DMD modes respectively.

Algorithm
There are two methods for obtaining these eigenvalues and modes. The first is Arnoldi-like, which is useful for theoretical analysis due to its connection with Krylov methods. The second is a singular value decomposition (SVD) based approach that is more robust to noise in the data and to numerical errors.

The Arnoldi approach
In fluids applications, the size of a snapshot, $$M$$, is assumed to be much larger than the number of snapshots $$N$$, so there are many equally valid choices of $$A$$. The original DMD algorithm picks $$A$$ so that each of the snapshots in $$V_2^N$$ can be expressed as linear combinations of the snapshots in $$V_1^{N-1}$$. Because most of the snapshots appear in both data sets, this representation is error free for all snapshots except $$v_N$$, which is written as
 * $$ v_N = a_1 v_1 + a_2 v_2 + \dots + a_{N-1}v_{N-1} + r = V_1^{N-1}a + r,$$

where $$a={a_1, a_2, \dots, a_{N-1}}$$ is a set of coefficients DMD must identify and $$r$$ is the residual. In total,
 * $$ V_{2}^N = A V_1^{N-1} + re_{N-1}^T = V_1^{N-1} S + re_{N-1}^T, $$

where $$S$$ is the companion matrix
 * $$S=\begin{pmatrix}

0 & 0 & \dots & 0 & a_1 \\ 1 & 0 & \dots & 0 & a_2 \\ 0 & 1 & \dots & 0 & a_3 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & 1 & a_{N-1} \end{pmatrix}.$$ The vector $$a$$ can be computed by solving a least squares problem, which minimizes the overall residual. In particular if we take the QR decomposition of $$V_1^{N-1} = QR$$, then $$a = R^{-1}Q^Tv_N$$.

In this form, DMD is a type of Arnoldi method, and therefore the eigenvalues of $$S$$ are approximations of the eigenvalues of $$A$$. Furthermore, if $$y$$ is an eigenvector of $$S$$, then $$V_1^{N-1}y$$ is an approximate eigenvector of $$A$$. The reason an eigendecomposition is performed on $$S$$ rather than $$A$$ is because $$S$$ is much smaller than $$A$$, so the computational cost of DMD is determined by the number of snapshots rather than the size of a snapshot.

The SVD-based approach
Instead of computing the companion matrix $$ S $$, the SVD-based approach yields the matrix $$\tilde S$$ that is related to $$A$$ via a similarity transform. To do this, assume we have the SVD of $$V_1^{N-1} = U\Sigma W^T$$. Then
 * $$ V_{2}^N = A V_1^{N-1} + re_{N-1}^T = AU\Sigma W^T + re_{N-1}^T.$$

Equivalent to the assumption made by the Arnoldi-based approach, we choose $$A$$ such that the snapshots in $$V_2^N$$ can be written as the linear superposition of the columns in $$U$$, which is equivalent to requiring that they can be written as the superposition of POD modes. With this restriction, minimizing the residual requires that it is orthogonal to the POD basis (i.e., $$U^Tr = 0$$). Then multiplying both sides of the equation above by $$U^T$$ yields $$ U^TV_2^N = U^T A U\Sigma W^T $$, which can be manipulated to obtain
 * $$ U^T A U = U^TV_2^N W \Sigma^{-1} \equiv \tilde S. $$

Because $$A$$ and $$\tilde S$$ are related via similarity transform, the eigenvalues of $$S$$ are the eigenvalues of $$A$$, and if $$y$$ is an eigenvector of $$\tilde S$$, then $$Uy$$ is an eigenvector of $$A$$.

In summary, the SVD-based approach is as follows:
 * 1) Split the time series of data in  $$V_1^N$$ into the two matrices $$V_1^{N-1}$$ and $$V_2^N$$.
 * 2) Compute the SVD of $$V_1^{N-1} = U\Sigma W^T$$.
 * 3) Form the matrix $$\tilde S =  U^TV_2^N W \Sigma^{-1}$$, and compute its eigenvalues $$\lambda_i$$ and eigenvectors $$y_i$$.
 * 4) The $$i$$-th DMD eigenvalue is $$\lambda_i$$ and $$i$$-th DMD mode is the $$Uy_i$$.

The advantage of the SVD-based approach over the Arnoldi-like approach is that noise in the data and numerical truncation issues can be compensated for by truncating the SVD of $$V_1^{N-1}$$. As noted in accurately computing more than the first couple modes and eigenvalues can be difficult on experimental data sets without this truncation step.

Theoretical and algorithmic advancements
Since its inception in 2010, a considerable amount of work has focused on understanding and improving DMD. One of the first analyses of DMD by Rowley et al. established the connection between DMD and the Koopman operator, and helped to explain the output of DMD when applied to nonlinear systems. Since then, a number of modifications have been developed that either strengthen this connection further or enhance the robustness and applicability of the approach.


 * Optimized DMD: Optimized DMD is a modification of the original DMD algorithm designed to compensate for two limitations of that approach: (i) the difficulty of DMD mode selection, and (ii) the sensitivity of DMD to noise or other errors in the last snapshot of the time series. Optimized DMD recasts the DMD procedure as an optimization problem where the identified linear operator has a fixed rank.  Furthermore, unlike DMD which perfectly reproduces all of the snapshots except for the last, Optimized DMD allows the reconstruction errors to be distributed throughout the data set, which appears to make the approach more robust in practice.
 * Optimal Mode Decomposition: Optimal Mode Decomposition (OMD) recasts the DMD procedure as an optimization problem and allows the user to directly impose the rank of the identified system. Provided this rank is chosen properly, OMD can produce linear models with smaller residual errors and more accurate eigenvalues on both synthetic and experimental data sets.
 * Exact DMD: The Exact DMD algorithm generalizes the original DMD algorithm in two ways. First, in the original DMD algorithm the data must be a time series of snapshots, but Exact DMD accepts a data set of snapshot pairs.  The snapshots in the pair must be separated by a fixed $$\Delta t$$, but do not need to be drawn from a single time series.  In particular, Exact DMD can allow data from multiple experiments to be aggregated into a single data set.  Second, the original DMD algorithm effectively pre-processes the data by projecting onto a set of POD modes.  The Exact DMD algorithm removes this pre-processing step, and can produce DMD modes that cannot be written as the superposition of POD modes.
 * Sparsity Promoting DMD: Sparsity promoting DMD is a post processing procedure for DMD mode and eigenvalue selection. Sparsity promoting DMD uses an $\ell_1$ penalty to identify a smaller set of important DMD modes, and is an alternative approach to the DMD mode selection problem that can be solved efficiently using convex optimization techniques.
 * Multi-Resolution DMD: Multi-Resolution DMD (mrDMD) is a combination of the techniques used in multiresolution analysis with Exact DMD designed to robust extracting DMD modes and eigenvalues from data sets containing multiple timescales. The mrDMD approach was applied to global surface temperature data, and identifies a DMD mode that appears during  El Nino years.
 * Extended DMD: Extended DMD is a modification of Exact DMD that strengthens the connection between DMD and the Koopman operator. As the name implies, Extended DMD is an extension of DMD that uses a richer set of observable functions to produce more accurate approximations of the Koopman operator. This extended set could be chosen a priori or learned from data.  It also demonstrated the DMD and related methods produce approximations of the Koopman eigenfunctions in addition to the more commonly used eigenvalues and modes.
 * Residual DMD: Residual DMD provides a means to control the projection errors of DMD and Extended DMD that arise from finite-dimensional approximations of the Koopman operator. The method utilizes the same snapshot data but introduces an additional finite matrix that captures infinite-dimensional residuals exactly in the large data limit. This enables users to sidestep spectral pollution (spurious modes), verify Koopman mode decompositions and learned dictionaries, and compute continuous spectra. Moreover, the method further bolsters the link between DMD and the Koopman operator by demonstrating how the spectral content of the latter can be computed with verification and error control.
 * Physics-informed DMD: Physics-informed DMD forms a Procrustes problem that restricts the family of admissible models to a matrix manifold that respects the physical structure of the system. This allows physical structures to be incoorporated into DMD. This approach is less prone to overfitting, requires less training data, and is often less computationally expensive to build than standard DMD models.
 * Measure-preserving EDMD: Measure-preserving extended DMD (mpEDMD) offers a Galerkin method whose eigendecomposition converges to the spectral quantities of the Koopman operators for general measure-preserving dynamical systems. This method employs an orthogonal Procrustes problem (essentially a polar decomposition) to DMD and extended DMD. Beyond convergence, mpEDMD upholds physical conservation laws, and exhibits enhanced robustness to noise as well as improved long-term behavior.
 * DMD with Control: Dynamic mode decomposition with control (DMDc) is a modification of the DMD procedure designed for data obtained from input output systems. One unique feature of DMDc is the ability to disambiguate the effects of system actuation from the open loop dynamics, which is useful when data are obtained in the presence of actuation.
 * Total Least Squares DMD: Total Least Squares DMD is a recent modification of Exact DMD meant to address issues of robustness to measurement noise in the data. In, the authors interpret the Exact DMD as a regression problem that is solved using ordinary least squares (OLS), which assumes that the regressors are noise free.  This assumption creates a bias in the DMD eigenvalues when it is applied to experimental data sets where all of the observations are noisy.  Total least squares DMD replaces the OLS problem with a total least squares problem, which eliminates this bias.
 * Dynamic Distribution Decomposition: DDD focuses on the forward problem in continuous time, i.e., the transfer operator. However the method developed can also be used for fitting DMD problems in continuous time.

In addition to the algorithms listed here, similar application-specific techniques have been developed. For example, like DMD, Prony's method represents a signal as the superposition of damped sinusoids. In climate science, linear inverse modeling is also strongly connected with DMD. For a more comprehensive list, see Tu et al.

Trailing edge of a profile


The wake of an obstacle in the flow may develop a Kármán vortex street. The Fig.1 shows the shedding of a vortex behind the trailing edge of a profile. The DMD-analysis was applied to 90 sequential Entropy fields [[Media:Joukowsky Karman Vortex Street video.gif|(animated gif (1.9MB))]] and yield an approximated eigenvalue-spectrum as depicted below. The analysis was applied to the numerical results, without referring to the governing equations. The profile is seen in white. The white arcs are the processor boundaries since the computation was performed on a parallel computer using different computational blocks. Roughly a third of the spectrum was highly damped (large, negative $$\lambda_r$$) and is not shown. The dominant shedding mode is shown in the following pictures. The image to the left is the real part, the image to the right, the imaginary part of the eigenvector.

Again, the entropy-eigenvector is shown in this picture. The acoustic contents of the same mode is seen in the bottom half of the next plot. The top half corresponds to the entropy mode as above.

Synthetic example of a traveling pattern
The DMD analysis assumes a pattern of the form $$ q(x_1,x_2,x_3, \ldots)=e^ {c x_1 }\hat q(x_2,x_3,\ldots) $$ where $$ x_1 $$ is any of the independent variables of the problem, but has to be selected in advance. Take for example the pattern

q(x,y,t)=e^{-i \omega t} \hat q (x,t) e^{-(y/b)^2} \Re \left\{ e^{i (k x - \omega t)}    \right\} + \text{random noise} $$ With the time as the preselected exponential factor.

A sample is given in the following figure with $$ \omega = 2\pi /0.1 $$, $$ b=0.02 $$ and $$ k = 2\pi/ b $$. The left picture shows the pattern without, the right with noise added. The amplitude of the random noise is the same as that of the pattern.



A DMD analysis is performed with 21 synthetically generated fields using a time interval $$ \Delta t =1/90\text{ s}$$, limiting the analysis to $$ f =45\text{ Hz}$$.

The spectrum is symmetric and shows three almost undamped modes (small negative real part), whereas the other modes are heavily damped. Their numerical values are $$ \omega_1=-0.201, \omega_{2/3}=-0.223 \pm i 62.768$$ respectively. The real one corresponds to the mean of the field, whereas $$ \omega_{2/3}$$ corresponds to the imposed pattern with $$ f = 10\text{ Hz} $$. Yielding a relative error of −1/1000. Increasing the noise to 10 times the signal value yields about the same error. The real and imaginary part of one of the latter two eigenmodes is depicted in the following figure.