User:Bydottck13/sandbox

The sparse Fourier transform (SFT) is a kind of discrete Fourier transform (DFT) for handling big data signals. Specifically, it is used in GPS synchronization, spectrum sensing, and analog-to-digital converters.

The fast Fourier transform (FFT) plays an indispensable role in many scientific domains, especially in signal processing. However, with the advent of the big data era, the FFT still needs to be improved in order to save computing power. Recently, the sparse Fourier transform (SFT) has gained a considerable amount of attention, because it performs well when analyzing long sequences of data that contain only a few signal components.

Definition
Let $$x_n$$, $$n=0,\dots,N-1$$, be a sequence of complex numbers. By the inverse discrete Fourier transform, $$x_n$$ can be written as

$$x_n=(F^*X)_n=\sum_{k=0}^{N-1}X_k e^{j\frac{2\pi}{N}kn}.$$

Similarly, $$X_k$$ can be represented as

$$X_k=\frac{1}{N}(Fx)_k=\frac{1}{N}\sum_{n=0}^{N-1}x_n e^{-j\frac{2\pi}{N}kn}.$$

Hence, from the equations above, the mapping is $$F:C^N\to C^N$$.
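The transform pair above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `inverse_dft` and `forward_dft` and the test spectrum are illustrative assumptions, and the direct O(N²) sums are used for clarity rather than speed.

```python
import cmath

def inverse_dft(X):
    """x_n = sum_k X_k e^{+j 2 pi k n / N} (no scaling factor)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]

def forward_dft(x):
    """X_k = (1/N) sum_n x_n e^{-j 2 pi k n / N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

# A sparse spectrum (hypothetical values): two of eight coefficients are non-zero.
X = [0, 0, 3 + 1j, 0, 0, -2j, 0, 0]
x = inverse_dft(X)
X_back = forward_dft(x)

# The round trip should reproduce X up to floating-point error.
round_trip_error = max(abs(a - b) for a, b in zip(X, X_back))
```

Note that the factor 1/N sits on the forward transform here, following the convention of the equations above.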

Single Frequency Recovery
Now, we assume that only a single frequency exists in the sequence. To recover this frequency from the sequence, it is natural to use the relationship between adjacent points of the sequence.

Phase Encoding
The frequency k can be obtained from the phase of the ratio of adjacent points of the sequence. In other words,

$$\frac{x_{n+1}}{x_n}=e^{j\frac{2\pi}{N}k}=\cos\left(\frac{2\pi k}{N}\right)+j\cdot \sin\left(\frac{2\pi k}{N}\right).$$

Notice that $$x \in C^N$$.
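As an illustration, the ratio of two adjacent samples can be turned back into a frequency index. This is a minimal sketch assuming a noiseless signal with exactly one tone; the function name `recover_single_frequency` and the test values are hypothetical.

```python
import cmath

def recover_single_frequency(x0, x1, N):
    """Recover k from two adjacent samples of x_n = a * e^{j 2 pi k n / N}."""
    phase = cmath.phase(x1 / x0)            # angle of the ratio, in (-pi, pi]
    # Scale the angle to an index and wrap negative values into 0..N-1.
    return round(phase * N / (2 * cmath.pi)) % N

N = 1024
a = 2.5 - 1j                                 # unknown amplitude; it cancels in the ratio

def two_samples(k):
    """First two samples of a single tone at frequency k."""
    return [a * cmath.exp(2j * cmath.pi * k * n / N) for n in range(2)]

k_low = recover_single_frequency(*two_samples(377), N)
k_high = recover_single_frequency(*two_samples(900), N)   # phase wraps to a negative angle
```

With noise, a single ratio becomes unreliable for large N, which is one motivation for the aliasing-based search.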

An Aliasing-based Search
The frequency k can be found using the Chinese remainder theorem (CRT).

Take $$k=104,134$$ as an example. Suppose we have three pairwise relatively prime integers 100, 101 and 103. Then the residues of k are

$$k=104,134\equiv 34 \pmod{100} \equiv 3 \pmod{101}\equiv 1 \pmod{103}.$$

By the CRT, these residues determine k uniquely modulo the product of the three integers:

$$k=104,134 \bmod (100\cdot101\cdot103)=104,134 \bmod 1,040,300.$$
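The reconstruction in this example can be reproduced with a short CRT routine. This is a sketch of the standard textbook construction, not of any particular SFT implementation; the function name `crt` is an assumption, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
from math import prod

def crt(residues, moduli):
    """Combine k = r_i (mod m_i), for pairwise coprime m_i, into k mod prod(m_i)."""
    M = prod(moduli)
    k = 0
    for r, m in zip(residues, moduli):
        M_i = M // m                       # product of the other moduli
        k += r * M_i * pow(M_i, -1, m)     # pow(..., -1, m) is the inverse of M_i modulo m
    return k % M

# Residues from the example above.
k = crt([34, 3, 1], [100, 101, 103])
print(k)  # 104134
```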

Randomly Binning Frequencies
Now, we consider the case of multiple frequencies instead of a single frequency. Adjacent frequencies can be separated using the scaling (c) and modulation (b) properties of the Fourier transform. Namely, by randomly choosing the parameters c and b, the distribution of all frequencies becomes almost uniform. The figure Spread all frequencies shows that, by randomly binning frequencies, we can use single frequency recovery to find the main components.



$$x_n'=X_k e^{j\frac{2\pi}{N}(c\cdot k+b)},$$

where c is the scaling parameter and b is the modulation parameter.

By randomly choosing c and b, the whole spectrum can be made to look like a uniform distribution. Passing the result through filter banks then separates all frequencies; suitable filters include Gaussians, indicator functions, spike trains, and Dolph-Chebyshev filters. Each bank then contains only a single frequency.
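The effect of scaling and modulation on the spectrum can be illustrated on a small example. The sketch below assumes one common time-domain realization, y_n = x_{(c·n) mod N} · e^{j2πbn/N} with c coprime to N, which moves frequency k to (c·k + b) mod N; the function names and the toy signal are hypothetical, and no filter bank is applied.

```python
import cmath
import random
from math import gcd

def forward_dft(x):
    """X_k = (1/N) sum_n x_n e^{-j 2 pi k n / N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

def permute_spectrum(x, c, b):
    """Time scaling by c and modulation by b: frequency k moves to (c*k + b) mod N."""
    N = len(x)
    return [x[(c * n) % N] * cmath.exp(2j * cmath.pi * b * n / N) for n in range(N)]

N = 16
support = [4, 5, 6]                      # three adjacent frequencies (toy example)
x = [sum(cmath.exp(2j * cmath.pi * k * n / N) for k in support) for n in range(N)]

rng = random.Random(0)                   # fixed seed so the run is repeatable
c = rng.choice([m for m in range(1, N) if gcd(m, N) == 1])   # c must be invertible mod N
b = rng.randrange(N)

Y = forward_dft(permute_spectrum(x, c, b))
new_support = sorted(k for k in range(N) if abs(Y[k]) > 1e-6)
expected = sorted((c * k + b) % N for k in support)
```

Because c is invertible modulo N, the map from k to (c·k + b) mod N is a permutation, so no two frequencies collide before filtering.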

The Prototypical SFT
Generally, all SFT algorithms follow three stages:

Identifying Frequencies
By randomly binning frequencies, all components can be separated. Passing them through filter banks then ensures that each band contains only a single frequency, so the single frequency recovery methods described above can be used to recover it.

Estimating Coefficients
After identifying the frequencies, we have a set of frequency components. We can use the Fourier transform to estimate their coefficients.

$$X_k'=\frac{1}{L}\sum_{l=1}^{L}x_n'e^{-j\frac{2\pi}{N}n'l}$$
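One way to read this step is as an average over L samples of the signal after demodulating by the identified frequency. The sketch below assumes that reading, a band that contains a single tone at a known frequency k, and a small bounded perturbation standing in for noise; the function name `estimate_coefficient` and all numerical values are hypothetical.

```python
import cmath
import random

def estimate_coefficient(samples, indices, k, N):
    """Average x_n * e^{-j 2 pi k n / N} over the sampled positions n."""
    L = len(indices)
    return sum(x_n * cmath.exp(-2j * cmath.pi * k * n / N)
               for x_n, n in zip(samples, indices)) / L

N, k, X_k = 4096, 1234, 0.8 - 0.6j       # identified frequency and its true coefficient
rng = random.Random(1)                   # fixed seed so the run is repeatable
indices = [rng.randrange(N) for _ in range(32)]   # L = 32 sample positions

# Single tone plus a perturbation of at most 0.01 per real and imaginary part.
samples = [X_k * cmath.exp(2j * cmath.pi * k * n / N)
           + complex(rng.uniform(-0.01, 0.01), rng.uniform(-0.01, 0.01))
           for n in indices]

X_est = estimate_coefficient(samples, indices, k, N)
error = abs(X_est - X_k)                 # bounded by the perturbation size
```

Only L samples are read, rather than all N, which is where the saving over a full transform comes from.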

Repeating
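Finally, by repeating these two stages, we can extract the most important components from the original signal. After each round, the recovered components are subtracted from the signal, leaving the residual

$$x_n-\sum_{k'=1}^{k}X_k' e^{j\frac{2\pi}{N}k'n}$$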
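The repeat-and-subtract loop can be sketched on a toy signal. In this sketch, the identification stage is replaced by a brute-force full DFT that picks the largest coefficient, which is exactly the cost a real SFT avoids; only the subtraction of recovered components is illustrated. The function name `peel` and the test components are hypothetical.

```python
import cmath

def forward_dft(x):
    """X_k = (1/N) sum_n x_n e^{-j 2 pi k n / N}."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]

def peel(x, rounds):
    """Each round: find the strongest component of the residual and subtract it."""
    N = len(x)
    residual = list(x)
    found = {}
    for _ in range(rounds):
        X = forward_dft(residual)                        # stand-in for identification
        k = max(range(N), key=lambda i: abs(X[i]))       # strongest remaining frequency
        found[k] = found.get(k, 0) + X[k]
        residual = [residual[n] - X[k] * cmath.exp(2j * cmath.pi * k * n / N)
                    for n in range(N)]
    return found, residual

N = 32
components = {3: 2.0, 10: -1j, 27: 0.5 + 0.5j}           # a 3-sparse spectrum
x = [sum(c * cmath.exp(2j * cmath.pi * k * n / N) for k, c in components.items())
     for n in range(N)]

found, residual = peel(x, rounds=3)
residual_energy = sum(abs(r) ** 2 for r in residual)     # close to zero after 3 rounds
```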

Implementations
Several implementations from MIT and ETH are freely available online.
 * ETH implementations
 * MIT implementations
 * GitHub