Time-series segmentation

Time-series segmentation is a method of time-series analysis in which an input time-series is divided into a sequence of discrete segments in order to reveal the underlying properties of its source. A typical application of time-series segmentation is in speaker diarization, in which an audio signal is partitioned into several pieces according to who is speaking at what times. Algorithms based on change-point detection include sliding windows, bottom-up, and top-down methods. Probabilistic methods based on hidden Markov models have also proved useful in solving this problem.

Overview of the segmentation problem
It is often the case that a time-series can be represented as a sequence of discrete segments of finite length. For example, the trajectory of a stock market could be partitioned into regions that lie in between important world events, the input to a handwriting recognition application could be segmented into the various words or letters that it was believed to consist of, or the audio recording of a conference could be divided according to who was speaking when. In the latter two cases, one may take advantage of the fact that the label assignments of individual segments may repeat themselves (for example, if a person speaks at several separate occasions during a conference) by attempting to cluster the segments according to their distinguishing properties (such as the spectral content of each speaker's voice). There are two general approaches to this problem. The first involves looking for change points in the time-series: for example, one may assign a segment boundary whenever there is a large jump in the average value of the signal. The second approach involves assuming that each segment in the time-series is generated by a system with distinct parameters, and then inferring the most probable segment locations and the system parameters that describe them. While the first approach tends to only look for changes in a short window of time, the second approach generally takes into account the entire time-series when deciding which label to assign to a given point.

Hidden Markov Models
Under the hidden Markov model, the time-series $$\boldsymbol{y}_{1:T} = (\boldsymbol{y}_1, ..., \boldsymbol{y}_T)$$ is assumed to have been generated as the system transitions among a set of discrete, hidden states $$z \in \{1, 2, ..., n\}$$. At each time $$t$$, a sample $$\boldsymbol{y}_t$$ is drawn from an observation (or emission) distribution indexed by the current hidden state, i.e., $$\boldsymbol{y}_t \sim P_{z_t}(\boldsymbol{y}_t)$$. The goal of the segmentation problem is to infer the hidden state at each time, as well as the parameters describing the emission distribution associated with each hidden state. Hidden state sequence and emission distribution parameters can be learned using the Baum-Welch algorithm, which is a variant of expectation maximization applied to HMMs. Typically in the segmentation problem self-transition probabilities among states are assumed to be high, such that the system remains in each state for nonnegligible time. More robust parameter-learning methods involve placing hierarchical Dirichlet process priors over the HMM transition matrix.