Dynamic topic model

Within statistics, dynamic topic models are generative models that can be used to analyze how the (unobserved) topics of a collection of documents evolve over time. This family of models was proposed by David Blei and John Lafferty and extends Latent Dirichlet Allocation (LDA) to handle sequentially ordered documents.

LDA ignores both the order in which words appear in a document and the order in which documents appear in the corpus. In a dynamic topic model, words are still assumed to be exchangeable, but the order of the documents plays a fundamental role. More precisely, the documents are grouped by time slice (e.g., by year), and the documents of each slice are assumed to come from a set of topics that evolved from the topics of the previous slice.

Topics
Similarly to LDA and pLSA, in a dynamic topic model, each document is viewed as a mixture of unobserved topics. Furthermore, each topic defines a multinomial distribution over a set of terms. Thus, for each word of each document, a topic is drawn from the mixture and a term is subsequently drawn from the multinomial distribution corresponding to that topic.
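The two-stage draw described above (first a topic from the document's mixture, then a term from that topic) can be sketched as follows. This is a minimal illustration for a single document at a single point in time; the vocabulary, the topic-term probabilities, and the mixture weights are hypothetical values chosen for the example, not taken from any fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary and topics (each row is a distribution over terms).
vocab = ["network", "Zipf", "percolation", "gene", "protein"]
topics = np.array([
    [0.5, 0.3, 0.2, 0.0, 0.0],   # topic 0: only the first three terms
    [0.0, 0.0, 0.0, 0.6, 0.4],   # topic 1: only the last two terms
])
doc_mixture = np.array([0.7, 0.3])  # hypothetical topic proportions of one document


def draw_document(n_words):
    """Return n_words (topic, term) pairs drawn from the mixture."""
    words = []
    for _ in range(n_words):
        z = rng.choice(len(doc_mixture), p=doc_mixture)  # pick a topic
        w = rng.choice(len(vocab), p=topics[z])          # pick a term from that topic
        words.append((int(z), vocab[w]))
    return words


doc = draw_document(8)
print(doc)
```

In a real corpus only the terms are observed; the topic index attached to each word here is exactly the latent quantity that inference has to recover.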

The topics, however, evolve over time. For instance, the two most likely terms of a topic at time $t$ could be "network" and "Zipf" (in descending order) while the most likely ones at time $t+1$ could be "Zipf" and "percolation" (in descending order).

Model
Define
 * $$\alpha_t$$ as the mean of the per-document topic distribution at time t (in natural parameters),
 * $$\beta_{t,k}$$ as the word distribution of topic k at time t (in natural parameters),
 * $$\eta_{t,d}$$ as the topic distribution for document d at time t (in natural parameters),
 * $$Z_{t,d,n}$$ as the topic for the nth word in document d at time t, and
 * $$W_{t,d,n}$$ as the nth word itself.

In this model, the multinomial distributions $$\alpha_{t+1}$$ and $$\beta_{t+1,k}$$ are generated from $$\alpha_t$$ and $$\beta_{t,k}$$, respectively. Although multinomial distributions are usually written in terms of their mean parameters, the natural parameterization is more convenient in the context of dynamic topic models.

The mean parameterization has a disadvantage: its parameters are constrained to be non-negative and to sum to one, so any rule for evolving these distributions over time would have to ensure that the constraints remain satisfied. Since both distributions are in the exponential family, one solution is to represent them in terms of their natural parameters, which can take any real value and can be changed individually.
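A small numerical sketch of this point: adding Gaussian noise directly to mean parameters generally produces a vector that is no longer a probability distribution, whereas adding the same noise to natural parameters and mapping back always yields a valid one. The example assumes the softmax mapping that the article later denotes by $$\pi$$, and uses `log(mean_params)` as one convenient choice of natural parameters (they are only determined up to an additive constant). The numbers are hypothetical.

```python
import numpy as np


def softmax(x):
    """Map natural parameters (any reals) to mean parameters on the simplex."""
    e = np.exp(x - np.max(x))  # subtracting the max avoids overflow
    return e / e.sum()


rng = np.random.default_rng(1)
mean_params = np.array([0.7, 0.2, 0.1])   # hypothetical distribution over 3 terms
noise = rng.normal(0.0, 0.5, size=3)      # hypothetical Gaussian perturbation

# Perturbing the mean parameters directly: entries may become negative
# and the sum is generally no longer one.
perturbed_mean = mean_params + noise

# Perturbing the natural parameters and mapping back: always a valid distribution.
natural_params = np.log(mean_params)
perturbed_natural = softmax(natural_params + noise)

print("direct perturbation:", perturbed_mean, "sum =", perturbed_mean.sum())
print("via natural params: ", perturbed_natural, "sum =", perturbed_natural.sum())
```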

Using the natural parameterization, the dynamics of the topic model are given by
 * $$\beta_{t,k}|\beta_{t-1,k} \sim N(\beta_{t-1,k},\sigma^2 I)$$

and
 * $$\alpha_{t}|\alpha_{t-1} \sim N(\alpha_{t-1},\delta^2 I)$$.
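As an illustration of these dynamics, the sketch below simulates the Gaussian random walk of a single topic $$\beta_{t,k}$$ across a few time slices and maps each state to term probabilities with a softmax. The vocabulary size, number of slices, step size `sigma`, and the standard-normal initial state are all assumptions made for the example; the text above does not specify how the first slice is initialized.

```python
import numpy as np


def softmax(x):
    """Map natural parameters to a probability vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


rng = np.random.default_rng(2)
V, T = 5, 10     # hypothetical vocabulary size and number of time slices
sigma = 0.3      # hypothetical standard deviation of each step

beta = np.zeros((T, V))
beta[0] = rng.normal(0.0, 1.0, size=V)   # assumed initial state
for t in range(1, T):
    # beta_t | beta_{t-1} ~ N(beta_{t-1}, sigma^2 I)
    beta[t] = rng.normal(loc=beta[t - 1], scale=sigma)

# Term probabilities of the topic in every slice (each row sums to one).
term_probs = np.apply_along_axis(softmax, 1, beta)

# The ranking of terms can drift gradually from slice to slice.
for t in (0, T - 1):
    print(f"t={t}: terms ranked by probability -> {np.argsort(-term_probs[t])}")
```

A smaller `sigma` makes consecutive slices more similar, so the step size controls how quickly a topic is allowed to change.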

The generative process at time slice t is therefore:
 * 1) Draw topics $$\beta_{t,k}|\beta_{t-1,k} \sim N(\beta_{t-1,k},\sigma^2 I)$$ for all k
 * 2) Draw mixture model $$\alpha_{t}|\alpha_{t-1} \sim N(\alpha_{t-1},\delta^2 I)$$
 * 3) For each document:
   * 4) Draw $$\eta_{t,d} \sim N(\alpha_t,a^2 I)$$
   * 5) For each word:
     * 6) Draw topic $$Z_{t,d,n} \sim \textrm{Mult}(\pi(\eta_{t,d}))$$
     * 7) Draw word $$W_{t,d,n} \sim \textrm{Mult}(\pi(\beta_{t,Z_{t,d,n}}))$$

where $$\pi(x)$$ is a mapping from the natural parameterization x to the mean parameterization, whose i-th component is
 * $$\pi(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$.
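The numbered steps above can be put together in a short simulation. The following is a sketch under stated assumptions rather than a reference implementation: the function name `generate_corpus`, its parameters, the default variances, and the standard-normal initialization of the first slice are all hypothetical, and every document is given the same fixed length for simplicity.

```python
import numpy as np


def softmax(x):
    """The mapping pi: natural parameters -> mean parameters."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def generate_corpus(T, K, V, docs_per_slice, words_per_doc,
                    sigma=0.1, delta=0.1, a=0.5, seed=0):
    """Sample word indices for T time slices, K topics, and V terms."""
    rng = np.random.default_rng(seed)
    # Assumed initial states for the first slice (not specified in the text).
    beta = rng.normal(size=(K, V))
    alpha = rng.normal(size=K)
    corpus = []
    for t in range(T):
        if t > 0:
            beta = rng.normal(beta, sigma)     # step 1: evolve every topic
            alpha = rng.normal(alpha, delta)   # step 2: evolve the mixture mean
        slice_docs = []
        for _ in range(docs_per_slice):        # step 3
            eta = rng.normal(alpha, a)         # step 4: document-level proportions
            theta = softmax(eta)
            doc = []
            for _ in range(words_per_doc):     # step 5
                z = rng.choice(K, p=theta)                 # step 6: topic
                w = rng.choice(V, p=softmax(beta[z]))      # step 7: word
                doc.append(int(w))
            slice_docs.append(doc)
        corpus.append(slice_docs)
    return corpus


corpus = generate_corpus(T=3, K=2, V=6, docs_per_slice=2, words_per_doc=5)
for t, docs in enumerate(corpus):
    print(f"slice {t}: {docs}")
```

Only the word indices are returned, mirroring the fact that the topics, mixtures, and topic assignments are latent in the model.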

Inference
In the dynamic topic model, only $$W_{t,d,n}$$ is observable. Learning the other parameters constitutes an inference problem. Blei and Lafferty argue that applying Gibbs sampling for inference in this model is more difficult than in static models, due to the nonconjugacy of the Gaussian and multinomial distributions. They propose using variational methods instead, in particular variational Kalman filtering and variational wavelet regression.

Applications
In the original paper, a dynamic topic model is applied to a corpus of Science articles published between 1881 and 1999, with the aim of showing that the method can be used to analyze trends of word usage within topics. The authors also show that a model trained on past documents fits the documents of the following year better than LDA does.

A continuous dynamic topic model was developed by Wang et al. and applied to predict the timestamp of documents.

Going beyond text documents, dynamic topic models were used to study musical influence, by learning musical topics and how they evolve in recent history.