Inside–outside algorithm

For parsing algorithms in computer science, the inside–outside algorithm is a way of re-estimating production probabilities in a probabilistic context-free grammar. It was introduced by James K. Baker in 1979 as a generalization of the forward–backward algorithm for parameter estimation on hidden Markov models to stochastic context-free grammars. It is used to compute expectations, for example as part of the expectation–maximization algorithm (an unsupervised learning algorithm).

Inside and outside probabilities
The inside probability $$\beta_j(p,q)$$ is the total probability of generating words $$w_p \cdots w_q$$, given the root nonterminal $$N^j$$ and a grammar $$G$$:
 * $$\beta_j(p,q) = P(w_{pq}|N^j_{pq}, G)$$

The outside probability $$\alpha_j(p,q)$$ is the total probability of beginning with the start symbol $$N^1$$ and generating the nonterminal $$N^j_{pq}$$ and all the words outside $$w_p \cdots w_q$$, given a grammar $$G$$:
 * $$\alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m}|G)$$

Computing inside probabilities
Base Case:

$$\beta_j(p,p) = P(w_{p}|N^j, G)$$

General case:

Suppose there is a rule $$ N_j \rightarrow N_r N_s $$ in the grammar, then the probability of generating $$w_p \cdots w_q $$ starting with a subtree rooted at $$N_j$$ is:

$$ \sum_{k=p}^{k=q-1} P(N_j \rightarrow N_r N_s)\beta_r(p,k) \beta_s(k+1,q) $$

The inside probability $$\beta_j(p,q) $$ is just the sum over all such possible rules:

$$ \beta_j(p,q) = \sum_{N_r,N_s} \sum_{k=p}^{k=q-1} P(N_j \rightarrow N_r N_s)\beta_r(p,k) \beta_s(k+1,q) $$

Computing outside probabilities
Base Case:

$$ \alpha_j(1,n) = \begin{cases} 1 & \mbox{if } j=1 \\ 0 & \mbox{otherwise} \end{cases} $$

Here the start symbol is $$N_1$$.

General case:

Suppose there is a rule $$ N_r \rightarrow N_j N_s $$ in the grammar that generates $$N_j$$. Then the left contribution of that rule to the outside probability $$\alpha_j(p,q)$$ is:

$$ \sum_{k=q+1}^{k=n} P(N_r \rightarrow N_j N_s)\alpha_r(p,k) \beta_s(q+1,k) $$

Now suppose there is a rule $$ N_r \rightarrow N_s N_j $$ in the grammar. Then the right contribution of that rule to the outside probability $$\alpha_j(p,q)$$ is:

$$ \sum_{k=1}^{k=p-1} P(N_r \rightarrow N_s N_j)\alpha_r(k,q) \beta_s(k,p-1) $$

The outside probability $$\alpha_j(p,q) $$ is the sum of the left and right contributions over all such rules:

$$ \alpha_j(p,q) = \sum_{N_r,N_s} \sum_{k=q+1}^{k=n} P(N_r \rightarrow N_j N_s)\alpha_r(p,k) \beta_s(q+1,k) + \sum_{N_r,N_s} \sum_{k=1}^{k=p-1} P(N_r \rightarrow N_s N_j)\alpha_r(k,q) \beta_s(k,p-1) $$