Higman's lemma

In mathematics, Higman's lemma states that the set $$\Sigma^*$$ of finite sequences over a finite alphabet $$\Sigma$$, as partially ordered by the subsequence relation, is well-quasi-ordered. That is, if $$w_1, w_2, \ldots\in \Sigma^*$$ is an infinite sequence of words over a finite alphabet $$\Sigma$$, then there exist indices $$i < j$$ such that $$w_i$$ can be obtained from $$w_j$$ by deleting some (possibly none) symbols. More generally this remains true when $$\Sigma$$ is not necessarily finite, but is itself well-quasi-ordered, and the subsequence relation is generalized into an "embedding" relation that allows the replacement of symbols by earlier symbols in the well-quasi-ordering of $$\Sigma$$. This is a special case of the later Kruskal's tree theorem. It is named after Graham Higman, who published it in 1952.

Proof
Let $$\Sigma$$ be a well-quasi-ordered alphabet of symbols (in particular, $$\Sigma$$ could be finite and ordered by the identity relation). Suppose for a contradiction that there exist infinite bad sequences, i.e. infinite sequences of words $$w_1, w_2, w_3, \ldots\in \Sigma^*$$ such that no $$w_i$$ embeds into a later $$w_j$$. Then there exists an infinite bad sequence of words $$W=(w_1, w_2, w_3, \ldots)$$ that is minimal in the following sense: $$w_1$$ is a word of minimum length from among all words that start infinite bad sequences; $$w_2$$ is a word of minimum length from among all infinite bad sequences that start with $$w_1$$; $$w_3$$ is a word of minimum length from among all infinite bad sequences that start with $$w_1,w_2$$; and so on. In general, $$w_i$$ is a word of minimum length from among all infinite bad sequences that start with $$w_1,\ldots,w_{i-1}$$.

Since no $$w_i$$ can be the empty word, we can write $$w_i = a_i z_i$$ for $$a_i\in\Sigma$$ and $$z_i\in\Sigma^*$$. Since $$\Sigma$$ is well-quasi-ordered, the sequence of leading symbols $$a_1, a_2, a_3, \ldots$$ must contain an infinite increasing sequence $$a_{i_1} \le a_{i_2} \le a_{i_3} \le \cdots$$ with $$i_1<i_2<i_3<\cdots$$.

Now consider the sequence of words $$w_1, \ldots, w_{{i_1}-1}, z_{i_1}, z_{i_2}, z_{i_3}, \ldots.$$ Because $$z_{i_1}$$ is shorter than $$w_{i_1} = a_{i_1} z_{i_1}$$, this sequence is "more minimal" than $$W$$, and so it must contain a word $$u$$ that embeds into a later word $$v$$. But $$u$$ and $$v$$ cannot both be $$w_j$$'s, because then the original sequence $$W$$ would not be bad. Similarly, it cannot be that $$u$$ is a $$w_j$$ and $$v$$ is a $$z_{i_k}$$, because then $$w_j$$ would also embed into $$w_{i_k}=a_{i_k}z_{i_k}$$. And similarly, it cannot be that $$u=z_{i_j}$$ and $$v=z_{i_k}$$, $$j<k$$, because then $$w_{i_j}=a_{i_j}z_{i_j}$$ would embed into $$w_{i_k}=a_{i_k}z_{i_k}$$. In every case we arrive at a contradiction.

Ordinal type
The ordinal type of $$\Sigma^*$$ is related to the ordinal type of $$\Sigma$$ as follows: $$o(\Sigma^*)=\begin{cases}\omega^{\omega^{o(\Sigma)-1}},&o(\Sigma) \text{ finite};\\ \omega^{\omega^{o(\Sigma)+1}},&o(\Sigma)=\varepsilon_\alpha+n \text{ for some }\alpha\text{ and some finite }n;\\ \omega^{\omega^{o(\Sigma)}},&\text{otherwise}.\end{cases}$$

Reverse-mathematical calibration
Higman's lemma has been reverse mathematically calibrated (in terms of subsystems of second-order arithmetic) as equivalent to $$ACA_0$$ over the base theory $$RCA_0$$.