Tree-walking automaton

A tree-walking automaton (TWA) is a type of finite automaton that deals with tree structures rather than strings. The concept was originally proposed by Aho and Ullman.

The following article deals with tree-walking automata. For a different notion of tree automaton, closely related to regular tree languages, see branching automaton.

Definition
All trees are assumed to be binary, with labels from a fixed alphabet Σ.

Informally, a tree-walking automaton (TWA) A is a finite state device that walks over an input tree in a sequential manner. At each moment A visits a node v in state q. Depending on the state q, the label of the node v, and whether the node is the root, a left child, a right child or a leaf, A changes its state from q to q' and moves to the parent of v or its left or right child. A TWA accepts a tree if it enters an accepting state, and rejects if its enters a rejecting state or makes an infinite loop. As with string automata, a TWA may be deterministic or nondeterministic.

More formally, a (nondeterministic) tree-walking automaton over an alphabet Σ is a tuple A = (Q, Σ, I, F, R, δ) where Q is a finite set of states, its subsets I, F, and R are the sets of initial, accepting and rejecting states, respectively, and δ ⊆ (Q × { root, left, right, leaf } × Σ × { up, left, right } × Q) is the transition relation.

Example
A simple example of a tree-walking automaton is a TWA that performs depth-first search (DFS) on the input tree. The automaton $$A$$ has three states, $$Q = \{ q_{0}, q_{\mathit{left}}, q_{\mathit{right}} \}$$. $$A$$ begins in the root in state $$q_{0}$$ and descends to the left subtree. Then it processes the tree recursively. Whenever $$A$$ enters a node $$v$$ in state $$q_{\mathit{left}}$$, it means that the left subtree of $$v$$ has just been processed, so it proceeds to the right subtree of $$v$$. If $$A$$ enters a node $$v$$ in state $$q_{\mathit{right}}$$, it means that the whole subtree with root $$v$$ has been processed and $$A$$ walks to the parent of $$v$$ and changes its state to $$q_{\mathit{left}}$$ or $$q_{\mathit{right}}$$, depending on whether $$v$$ is a left or right child.

Properties
Unlike branching automata, tree-walking automata are difficult to analyze: even simple properties are nontrivial to prove. The following list summarizes some known facts related to TWA:
 * As shown by Bojańczyk and Colcombet, deterministic TWA are strictly weaker than nondeterministic ones ($$\mathit{DTWA} \subsetneq \mathit{TWA}$$)
 * Deterministic TWA are closed under complementation (but it is not known whether the same holds for nondeterministic ones )
 * The set of languages recognized by TWA is strictly contained in regular tree languages ($$\mathit{TWA} \subsetneq \mathit{REG}$$), i.e. there exist regular languages that are not recognized by any tree-walking automaton, see Bojańczyk and Colcombet.