User:AryamanA/Transition-based parsing

Transition-based parsing is an algorithmic paradigm in computational linguistics for building automatic parsers for dependency grammars, used for both syntactic and semantic parsing. It extends shift-reduce parsing, adapting it to the complexities of natural language. Several algorithms exist in this family, all sharing the fundamental approach of building the parse tree incrementally while processing tokens from left to right.

Transition-based parsing was first proposed by Joakim Nivre in 2003. His original algorithm supported only projective dependency trees and is now referred to as the arc-eager algorithm.

Algorithm
A transition-based parser maintains three structures in every parser state: $$S$$ (a stack of tokens currently being processed), $$I$$ (the list of remaining input tokens), and $$A$$ (the set of dependency arcs created so far). The parser is initialized with an empty stack, an empty set of arcs, and the full sentence in $$I$$; it terminates when the input list is empty.
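The parser state described above can be sketched as a small data structure. This is a minimal illustration, not a reference implementation: the class name `Configuration`, its field names, and the choice to represent tokens by their integer positions in the sentence are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Configuration:
    """Hypothetical container for one parser state (S, I, A)."""
    stack: list[int] = field(default_factory=list)             # S: tokens being processed
    buffer: list[int] = field(default_factory=list)            # I: remaining input tokens
    arcs: set[tuple[int, int]] = field(default_factory=set)    # A: (head, dependent) pairs

    @classmethod
    def initial(cls, tokens: list[str]) -> "Configuration":
        # Empty stack, empty arc set, and every token position waiting in I.
        return cls(stack=[], buffer=list(range(len(tokens))), arcs=set())

    def is_terminal(self) -> bool:
        # The parser stops once the input list is empty.
        return not self.buffer


tokens = ["She", "reads", "books"]
config = Configuration.initial(tokens)
```

Storing arcs as `(head, dependent)` pairs keeps the arc set independent of the token strings, so repeated words in a sentence do not collide.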

Transitions
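The four transitions listed here (Left-Arc, Right-Arc, Reduce, and Shift) make up the system known in the literature as arc-eager. They are described in pseudocode, where an arc (h → d) has head h and dependent d: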

Left-Arc
 * n' ← next input token in I
 * n ← top of the stack S
 * Conditions: ¬∃m: (m → n) ∈ A (n does not yet have a head), and (Lex(n') → Lex(n)) ∈ parser grammar
 * add the arc (n' → n) to A
 * pop n from the stack S

Right-Arc
 * n' ← next input token in I
 * n ← top of the stack S
 * Conditions: ¬∃m: (m → n') ∈ A (n' does not yet have a head), and (Lex(n) → Lex(n')) ∈ parser grammar
 * add the arc (n → n') to A
 * remove n' from I and push it onto the stack S

Reduce
 * n ← top of the stack S
 * Conditions: ∃m: (m → n) ∈ A (n already has a head)
 * pop n from the stack S

Shift
 * n' ← next input token in I
 * remove n' from I and push it onto the stack S
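The four transitions above can be sketched in Python as follows. This is an illustrative sketch under several assumptions: tokens are represented by their positions in the sentence, arcs are stored as `(head, dependent)` pairs, the function names are hypothetical, and the grammar condition on Lex is omitted entirely. The check in `reduce_top` that the popped node already has a head is an added safeguard taken from the standard formulation of this transition system. The transition sequence at the end is chosen by hand; a real parser would select each transition with a learned classifier or a grammar.

```python
def has_head(arcs: set[tuple[int, int]], node: int) -> bool:
    # True if some arc (m -> node) is already in A.
    return any(dep == node for _, dep in arcs)


def left_arc(stack, buffer, arcs):
    n, n_next = stack[-1], buffer[0]
    if has_head(arcs, n):
        raise ValueError("Left-Arc: top of stack already has a head")
    arcs.add((n_next, n))   # arc n' -> n
    stack.pop()             # n is finished


def right_arc(stack, buffer, arcs):
    n, n_next = stack[-1], buffer[0]
    if has_head(arcs, n_next):
        raise ValueError("Right-Arc: next input token already has a head")
    arcs.add((n, n_next))           # arc n -> n'
    stack.append(buffer.pop(0))     # move n' from I onto S


def reduce_top(stack, buffer, arcs):
    if not has_head(arcs, stack[-1]):
        raise ValueError("Reduce: top of stack has no head yet")
    stack.pop()


def shift(stack, buffer, arcs):
    stack.append(buffer.pop(0))     # move n' from I onto S


# Hand-picked derivation for "She reads books" (positions 0, 1, 2).
tokens = ["She", "reads", "books"]
stack, buffer, arcs = [], list(range(len(tokens))), set()

shift(stack, buffer, arcs)       # S=[0],    I=[1, 2]
left_arc(stack, buffer, arcs)    # adds reads -> She;   S=[],     I=[1, 2]
shift(stack, buffer, arcs)       # S=[1],    I=[2]
right_arc(stack, buffer, arcs)   # adds reads -> books; S=[1, 2], I=[]

for head, dep in sorted(arcs):
    print(f"{tokens[head]} -> {tokens[dep]}")
```

The input list is empty after the fourth transition, so the parser terminates with the two arcs reads → She and reads → books.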