Substring index

In computer science, a substring index is a data structure that supports substring search over a text or text collection in sublinear time. Given a document $$S$$ of length $$n$$, or a set of documents $$D=\{S^1,S^2, \dots, S^d\}$$ of total length $$n$$, an index built once over the text allows all occurrences of a pattern $$P$$ to be located in $$o(n)$$ time, without rescanning the text for each query. (See Big O notation.)
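As an illustration of the idea, the following minimal sketch uses a suffix array, one of the simplest substring indexes. It is not an optimized implementation: the function names `build_suffix_array` and `find_all` are chosen here for illustration only, and the construction is deliberately naive (it sorts whole suffixes), whereas practical implementations use specialized construction algorithms. Only the query step is meant to show the sublinear behaviour: it runs two binary searches over the sorted suffixes, so it inspects on the order of $$|P| \log n$$ characters plus the occurrences it reports.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted suffix order.

    Naive construction for illustration only: it compares whole suffixes,
    which is far slower than dedicated suffix array construction algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    """Return the sorted start positions of every occurrence of pattern in text."""
    m = len(pattern)

    # First binary search: leftmost suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second binary search: leftmost suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[start:lo] begins with pattern.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)                         # [5, 3, 1, 0, 4, 2]
print(find_all(text, sa, "ana"))  # [1, 3]
```

The binary searches are valid because truncating every suffix to its first $$|P|$$ characters preserves the sorted order, so all suffixes that start with $$P$$ form one contiguous block of the array.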

The phrase full-text index is also often used for an index of all substrings of a text. The term is ambiguous, however, because it is also used for ordinary word-level indexes, such as the inverted files used in document retrieval. See full text search.

Substring indexes include:

 * Suffix tree
 * Suffix array
 * N-gram index, an inverted file for all N-grams of the text
 * Compressed suffix array
 * FM-index
 * LZ-index
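
The N-gram index in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the names `build_ngram_index` and `search` and the default `n=3` are chosen here for the example. The index maps every N-gram of the text to the positions where it starts; a query looks up the first N-gram of the pattern and verifies each candidate position against the text. Patterns shorter than N cannot be looked up this way, so the sketch falls back to a linear scan for them.

```python
from collections import defaultdict


def build_ngram_index(text, n=3):
    """Map each n-gram of text to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(text) - n + 1):
        index[text[i:i + n]].append(i)
    return index


def search(text, index, pattern, n=3):
    """Return the start positions of pattern in text, in increasing order."""
    if len(pattern) < n:
        # The index holds no entry for patterns shorter than n,
        # so fall back to scanning the text directly.
        return [i for i in range(len(text) - len(pattern) + 1)
                if text.startswith(pattern, i)]
    # Candidates are the positions of the pattern's first n-gram;
    # each one is then verified against the full pattern.
    candidates = index.get(pattern[:n], [])
    return [i for i in candidates if text.startswith(pattern, i)]


text = "abracadabra"
index = build_ngram_index(text)
print(search(text, index, "abra"))  # [0, 7]
print(search(text, index, "cad"))   # [4]
```

The cost of a query depends on how many candidate positions the chosen N-gram has, so an N-gram index works best when the N-grams of the pattern are rare in the text.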