Unavoidable pattern

In mathematics and theoretical computer science, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.

Pattern
Like a word, a pattern (also called term) is a sequence of symbols over some alphabet.

The minimum multiplicity of the pattern $$p$$ is $$m(p)=\min(\mathrm{count_p}(x):x \in p)$$ where $$\mathrm{count_p}(x)$$ is the number of occurrence of symbol $$x$$ in pattern $$p$$. In other words, it is the number of occurrences in $$p$$ of the least frequently occurring symbol in $$p$$.

Instance
Given finite alphabets $$\Sigma$$ and $$\Delta$$, a word $$x\in\Sigma^*$$ is an instance of the pattern $$p\in\Delta^*$$ if there exists a non-erasing semigroup morphism $$f:\Delta^*\rightarrow\Sigma^*$$ such that $$f(p)=x$$, where $$\Sigma^*$$ denotes the Kleene star of $$\Sigma$$. Non-erasing means that $$f(a)\neq\varepsilon$$ for all $$a\in\Delta$$, where $$\varepsilon$$ denotes the empty string.

Avoidance / Matching
A word $$w$$ is said to match, or encounter, a pattern $$p$$ if a factor (also called subword or substring) of $$w$$ is an instance of $$p$$. Otherwise, $$w$$ is said to avoid $$p$$, or to be $$p$$-free. This definition can be generalized to the case of an infinite $$w$$, based on a generalized definition of "substring".

Avoidability / Unavoidability on a specific alphabet
A pattern $$p$$ is unavoidable on a finite alphabet $$\Sigma$$ if each sufficiently long word $$x\in\Sigma^*$$ must match $$p$$; formally: if $$\exists n\in \Nu. \ \forall x\in\Sigma^*. \ (|x|\geq n \implies x \text{ matches } p)$$. Otherwise, $$p$$ is avoidable on $$\Sigma$$, which implies there exist infinitely many words over the alphabet $$\Sigma$$ that avoid $$p$$.

By Kőnig's lemma, pattern $$p$$ is avoidable on $$\Sigma$$ if and only if there exists an infinite word $$w\in \Sigma^\omega$$ that avoids $$p$$.

Maximal p-free word
Given a pattern $$p$$ and an alphabet $$\Sigma$$. A $$p$$-free word $$w\in\Sigma^*$$ is a maximal $$p$$-free word over $$\Sigma$$ if $$aw$$ and $$wa$$ match $$p$$ $$\forall a\in\Sigma$$.

Avoidable / Unavoidable pattern
A pattern $$p$$ is an unavoidable pattern (also called blocking term) if $$p$$ is unavoidable on any finite alphabet.

If a pattern is unavoidable and not limited to a specific alphabet, then it is unavoidable for any finite alphabet by default. Conversely, if a pattern is said to be avoidable and not limited to a specific alphabet, then it is avoidable on some finite alphabet by default.

k-avoidable / k-unavoidable
A pattern $$p$$ is $$k$$-avoidable if $$p$$ is avoidable on an alphabet $$\Sigma$$ of size $$k$$. Otherwise, $$p$$ is $$k$$-unavoidable, which means $$p$$ is unavoidable on every alphabet of size $$k$$.

If pattern $$p$$ is $$k$$-avoidable, then $$p$$ is $$g$$-avoidable for all $$g\geq k$$.

Given a finite set of avoidable patterns $$S=\{p_1,p_2,...,p_i \}$$, there exists an infinite word $$w\in\Sigma^\omega$$ such that $$w$$ avoids all patterns of $$S$$. Let $$\mu(S)$$ denote the size of the minimal alphabet $$\Sigma'$$such that $$\exist w' \in {\Sigma'}^\omega$$ avoiding all patterns of $$S$$.

Avoidability index
The avoidability index of a pattern $$p$$ is the smallest $$k$$ such that $$p$$ is $$k$$-avoidable, and $$\infin$$ if $$p$$ is unavoidable.

Properties

 * A pattern $$q$$ is avoidable if $$q$$ is an instance of an avoidable pattern $$p$$.
 * Let avoidable pattern $$p$$ be a factor of pattern $$q$$, then $$q$$ is also avoidable.
 * A pattern $$q$$ is unavoidable if and only if $$q$$ is a factor of some unavoidable pattern $$p$$.
 * Given an unavoidable pattern $$p$$ and a symbol $$a$$ not in $$p$$, then $$pap$$ is unavoidable.
 * Given an unavoidable pattern $$p$$, then the reversal $$p^R$$ is unavoidable.
 * Given an unavoidable pattern $$p$$, there exists a symbol $$a$$ such that $$a$$ occurs exactly once in $$p$$.
 * Let $$n\in\Nu$$ represent the number of distinct symbols of pattern $$p$$. If $$|p|\geq2^n$$, then $$p$$ is avoidable.

Zimin words
Given alphabet $$\Delta=\{x_1,x_2,...\}$$, Zimin words (patterns) are defined recursively $$Z_{n+1}=Z_nx_{n+1}Z_n$$ for $$n\in\Zeta^+$$ and $$Z_1=x_1$$.

Unavoidability
All Zimin words are unavoidable.

A word $$w$$ is unavoidable if and only if it is a factor of a Zimin word.

Given a finite alphabet $$\Sigma$$, let $$f(n,|\Sigma|)$$ represent the smallest $$m\in\Zeta^+$$ such that $$w$$ matches $$Z_n$$ for all $$w\in \Sigma^m$$. We have following properties:


 * $$f(1,q)=1$$
 * $$f(2,q)=2q+1$$
 * $$f(3,2)=29$$
 * $$f(n,q)\leq{^{n-1}q}=\underbrace{q^{q^{\cdot^{\cdot^{q}}}}}_{n-1} $$

$$Z_n$$ is the longest unavoidable pattern constructed by alphabet $$\Delta=\{x_1,x_2,...,x_n\}$$ since $$|Z_n|=2^n -1$$.

Free letter
Given a pattern $$p$$ over some alphabet $$\Delta$$, we say $$x\in\Delta$$ is free for $$p$$ if there exist subsets $$A,B$$ of $$\Delta$$ such that the following hold:


 * 1) $$uv$$ is a factor of $$p$$ and $$u\in A$$ ↔ $$uv$$ is a factor of $$p$$ and $$v\in B$$
 * 2) $$x\in A\backslash B\cup B\backslash A$$

For example, let $$p=abcbab$$, then $$b$$ is free for $$p$$ since there exist $$A=ac,B=b$$ satisfying the conditions above.

Reduce
A pattern $$p\in \Delta^*$$ reduces to pattern $$q$$ if there exists a symbol $$x\in \Delta$$ such that $$x$$ is free for $$p$$, and $$q$$ can be obtained by removing all occurrence of $$x$$ from $$p$$. Denote this relation by $$p\stackrel{x}{\rightarrow}q$$.

For example, let $$p=abcbab$$, then $$p$$ can reduce to $$q=aca$$ since $$b$$ is free for $$p$$.

Locked
A word $$w$$ is said to be locked if $$w$$ has no free letter; hence $$w$$ can not be reduced.

Transitivity
Given patterns $$p,q,r$$, if $$p$$ reduces to $$q$$ and $$q$$ reduces to $$r$$, then $$p$$ reduces to $$r$$. Denote this relation by $$p\stackrel{*}{\rightarrow}r$$.

Unavoidability
A pattern $$p$$ is unavoidable if and only if $$p$$ reduces to a word of length one; hence $$\exist w$$ such that $$|w|=1$$ and $$p\stackrel{*}{\rightarrow}w$$.

== Graph pattern avoidance ==

Avoidance / Matching on a specific graph
Given a simple graph $$G=(V,E)$$, a edge coloring $$c:E\rightarrow \Delta$$ matches pattern $$p$$ if there exists a simple path $$P=[e_1,e_2,...,e_r]$$ in $$G$$ such that the sequence $$c(P)=[c(e_1),c(e_2),...,c(e_r)]$$ matches $$p$$. Otherwise, $$c$$ is said to avoid $$p$$ or be $$p$$-free.

Similarly, a vertex coloring $$c:V\rightarrow \Delta$$ matches pattern $$p$$ if there exists a simple path $$P=[c_1,c_2,...,c_r]$$ in $$G$$ such that the sequence $$c(P)$$ matches $$p$$.

Pattern chromatic number
The pattern chromatic number $$\pi_p(G)$$ is the minimal number of distinct colors needed for a $$p$$-free vertex coloring $$c$$ over the graph $$G$$.

Let $$\pi_p(n)=\max\{\pi_p(G):G\in G_n\}$$ where $$G_n$$ is the set of all simple graphs with a maximum degree no more than $$n$$.

Similarly, $$\pi_p'(G)$$ and $$\pi_p'(n)$$ are defined for edge colorings.

Avoidability / Unavoidability on graphs
A pattern $$p$$ is avoidable on graphs if $$\pi_p(n)$$ is bounded by $$c_p$$, where $$c_p$$ only depends on $$p$$.


 * Avoidance on words can be expressed as a specific case of avoidance on graphs; hence a pattern $$p$$ is avoidable on any finite alphabet if and only if $$\pi_p(P_n) \leq c_p$$ for all $$n \in\Zeta ^+ $$, where $$P_n$$ is a graph of $$n$$ vertices concatenated.

Probabilistic bound on $$\pi_p(n)$$
There exists an absolute constant $$c$$, such that $$\pi_p(n)\leq cn^{\frac{m(p)}{m(p)-1}}\leq cn^2$$ for all patterns $$p$$ with $$m(p)\geq2$$.

Given a pattern $$p$$, let $$n$$ represent the number of distinct symbols of $$p$$. If $$|p|\geq2^n$$, then $$p$$ is avoidable on graphs.

Explicit colorings
Given a pattern $$p$$ such that $$count_p(x)$$ is even for all $$x\in p$$, then $$\pi_p'(K_2^k)\leq2^k-1$$ for all $$k\geq 1$$, where $$K_n$$ is the complete graph of $$n$$ vertices.

Given a pattern $$p$$ such that $$m(p)\geq2$$, and an arbitrary tree $$T$$, let $$S$$ be the set of all avoidable subpatterns and their reflections of $$p$$. Then $$\pi_p(T)\leq 3\mu(S)$$.

Given a pattern $$p$$ such that $$m(p)\geq2$$, and a tree $$T$$ with degree $$n\geq2$$. Let $$S$$ be the set of all avoidable subpatterns and their reflections of $$p$$, then $$\pi_p'(T)\leq 2(n-1)\mu(S)$$.

Examples

 * The Thue–Morse sequence is cube-free and overlap-free; hence it avoids the patterns $$xxx$$ and $$xyxyx$$.
 * A square-free word is one avoiding the pattern $$xx$$. The word over the alphabet $$\{0,\pm1\}$$ obtained by taking the first difference of the Thue–Morse sequence is an example of an infinite square-free word.
 * The patterns $$x$$ and $$xyx$$ are unavoidable on any alphabet, since they are factors of the Zimin words.
 * The power patterns $$x^n$$ for $$n\geq 3 $$ are 2-avoidable.
 * All binary patterns can be divided into three categories:
 * $$\varepsilon,x,xyx$$ are unavoidable.
 * $$xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy$$ have avoidability index of 3.
 * others have avoidability index of 2.
 * $$abwbaxbcyaczca$$ has avoidability index of 4, as well as other locked words.
 * $$abvacwbaxbcycdazdcd$$ has avoidability index of 5.
 * The repetitive threshold $$RT(n)$$ is the infimum of exponents $$k$$ such that $$x^k$$ is avoidable on an alphabet of size $$n$$. Also see Dejean's theorem.

Open problems

 * Is there an avoidable pattern $$p$$ such that the avoidability index of $$p$$ is 6?
 * Given an arbitrarily pattern $$p$$, is there an algorithm to determine the avoidability index of $$p$$?