Draft:Locally Recoverable Codes

Locally recoverable codes are a family of error-correcting codes that were first introduced by D. S. Papailiopoulos and A. G. Dimakis and have been widely studied in information theory because of their applications to distributed and cloud storage systems.

An $$[n, k, d, r]_{q}$$ LRC is an $$[n, k, d]_{q}$$ linear code such that, for every $$i$$, there is a function $$f_{i}$$ that takes as input a set of $$r$$ coordinates of a codeword $$c = (c_{1}, \ldots, c_{n}) \in C$$, all different from $$c_{i}$$, and outputs $$c_{i}$$.

Definition
Let $$C$$ be an $$[n, k, d]_{q}$$ linear code. For $$i \in \{1, \ldots, n\}$$, let us denote by $$r_{i}$$ the minimum number of other coordinates that must be read to recover an erasure in coordinate $$i$$. The number $$r_{i}$$ is said to be the locality of the $$i$$-th coordinate of the code. The locality of the code is defined as $$r = \max\{r_{i} \mid i \in \{1, \ldots, n\}\}$$.

An $$[n, k, d, r]_{q}$$ locally recoverable code (LRC) is an $$[n, k, d]_{q}$$ linear code $$C \subseteq \mathbb F_q^n$$ with locality $$r$$.

Let $$C$$ be an $$[n, k, d, r]_{q}$$ locally recoverable code. Then an erased component can be recovered linearly, i.e. for every $$i \in \{1, \ldots, n\}$$, the space of linear equations of the code contains elements of the form $$x_{i} = f(x_{i_{1}}, \ldots, x_{i_{r}})$$, where $$f$$ is linear and $$i_{j} \neq i$$ for all $$j$$.
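For a small code, the definition can be checked by brute force. The following is a minimal sketch, assuming a binary code given by a generator matrix and assuming linear recovery, so that the locality of coordinate $$i$$ is one less than the smallest weight of a parity-check equation involving $$x_i$$. The function names and the choice of the $$[7,4]$$ Hamming code as test case are illustrative and not part of the original text.

```python
from itertools import product

def dual_codewords(G):
    """All h in F_2^n orthogonal to every row of G (brute force, small n only)."""
    n = len(G[0])
    return [h for h in product((0, 1), repeat=n)
            if all(sum(g * x for g, x in zip(row, h)) % 2 == 0 for row in G)]

def coordinate_localities(G):
    """r_i = (minimum weight of a parity check whose support contains i) - 1."""
    n = len(G[0])
    dual = dual_codewords(G)
    localities = []
    for i in range(n):
        weights = [sum(h) - 1 for h in dual if h[i] == 1]
        # None means no parity check involves coordinate i (not recoverable locally)
        localities.append(min(weights) if weights else None)
    return localities

# Systematic generator matrix of the binary [7,4] Hamming code (illustrative choice)
G_HAMMING = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

localities = coordinate_localities(G_HAMMING)
code_locality = max(localities)  # r = max_i r_i
```

Here every nonzero parity check of the Hamming code has weight 4, so each coordinate has locality 3 and the code has locality $$r = 3$$.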

Optimal Locally Recoverable Codes
Theorem Let $$n = (r+1)s$$ and let $$C$$ be an $$[n, k, d]_{q}$$-locally recoverable code having $$s$$ disjoint locality sets of size $$r+1$$. Then $$d \leq n - k - \left\lceil\frac{k}{r}\right\rceil + 2$$.

An $$[n, k, d, r]_{q}$$-LRC $$C$$ is said to be optimal if the minimum distance of $$C$$ satisfies $$d = n - k - \left\lceil\frac{k}{r}\right\rceil + 2$$.
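As a quick arithmetic aid, the bound and the optimality condition can be evaluated directly. This is a small sketch; the helper names `lrc_distance_bound` and `is_optimal` are hypothetical and chosen only for illustration.

```python
def lrc_distance_bound(n, k, r):
    """Upper bound n - k - ceil(k / r) + 2 on the minimum distance."""
    ceil_k_over_r = -(-k // r)  # integer ceiling without floating point
    return n - k - ceil_k_over_r + 2

def is_optimal(n, k, d, r):
    """True if an [n, k, d, r] LRC meets the bound with equality."""
    return d == lrc_distance_bound(n, k, r)

# Example parameters: n = 9, k = 4, r = 2 gives 9 - 4 - 2 + 2 = 5
example_bound = lrc_distance_bound(9, 4, 2)
```

For $$r = k$$ the term $$\left\lceil k/r \right\rceil$$ equals 1 and the expression reduces to the classical Singleton bound $$n - k + 1$$.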

Tamo--Barg Codes
Let $$f \in \mathbb F_{q}[x]$$ be a polynomial and let $$\ell$$ be a positive integer. Then $$f$$ is said to be $$(r, \ell)$$-good if

 * • $$f$$ has degree $$r+1$$,
 * • there exist distinct subsets $$A_{1}, \ldots, A_{\ell}$$ of $$\mathbb F_{q}$$ such that
 * – for any $$i \in \{1, \ldots, \ell\}$$, $$f(A_{i}) = \{t_{i}\}$$ for some $$t_{i} \in \mathbb F_{q}$$, i.e. $$f$$ is constant on $$A_{i}$$,
 * – $$\#A_{i} = r + 1$$,
 * – $$A_{i} \cap A_{j} = \emptyset$$ for any $$i \neq j$$.

We say that $$\{A_{1},\ldots,A_{\ell}\}$$ is a splitting covering for $$f$$.
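The three conditions are easy to test mechanically. The sketch below assumes a prime field $$\mathbb F_p$$ (so arithmetic is plain integer arithmetic modulo $$p$$) and uses, as an illustrative example not taken from the text above, $$f(x) = x^3$$ over $$\mathbb F_{13}$$ with three cosets of the subgroup $$\{1, 3, 9\}$$. The function name `is_good` is hypothetical.

```python
def is_good(coeffs, p, sets, r):
    """Check whether f is (r, len(sets))-good over F_p with the given sets.

    coeffs: coefficients of f from lowest to highest degree, p: a prime.
    """
    def f(x):
        return sum(c * pow(x, e, p) for e, c in enumerate(coeffs)) % p

    # Condition: deg f = r + 1
    degree = max((e for e, c in enumerate(coeffs) if c % p != 0), default=-1)
    if degree != r + 1:
        return False

    seen = set()
    for A in sets:
        A = set(A)
        if len(A) != r + 1:               # each set has r + 1 elements
            return False
        if len({f(a) for a in A}) != 1:   # f is constant on the set
            return False
        if seen & A:                      # sets are pairwise disjoint
            return False
        seen |= A
    return True

# Illustrative example: f(x) = x^3 over F_13, r = 2, ell = 3
SETS = [{1, 3, 9}, {2, 6, 5}, {4, 12, 10}]
good = is_good([0, 0, 0, 1], 13, SETS, r=2)
```

In this example $$x^3$$ takes the values 1, 8 and 12 on the three sets, so the polynomial is $$(2, 3)$$-good.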

Tamo--Barg Construction
The Tamo--Barg construction utilizes good polynomials.
 * • Suppose that an $$(r, \ell)$$-good polynomial $$f(x)$$ over $$\mathbb F_{q}$$ is given with splitting covering $$\{A_{1}, \ldots, A_{\ell}\}$$.
 * • Let $$s$$ ≤ $$\ell-1$$ be a positive integer.
 * • Consider the following $$\mathbb F_{q}$$ -vector space of polynomials

$$V = \{\sum_{i=0}^s g_{i}(x)f(x)^i:{\text{deg}}(g_{i}(x)) \leq {\text{deg}}(f(x))-2\}$$
 * • Let $$T = \bigcup_{i=1}^\ell A_i$$.
 * • The code $$\{ ev_{T}(g) : g \in V \}$$ is an $$((r+1)\ell,(s+1)r,d,r)$$ optimal locally recoverable code, where $$ev_{T}$$ denotes evaluation of $$g$$ at all points in the set $$T$$.
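The construction can be sketched for a toy instance. The code below assumes the prime field $$\mathbb F_{13}$$, the good polynomial $$f(x) = x^3$$ with $$r = 2$$, $$\ell = 3$$, and $$s = 1$$; these concrete values, the message layout, and the function names are illustrative assumptions, not part of the construction as stated. A message of $$(s+1)r = 4$$ symbols is read as the coefficients of $$g_0, g_1$$, and the codeword is the evaluation of $$g_0(x) + g_1(x) f(x)$$ on $$T$$.

```python
from itertools import product

P = 13                                        # field size (prime), assumed for the example
R = 2                                         # locality
SETS = [[1, 3, 9], [2, 6, 5], [4, 12, 10]]    # splitting covering of f(x) = x^3
S = 1                                         # s <= ell - 1 = 2
T = [a for A in SETS for a in A]              # evaluation points, |T| = (R + 1) * ell

def f(x):
    return pow(x, R + 1, P)

def encode(msg):
    """Evaluate g(x) = sum_i g_i(x) f(x)^i on T, with deg g_i <= R - 1.

    msg holds the coefficients of g_0, ..., g_S, R symbols per polynomial.
    """
    assert len(msg) == (S + 1) * R
    word = []
    for a in T:
        total = 0
        for i in range(S + 1):
            g_i = sum(msg[i * R + j] * pow(a, j, P) for j in range(R))
            total += g_i * pow(f(a), i, P)
        word.append(total % P)
    return word

def min_distance():
    """Brute-force minimum weight over all nonzero messages (13^4 - 1 of them)."""
    best = len(T)
    for msg in product(range(P), repeat=(S + 1) * R):
        if any(msg):
            weight = sum(1 for c in encode(msg) if c != 0)
            best = min(best, weight)
    return best
```

For these parameters the length is 9 and the dimension is 4, and `min_distance()` can be compared with the value $$n - k - \left\lceil k/r \right\rceil + 2 = 5$$ from the bound.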

Parameters of Tamo--Barg Codes

 * • Length. The length is the number of evaluation points. Because the sets $$A_i$$ are disjoint for $$i \in \{1, \ldots, \ell\}$$, the length of the code is $$|T| = (r+1)\ell$$.


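 * • Dimension. The dimension of the code is $$(s+1)r$$, for $$s$$ ≤ $$\ell-1$$. Each $$g_i$$ has degree at most $${\text{deg}}(f(x))-2 = r-1$$ and therefore ranges over a vector space of dimension $$r$$. By the construction of $$V$$ there are $$s+1$$ polynomials $$g_0, \ldots, g_s$$, and a nonzero term $$g_i(x)f(x)^i$$ has degree between $$i(r+1)$$ and $$i(r+1)+r-1$$; these ranges are disjoint for distinct $$i$$, so the sum is direct and $$V$$ has dimension $$(s+1)r$$. Since every polynomial in $$V$$ has degree less than $$|T|$$, evaluation on $$T$$ is injective and the code has the same dimension.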


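 * • Distance. Every polynomial in $$V$$ has degree at most $$m = (r+1)-2+s(r+1)$$, i.e. $$V \subseteq \mathbb F_q [x]_{\leq m}$$, so the code is a subcode of the Reed-Solomon code obtained by evaluating polynomials of degree at most $$m$$ on $$T$$. A nonzero polynomial of degree at most $$m$$ has at most $$m$$ roots, hence $$d \geq (r+1)\ell-((r+1)-2+s(r+1))$$. With $$n = (r+1)\ell$$ and dimension $$(s+1)r$$, this value equals $$n - (s+1)r - (s+1) + 2$$, which is the upper bound on the distance of a code with locality $$r$$, so equality holds and the code is optimal.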


 * • Locality. After the erasure of a single component, the evaluation at some $$a_i \in A_i$$, where $$|A_i|=r+1$$, is unknown, but the evaluations at all other $$a \in A_i$$ are known, so at most $$r$$ evaluations are needed to uniquely determine the erased component, which gives locality $$r$$.


 * To see this, suppose without loss of generality that the erased evaluation is at $$a_1 \in A_1$$. The restriction of $$g$$ to $$A_1$$ can be described by a polynomial $$h$$ of degree at most $${\text{deg}}(f(x))-2 = r + 1 - 2 = r - 1$$: since $$f$$ takes the constant value $$t_1$$ on $$A_1$$, $$g$$ agrees on $$A_1$$ with $$h(x) = \sum_{i=0}^s g_{i}(x)t_1^i$$, and the $$g_i$$'s have degree at most $${\text{deg}}(f(x))-2$$. On the other hand $$|A_1 \backslash \{a_1\}| = r$$, and $$r$$ evaluations uniquely determine a polynomial of degree at most $$r-1$$. Therefore $$h$$ can be reconstructed by interpolation and evaluated at $$a_1$$ to recover $$g(a_1)$$.
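The recovery step amounts to Lagrange interpolation on the surviving points of one set $$A_i$$. The following sketch assumes the prime field $$\mathbb F_{13}$$, $$f(x) = x^3$$, $$r = 2$$, and an arbitrary example polynomial $$g(x) = 1 + 2x + (3 + 4x)x^3$$; all of these concrete choices and the function name `recover` are illustrative assumptions.

```python
P = 13                                        # field size (prime), assumed for the example
SETS = [[1, 3, 9], [2, 6, 5], [4, 12, 10]]    # sets on which x^3 is constant mod 13

def g(x):
    """Example polynomial g_0(x) + g_1(x) f(x) with g_0 = 1 + 2x, g_1 = 3 + 4x."""
    return (1 + 2 * x + (3 + 4 * x) * pow(x, 3, P)) % P

def recover(erased_point, known):
    """Lagrange-interpolate h through the known (point, symbol) pairs of one
    set A_i and return h(erased_point).

    known: dict mapping each of the r surviving points of A_i to its symbol.
    """
    total = 0
    for xj, yj in known.items():
        num, den = 1, 1
        for xm in known:
            if xm != xj:
                num = num * (erased_point - xm) % P
                den = den * (xj - xm) % P
        # Division by den via Fermat's little theorem (P is prime)
        total += yj * num * pow(den, P - 2, P)
    return total % P

# Erase each position in turn and repair it from the other r = 2 points of its set
repairs_ok = all(
    recover(a, {b: g(b) for b in A if b != a}) == g(a)
    for A in SETS for a in A
)
```

For instance, on $$A_1 = \{1, 3, 9\}$$ the polynomial $$g$$ agrees with $$h(x) = 4 + 6x$$, so the symbol at 1 is rebuilt from the symbols at 3 and 9 alone, without reading the other six coordinates.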

Locally Recoverable Codes with Availability
Definition A code $$C$$ has all-symbol locality $$r$$ and availability $$t$$ if every code symbol can be recovered from $$t$$ pairwise disjoint repair sets of other symbols, each set of size at most $$r$$ symbols. Such a code is called an $$(r,t)_a$$-LRC.

Theorem The minimum distance of an $$[n,k,d]_q$$-LRC having locality $$r$$ and availability $$t$$ satisfies the upper bound

$$d \leq n - \sum_{i=0}^{t} \left\lfloor\frac{k-1}{r^i}\right\rfloor$$.
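The sum in this bound is simple to evaluate numerically. This is a minimal sketch with a hypothetical helper name, `availability_distance_bound`, and example parameters chosen only for illustration.

```python
def availability_distance_bound(n, k, r, t):
    """n - sum_{i=0}^{t} floor((k - 1) / r^i)."""
    return n - sum((k - 1) // r**i for i in range(t + 1))

# Example: n = 12, k = 4, r = 2, t = 2
# floor(3/1) + floor(3/2) + floor(3/4) = 3 + 1 + 0 = 4, so the bound is 12 - 4 = 8
example = availability_distance_bound(12, 4, 2, 2)
```

The terms of the sum vanish once $$r^i > k - 1$$, so increasing $$t$$ beyond that point does not tighten the bound further.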

If the code is systematic and locality and availability apply only to its information symbols, then the code has information locality $$r$$ and availability $$t$$, and is called an $$(r,t)_i$$-LRC.

Theorem The minimum distance $$d$$ of an $$[n,k,d]_q$$ linear $$(r,t)_i$$-LRC satisfies the upper bound

$$d \leq n-k-\left\lceil\frac{t(k-1)+1}{t(r-1)+1}\right\rceil+2$$.
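A short numerical sketch of this bound follows; the helper name `info_availability_distance_bound` and the sample parameters are hypothetical.

```python
def info_availability_distance_bound(n, k, r, t):
    """n - k - ceil((t(k - 1) + 1) / (t(r - 1) + 1)) + 2."""
    num = t * (k - 1) + 1
    den = t * (r - 1) + 1
    return n - k - (-(-num // den)) + 2  # -(-a // b) is the integer ceiling

# Example: n = 12, k = 4, r = 2, t = 2 gives ceil(7 / 3) = 3 and 12 - 4 - 3 + 2 = 7
example = info_availability_distance_bound(12, 4, 2, 2)
```

For $$t = 1$$ the fraction becomes $$k/r$$, so the expression coincides with $$n - k - \left\lceil k/r \right\rceil + 2$$.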