Poly1305

Poly1305 is a universal hash family designed by Daniel J. Bernstein for use in cryptography.

As with any universal hash family, Poly1305 can be used as a one-time message authentication code to authenticate a single message using a secret key shared between sender and recipient, similar to the way that a one-time pad can be used to conceal the content of a single message using a secret key shared between sender and recipient.

Originally Poly1305 was proposed as part of Poly1305-AES, a Carter–Wegman authenticator that combines the Poly1305 hash with AES-128 to authenticate many messages using a single short key and distinct message numbers. Poly1305 was later applied with a single-use key generated for each message using XSalsa20 in the NaCl crypto_secretbox_xsalsa20poly1305 authenticated cipher, and then using ChaCha in the ChaCha20-Poly1305 authenticated cipher deployed in TLS on the internet.

Definition of Poly1305
Poly1305 takes a 16-byte secret key $$r$$ and an $$L$$-byte message $$m$$ and returns a 16-byte hash $$\operatorname{Poly1305}_r(m)$$. To do this, Poly1305:


 * 1) Interprets $$r$$ as a little-endian 16-byte integer.
 * 2) Breaks the message $$m = (m[0], m[1], m[2], \dotsc, m[L - 1])$$ into consecutive 16-byte chunks.
 * 3) Interprets the 16-byte chunks as 17-byte little-endian integers by appending a 1 byte to every 16-byte chunk, to be used as coefficients of a polynomial.
 * 4) Evaluates the polynomial at the point $$r$$ modulo the prime $$2^{130} - 5$$.
 * 5) Reduces the result modulo $$2^{128}$$ encoded in little-endian return a 16-byte hash.

The coefficients $$c_i$$ of the polynomial $$c_1 r^q + c_2 r^{q - 1} + \cdots + c_q r$$, where $$q = \lceil L/16\rceil$$, are:

$$ c_i = m[16i - 16] + 2^8 m[16i - 15] + 2^{16} m[16i - 14] + \cdots + 2^{120} m[16i - 1] + 2^{128}, $$

with the exception that, if $$L \not\equiv 0 \pmod{16}$$, then:

$$ c_q = m[16q - 16] + 2^8 m[16q - 15] + \cdots + 2^{8(L \bmod 16) - 8} m[L - 1] + 2^{8 (L \bmod 16)}. $$

The secret key $$r = (r[0], r[1], r[2], \dotsc, r[15])$$ is restricted to have the bytes $$r[3], r[7], r[11], r[15] \in \{0,1,2,\dotsc,15\}$$, i.e., to have their top four bits clear; and to have the bytes $$r[4], r[8], r[12] \in \{0,4,8,\dotsc,252\}$$, i.e., to have their bottom two bits clear. Thus there are $$2^{106}$$ distinct possible values of $$r$$.

Use as a one-time authenticator
If $$s$$ is a secret 16-byte string interpreted as a little-endian integer, then

$$ a := \bigl(\operatorname{Poly1305}_r(m) + s\bigr) \bmod 2^{128} $$

is called the authenticator for the message $$m$$. If a sender and recipient share the 32-byte secret key $$(r, s)$$ in advance, chosen uniformly at random, then the sender can transmit an authenticated message $$(a, m)$$. When the recipient receives an alleged authenticated message $$(a', m')$$ (which may have been modified in transmit by an adversary), they can verify its authenticity by testing whether

$$ a' \mathrel{\stackrel?=} \bigl(\operatorname{Poly1305}_r(m') + s\bigr) \bmod 2^{128}. $$ Without knowledge of $$(r, s)$$, the adversary has probability $$8\lceil L/16\rceil/2^{106}$$ of finding any $$(a', m') \ne (a, m)$$ that will pass verification.

However, the same key $$(r, s)$$ must not be reused for two messages. If the adversary learns

$$ \begin{align} a_1 &= \bigl(\operatorname{Poly1305}_r(m_1) + s\bigr) \bmod 2^{128}, \\ a_2 &= \bigl(\operatorname{Poly1305}_r(m_2) + s\bigr) \bmod 2^{128}, \end{align} $$

for $$m_1 \ne m_2$$, they can subtract

$$ a_1 - a_2 \equiv \operatorname{Poly1305}_r(m_1) - \operatorname{Poly1305}_r(m_2) \pmod{2^{128}} $$

and find a root of the resulting polynomial to recover a small list of candidates for the secret evaluation point $$r$$, and from that the secret pad $$s$$. The adversary can then use this to forge additional messages with high probability.

Use in Poly1305-AES as a Carter–Wegman authenticator
The original Poly1305-AES proposal uses the Carter–Wegman structure to authenticate many messages by taking $$a_i := H_r(m_i) + p_i$$ to be the authenticator on the $i$th message $$m_i$$, where $$H_r$$ is a universal hash family and $$p_i$$ is an independent uniform random hash value that serves as a one-time pad to conceal it. Poly1305-AES uses AES-128 to generate $$p_i := \operatorname{AES}_k(i)$$, where $$i$$ is encoded as a 16-byte little-endian integer.

Specifically, a Poly1305-AES key is a 32-byte pair $$(r, k)$$ of a 16-byte evaluation point $$r$$, as above, and a 16-byte AES key $$k$$. The Poly1305-AES authenticator on a message $$m_i$$ is

$$ a_i := \bigl(\operatorname{Poly1305}_r(m_i) + \operatorname{AES}_k(i)\bigr) \bmod 2^{128}, $$

where 16-byte strings and integers are identified by little-endian encoding. Note that $$r$$ is reused between messages.

Without knowledge of $$(r, k)$$, the adversary has low probability of forging any authenticated messages that the recipient will accept as genuine. Suppose the adversary sees $$C$$ authenticated messages and attempts $$D$$ forgeries, and can distinguish $\operatorname{AES}_k$ from a uniform random permutation with advantage at most $$\delta$$. (Unless AES is broken, $$\delta$$ is very small.) The adversary's chance of success at a single forgery is at most:

$$ \delta + \frac{(1 - C/2^{128})^{-(C + 1)/2} \cdot 8 D \lceil L/16\rceil}{2^{106}}. $$

The message number $$i$$ must never be repeated with the same key $$(r, k)$$. If it is, the adversary can recover a small list of candidates for $$r$$ and $$\operatorname{AES}_k(i)$$, as with the one-time authenticator, and use that to forge messages.

Use in NaCl and ChaCha20-Poly1305
The NaCl crypto_secretbox_xsalsa20poly1305 authenticated cipher uses a message number $$i$$ with the XSalsa20 stream cipher to generate a per-message key stream, the first 32 bytes of which are taken as a one-time Poly1305 key $$(r_i, s_i)$$ and the rest of which is used for encrypting the message. It then uses Poly1305 as a one-time authenticator for the ciphertext of the message. ChaCha20-Poly1305 does the same but with ChaCha instead of XSalsa20.

Security
The security of Poly1305 and its derivatives against forgery follows from its bounded difference probability as a universal hash family: If $$m_1$$ and $$m_2$$ are messages of up to $$L$$ bytes each, and $$d$$ is any 16-byte string interpreted as a little-endian integer, then

$$ \Pr[\operatorname{Poly1305}_r(m_1) - \operatorname{Poly1305}_r(m_2) \equiv d \pmod{2^{128}}] \leq \frac{8\lceil L/16\rceil}{2^{106}}, $$

where $$r$$ is a uniform random Poly1305 key.

This property is sometimes called $$\epsilon$$-almost-Δ-universality over $$\mathbb Z/2^{128}\mathbb Z$$, or $$\epsilon$$-AΔU, where $$\epsilon = 8\lceil L/16\rceil/2^{106}$$ in this case.

Of one-time authenticator
With a one-time authenticator $$a = \bigl(\operatorname{Poly1305}_r(m) + s\bigr) \bmod 2^{128}$$, the adversary's success probability for any forgery attempt $$(a', m')$$ on a message $$m'$$ of up to $$L$$ bytes is:

$$ \begin{align} \Pr[&a' = \operatorname{Poly1305}_r(m') + s   \mathrel\mid a = \operatorname{Poly1305}_r(m) + s] \\ &= \Pr[a' = \operatorname{Poly1305}_r(m') + a - \operatorname{Poly1305}_r(m)] \\ &= \Pr[\operatorname{Poly1305}_r(m') - \operatorname{Poly1305}_r(m) = a' - a] \\ &\leq 8\lceil L/16\rceil/2^{106}. \end{align} $$

Here arithmetic inside the $$\Pr[\cdots]$$ is taken to be in $$\mathbb Z/2^{128}\mathbb Z$$ for simplicity.

Of NaCl and ChaCha20-Poly1305
For NaCl crypto_secretbox_xsalsa20poly1305 and ChaCha20-Poly1305, the adversary's success probability at forgery is the same for each message independently as for a one-time authenticator, plus the adversary's distinguishing advantage $$\delta$$ against XSalsa20 or ChaCha as pseudorandom functions used to generate the per-message key. In other words, the probability that the adversary succeeds at a single forgery after $$D$$ attempts of messages up to $$L$$ bytes is at most:

$$ \delta + \frac{8D \lceil L/16\rceil}{2^{106}}. $$

Of Poly1305-AES
The security of Poly1305-AES against forgery follows from the Carter–Wegman–Shoup structure, which instantiates a Carter–Wegman authenticator with a permutation to generate the per-message pad. If an adversary sees $$C$$ authenticated messages and attempts $$D$$ forgeries of messages of up to $$L$$ bytes, and if the adversary has distinguishing advantage at most $$\delta$$ against AES-128 as a pseudorandom permutation, then the probability the adversary succeeds at any one of the $$D$$ forgeries is at most:

$$ \delta + \frac{(1 - C/2^{128})^{-(C + 1)/2} \cdot 8 D \lceil L/16\rceil}{2^{106}}. $$

"For instance, assuming that messages are packets up to 1024 bytes; that the attacker sees 264 messages authenticated under a Poly1305-AES key; that the attacker attempts a whopping 275 forgeries; and that the attacker cannot break AES with probability above δ; then, with probability at least 0.999999 − δ, all the 275 are rejected"

- Bernstein, Daniel J. (2005)

Speed
Poly1305-AES can be computed at high speed in various CPUs: for an n-byte message, no more than 3.1n + 780 Athlon cycles are needed, for example. The author has released optimized source code for Athlon, Pentium Pro/II/III/M, PowerPC, and UltraSPARC, in addition to non-optimized reference implementations in C and C++ as public domain software.

Implementations
Below is a list of cryptography libraries that support Poly1305:


 * Botan
 * Bouncy Castle
 * Crypto++
 * Libgcrypt
 * libsodium
 * Nettle
 * OpenSSL
 * LibreSSL
 * wolfCrypt
 * GnuTLS
 * mbed TLS
 * MatrixSSL