Accumulator (cryptography)

In cryptography, an accumulator is a one-way membership hash function. It allows users to certify that a candidate value is a member of a certain set without revealing the individual members of the set. The concept was formally introduced by Josh Benaloh and Michael de Mare in 1993.

Formal definitions
There are several formal definitions which have been proposed in the literature. This section lists them by proposer, in roughly chronological order.

Benaloh and de Mare (1993)
Benaloh and de Mare define a one-way hash function as a family of functions $$h_{\ell}: X_{\ell}\times Y_{\ell}\to Z_{\ell}$$ which satisfy the following three properties:


 * 1) For all $$\ell\in\mathbb{Z}, x\in X_{\ell}, y\in Y_{\ell}$$, one can compute $$h_{\ell}(x,y)$$ in time $$\text{poly}(\ell, |x|, |y|)$$. (Here the "poly" symbol refers to an unspecified, but fixed, polynomial.)
 * 2) No probabilistic polynomial-time algorithm can, for sufficiently large $$\ell$$ and given the inputs $$\ell\in\mathbb{Z}$$, $$(x, y)\in X_{\ell}\times Y_{\ell}$$, and $$y'\in Y_{\ell}$$, find a value $$x'\in X_{\ell}$$ such that $$h_{\ell}(x', y') = h_{\ell}(x, y)$$ with more than negligible probability.
 * 3) For all $$\ell\in\mathbb{Z}, x\in X_{\ell}, y_1, y_2\in Y_{\ell}$$, one has $$h_{\ell}(h_{\ell}(x, y_1), y_2) = h_{\ell}(h_{\ell}(x, y_2), y_1)$$. (A function that satisfies this property is called quasi-commutative.)

(With the first two properties, one recovers the normal definition of a cryptographic hash function.)

From such a function, one defines the "accumulated hash" $$z$$ of a set $$\{y_1,\dots, y_m \}$$ with respect to a starting value $$x$$ to be $$z = h(h(\cdots h(h(x, y_1), y_2),\dots, y_{m-1}), y_m)$$. The result does not depend on the order of the elements $$y_1, y_2, \dots, y_m$$, because $$h$$ is quasi-commutative.

If $$y_1, y_2, \dots, y_m$$ belong to the users of a cryptosystem, then everyone can compute the accumulated value $$z$$. The user holding $$y_i$$ can also compute the partial accumulated value $$z_i$$ of $$(y_1, \dots, y_{i-1}, y_{i+1}, \dots, y_m)$$, for which $$h(z_i, y_i) = z$$. The $$i$$-th user can therefore provide the pair $$(z_i, y_i)$$ to any other party in order to authenticate $$y_i$$.
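As a concrete illustration, the accumulation and authentication steps above can be sketched with the standard RSA-style quasi-commutative function $$h(x, y) = x^y \bmod n$$. The modulus and member values below are illustrative toys, far too small for any real security:

```python
# Toy Benaloh–de Mare accumulation with h(x, y) = x^y mod n.
# All parameters are hypothetical toy values, not secure.

n = 3233            # toy composite modulus (61 * 53); real schemes use huge moduli
x = 5               # public starting value
ys = [3, 7, 11]     # the members' values

def h(x, y, n=n):
    """Quasi-commutative hash: h(x, y) = x^y mod n."""
    return pow(x, y, n)

def accumulate(x, ys):
    z = x
    for y in ys:
        z = h(z, y)
    return z

z = accumulate(x, ys)

# Quasi-commutativity: the order of accumulation does not matter.
assert z == accumulate(x, [11, 3, 7])

# User i computes the partial value z_i over all values except y_i ...
i = 1
z_i = accumulate(x, ys[:i] + ys[i+1:])
# ... and anyone can then check h(z_i, y_i) == z to authenticate y_i.
assert h(z_i, ys[i]) == z
```

Note that the pair $$(z_i, y_i)$$ is all a verifier needs: one further application of $$h$$ reproduces the public accumulated value $$z$$.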

Barić and Pfitzmann (1997)
The basic functionality of an accumulator (certifying and verifying membership) is not immediate from the definition of a quasi-commutative hash function. To make it explicit, Barić and Pfitzmann gave a slightly more general definition, the notion of an accumulator scheme, consisting of the following components:


 * 1) Gen: a probabilistic algorithm that takes in two parameters $$\lambda, N$$ (the security parameter and the number of values that can be securely accumulated, respectively), and returns an appropriate key $$k$$.
 * 2) Eval: a probabilistic algorithm that takes in a key $$k$$ and an accumulation set $$Y:=\{y_1, \dots, y_{N'}\}$$, where $$N'\leq N$$, and returns an accumulated value $$z$$ and auxiliary information $$aux$$. We insist that the accumulated value $$z$$ be determined solely by $$k$$ and $$Y$$; that is, Eval is deterministic with respect to $$z$$.
 * 3) Wit: a probabilistic algorithm that takes in a key $$k$$, a value $$y$$, an accumulated value $$z$$ of some set $$Y$$, and some auxiliary information $$aux$$, and returns either a witness $$w$$ or the special symbol $$\bot$$. We insist that Wit returns a witness if $$y\in Y$$, and returns $$\bot$$ otherwise.
 * 4) Ver: a deterministic algorithm that takes in a key $$k$$, a value $$y$$, a witness $$w$$, and an accumulated value $$z$$, and returns a Yes/No value. We insist that Ver always returns Yes whenever $$w$$ was generated by running Wit on a tuple $$(k, y, z, aux)$$, where $$z$$ and $$aux$$ were generated by running Eval on an arbitrarily chosen set $$Y$$ under a key $$k$$ output by Gen.

It is relatively easy to see that one can define an accumulator scheme from any quasi-commutative hash function, using the technique shown above.
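A minimal sketch of that construction, wrapping a quasi-commutative function (here again $$h(x,y) = x^y \bmod n$$) in the Gen/Eval/Wit/Ver interface. The toy primes and key generation below are illustrative assumptions, not a secure implementation:

```python
# Sketch of the Barić–Pfitzmann accumulator-scheme interface built from
# a quasi-commutative hash. Toy parameters only; not secure.
import random

def gen(lam=16, N=None):
    # Hypothetical key generation: fixed toy primes stand in for
    # randomly generated safe primes of lam bits each.
    p, q = 61, 53
    n = p * q
    x = random.randrange(2, n)     # public starting value
    return (n, x)

def h(x, y, n):
    return pow(x, y, n)

def eval_acc(k, Y):
    n, x = k
    z = x
    for y in sorted(Y):            # iterating in sorted order keeps z deterministic
        z = h(z, y, n)
    return z, {"set": frozenset(Y)}   # aux records the accumulated set

def wit(k, y, z, aux):
    if y not in aux["set"]:
        return None                # stands in for the special symbol ⊥
    n, x = k
    w = x
    for other in sorted(aux["set"] - {y}):
        w = h(w, other, n)
    return w

def ver(k, y, w, z):
    n, _ = k
    return h(w, y, n) == z

k = gen()
z, aux = eval_acc(k, {3, 7, 11})
w = wit(k, 7, z, aux)
assert ver(k, 7, w, z)
```

Here Wit simply recomputes the partial accumulated value, so Ver's check is exactly the identity $$h(z_i, y_i) = z$$ from the Benaloh–de Mare construction.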

Camenisch and Lysyanskaya (2002)
One observes that, for many applications, the set of accumulated values will change many times. Naïvely, one could completely redo the accumulator calculation every time; however, this may be inefficient, especially if the set is very large and the change is very small. To formalize this intuition, Camenisch and Lysyanskaya defined a dynamic accumulator scheme to consist of the four components of an ordinary accumulator scheme, plus three more:


 * 1) Add: a (possibly probabilistic) algorithm that takes in a key $$k$$, an accumulated value $$z$$, and another value to accumulate $$y$$, and returns a new accumulated value $$z'$$ and auxiliary information $$aux$$. We insist that if $$z$$ was generated by accumulating some set $$L$$, then $$z'$$ must be as if it were generated by accumulating the set $$L\cup\{y\}$$.
 * 2) Del: a (possibly probabilistic) algorithm that takes in a key $$k$$, an accumulated value $$z$$, and a previously accumulated value $$y$$ to remove, and returns a new accumulated value $$z'$$ and auxiliary information $$aux$$. We insist that if $$z$$ was generated by accumulating some set $$L$$, then $$z'$$ must be as if it were generated by accumulating the set $$L\backslash\{y\}$$.
 * 3) Upd: a deterministic algorithm that takes in the key $$k$$, a value $$y$$, a witness $$w$$, the accumulated value $$z$$, and auxiliary information $$aux$$, and returns a new witness $$w'$$. We insist that if $$k$$ was generated by Gen, $$y$$ is part of a set $$L$$, $$w$$ is a witness for $$y$$ being a member of $$L$$, and $$z$$ is an accumulated value for $$L$$, and $$aux$$ was generated by running Add or Del, then $$w'$$ will be a witness for $$y$$ being a member of the new set.

Fazio and Nicolosi note that since Add, Del, and Upd can be simulated by rerunning Eval and Wit, this definition does not add any fundamentally new functionality.
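The efficiency gain, however, can be substantial. For the RSA-style accumulator $$z = x^{y_1\cdots y_m} \bmod n$$, for instance, Add and the corresponding witness update are each a single modular exponentiation. A toy sketch with illustrative parameters (Del is omitted here, since removing a value from this accumulator requires the trapdoor $$\phi(n)$$):

```python
# Sketch of Add and the addition case of Upd for the RSA-style
# accumulator. Toy modulus; not secure.

n = 3233                     # toy modulus (61 * 53)
x = 5                        # public starting value

def h(a, y):
    return pow(a, y, n)

def add(z, y_new):
    """Add y_new: the new accumulated value is z^y_new mod n."""
    return h(z, y_new)

def upd(w, y_new):
    """Update an existing witness after y_new was added: raising the
    witness to y_new preserves the relation h(w', y) == z'."""
    return h(w, y_new)

# Accumulate {3, 7}; the witness for 7 is x^3.
z = h(h(x, 3), 7)
w7 = h(x, 3)
assert h(w7, 7) == z

# Dynamically add 11 and update the witness for 7 without rerunning Eval.
z2 = add(z, 11)
w7_new = upd(w7, 11)
assert h(w7_new, 7) == z2
```

Both operations cost one exponentiation regardless of the set size, versus the $$m$$ exponentiations of a full rerun of Eval and Wit.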

Examples
One example is multiplication over large prime numbers. This is a cryptographic accumulator, since it takes superpolynomial time to factor a composite number (at least according to conjecture), but only polynomial time to divide an integer by a prime to check whether it is one of the factors and, if so, to factor it out. New members may be added to or removed from the set of factors by multiplying the prime in or factoring it out, respectively. In this system, however, a prime shared by two accumulators can be trivially discovered by computing their GCD, even without prior knowledge of the prime (which would otherwise require factoring the accumulator to discover).
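A toy numerical sketch of this multiplication-based accumulator and the shared-prime leak (all numbers are deliberately tiny illustrations):

```python
# Toy prime-multiplication accumulator and the GCD leak described above.
from math import gcd

# Two accumulators built by multiplying each set's primes together.
acc_a = 3 * 7 * 13       # accumulates {3, 7, 13}
acc_b = 5 * 7 * 19       # accumulates {5, 7, 19}

# Membership test and removal are just trial division by the candidate prime.
assert acc_a % 7 == 0            # 7 is a member of acc_a
assert (acc_a // 7) % 7 != 0     # factoring 7 out removes it

# The shared prime 7 is revealed by the GCD, with no factoring needed.
assert gcd(acc_a, acc_b) == 7
```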

More practical accumulators use a quasi-commutative hash function, so that the size of the accumulator does not grow with the number of members. For example, Benaloh and de Mare propose a cryptographic accumulator inspired by RSA: the quasi-commutative function $$h(x, y) := x^y\pmod{n}$$ for some composite number $$n$$. They recommend choosing $$n$$ to be a rigid integer (i.e. the product of two safe primes). Barić and Pfitzmann proposed a variant in which $$y$$ is restricted to be prime and at most $$n/4$$ (this constant is very close to $$\phi(n)$$, but does not leak information about the prime factorization of $$n$$).

David Naccache observed in 1993 that $$e_{n, c}(x, y) := x^y c^{y - 1} \pmod{n}$$ is quasi-commutative for all constants $$c, n$$, generalizing the previous RSA-inspired cryptographic accumulator. Naccache also noted that the Dickson polynomials are quasi-commutative in the degree, but it is unknown whether this family of functions is one-way.
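The quasi-commutativity of Naccache's function is easy to check numerically; the constants $$c$$ and $$n$$ below are arbitrary toy choices:

```python
# Numerical check that e(x, y) = x^y * c^(y-1) mod n is
# quasi-commutative. c and n are arbitrary toy constants.

n, c = 1000003, 17

def e(x, y):
    return (pow(x, y, n) * pow(c, y - 1, n)) % n

x, y1, y2 = 123, 5, 9
assert e(e(x, y1), y2) == e(e(x, y2), y1)
```

Algebraically, both orders evaluate to $$x^{y_1 y_2} c^{y_1 y_2 - 1} \pmod{n}$$, which is why the identity holds for any choice of $$c$$ and $$n$$.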

In 1996, Nyberg constructed an accumulator which is provably information-theoretically secure in the random oracle model. Choose an upper limit $$N = 2^d$$ on the number of items that can be securely accumulated and a security parameter $$\lambda$$; define the constant $$\ell \approx \frac{e}{\log_2(e)}\lambda N\log_2(N)$$ to be an integer multiple of $$d$$ (so that one can write $$\ell = rd$$) and let $$H:\{0, 1\}^* \to \{0, 1\}^{\ell}$$ be some cryptographically secure hash function. Choose a key $$k$$ as a random $$r$$-bit bitstring. Then, to accumulate using Nyberg's scheme, use the quasi-commutative hash function $$h(x, y):= x\odot\alpha_r(H(y))$$, where $$\odot$$ is the bitwise AND operation and $$\alpha_r:\{0, 1\}^{\ell}\to \{0,1\}^r$$ is the function that interprets its input as a sequence of $$r$$ bitstrings of length $$d$$, replaces every all-zero bitstring with a single 0 and every other bitstring with a 1, and outputs the resulting $$r$$-bit string.
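A toy sketch of Nyberg's construction, with deliberately small parameters ($$r = 32$$, $$d = 4$$) and an all-ones start value in place of a random key, purely for readability:

```python
# Toy sketch of Nyberg's bitwise-AND accumulator. Parameters are far
# smaller than the formula for ℓ would prescribe; not secure.
import hashlib

d, r = 4, 32
ell = r * d                       # hash output length in bits

def H(y: bytes) -> int:
    """Cryptographic hash truncated to ell bits (SHA-256 here)."""
    digest = hashlib.sha256(y).digest()
    return int.from_bytes(digest, "big") >> (256 - ell)

def alpha(v: int) -> int:
    """Map r blocks of d bits to r bits: all-zero block -> 0, else -> 1."""
    out = 0
    for i in range(r):
        block = (v >> (i * d)) & ((1 << d) - 1)
        out |= (1 if block != 0 else 0) << i
    return out

def h(x: int, y: bytes) -> int:
    """Nyberg's quasi-commutative hash: bitwise AND with alpha(H(y))."""
    return x & alpha(H(y))

# Accumulate a small set, starting (for illustration) from all ones.
k = (1 << r) - 1
members = [b"alice", b"bob", b"carol"]
z = k
for y in members:
    z = h(z, y)

# Membership test: every zero bit of alpha(H(y)) must also be zero in z,
# i.e. the 1-bits of z are a subset of the 1-bits of alpha(H(y)).
def maybe_member(z: int, y: bytes) -> bool:
    return z & alpha(H(y)) == z

assert all(maybe_member(z, y) for y in members)
```

Since bitwise AND is commutative and associative, $$h$$ is trivially quasi-commutative; like a Bloom filter, the membership test has one-sided error, which is what the size formula for $$\ell$$ controls.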

Applications
Haber and Stornetta showed in 1990 that accumulators can be used to timestamp documents through cryptographic chaining. (This concept anticipates the modern notion of a cryptographic blockchain.) Benaloh and de Mare proposed an alternative scheme in 1991 based on discretizing time into rounds.

Benaloh and de Mare showed that accumulators can be used so that a large group of people can recognize each other at a later time (which Fazio and Nicolosi call an "ID Escrow" situation). Each person selects a value $$y$$ representing their identity, and the group collectively selects a public accumulator $$h$$ and a secret starting value $$x$$. The group then publishes or saves the hash function and the accumulated hash of all the group's identities w.r.t. the secret $$x$$ and public accumulator; simultaneously, each member keeps both their identity value $$y$$ and the accumulated hash of all the group's identities except their own. (If the members do not trust each other, or if the accumulator has a cryptographic trapdoor, as in the case of the RSA-inspired accumulator, then they can compute the accumulated hashes by secure multiparty computation.) To verify later that a claimed member did indeed belong to the group, they present their identity and personal accumulated hash (or a zero-knowledge proof thereof); by accumulating the claimed identity into the personal hash and checking the result against the accumulated hash of the entire group, anyone can verify membership. With a dynamic accumulator scheme, it is additionally easy to add or remove members afterward.

Cryptographic accumulators can also be used to construct other cryptographically secure data structures:


 * Barić and Pfitzmann show that one can construct fail-stop signatures with only constant space by exploiting the compression property.
 * Goodrich et al. constructed a size-oblivious, efficient, dynamic authenticated dictionary (which allows untrusted directories to give cryptographically verifiable answers to membership queries).
 * Papamanthou et al. constructed a cryptographically secure hash table, whose functionality can be authenticated when stored remotely.

The concept has received renewed interest due to the Zerocoin add-on to bitcoin, which employs cryptographic accumulators to eliminate trackable linkage in the bitcoin blockchain, making transactions anonymous and more private. More concretely, to mint (create) a Zerocoin, one publishes a coin and a cryptographic commitment to a serial number with a secret random value (which all users will accept as long as it is correctly formatted); to spend (reclaim) a Zerocoin, one publishes the Zerocoin's serial number along with a non-interactive zero-knowledge proof that one knows of some published commitment relating to the claimed serial number, then claims the coin (which all users will accept as long as the NIZKP is valid and the serial number has not appeared before). Since its initial proposal, Zerocoin has been succeeded by the Zerocash protocol and is currently being developed into Zcash, a fully-fledged digital currency.