High-multiplicity bin packing

High-multiplicity bin packing is a special case of the bin packing problem, in which the number of different item-sizes is small, while the number of items with each size is large. While the general bin-packing problem is NP-hard, the high-multiplicity setting can be solved in polynomial time, assuming that the number of different sizes is a fixed constant.

Problem definition
The inputs to the problem are positive integers:


 * d - the number of different sizes (also called the dimension of the problem);
 * B - the bin capacity;
 * s1, ..., sd - the sizes; the vector of sizes is denoted by s;
 * n1, ..., nd - the multiplicities, where ni is the number of items with size si; the vector of multiplicities is denoted by n;
 * n denotes the total number of items, that is, n = n1+...+nd;
 * V denotes the largest integer appearing in the description of the problem, that is, V = max(s1, ..., sd, n1, ..., nd, B).

The output is a packing - an assignment of the items to bins, such that the total size of items in each bin is at most B, and subject to this, the number of bins is as small as possible.

Example: suppose d=2, s1=30, s2=40, n1=n2=5, B=120. So there are n=10 items with sizes: 30,30,30,30,30,40,40,40,40,40. Then, a possible packing is: {30,30,30,30}, {40,40,40}, {30,40,40}, which uses 3 bins.
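The example can be checked mechanically. The following minimal Python sketch (the helper name is chosen here for illustration) verifies that a proposed packing respects the bin capacity and uses exactly the given multiplicities:

```python
from collections import Counter

def is_feasible(packing, sizes, multiplicities, B):
    """A packing is feasible if each bin's total size is at most B and the
    packing uses exactly ni items of each size si."""
    used = Counter()
    for bin_items in packing:
        if sum(bin_items) > B:
            return False
        used.update(bin_items)
    return all(used[s] == m for s, m in zip(sizes, multiplicities))

# The example instance: sizes (30, 40), five items of each, B = 120.
packing = [[30, 30, 30, 30], [40, 40, 40], [30, 40, 40]]
print(is_feasible(packing, [30, 40], [5, 5], 120), len(packing))  # → True 3
```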

Configurations
A configuration is a set of items that can fit into a single bin. It can be represented by a vector of d integers, denoting the multiplicities of the different sizes in the configuration. Formally, for each configuration c we define an integer vector ac = (ac,1, ..., ac,d) such that ac ≤ n (componentwise) and ac·s ≤ B.

In the above example, one of the configurations is c={30,40,40}, since 1*30+2*40 ≤ 120. Its corresponding vector is ac=(1,2). The other configuration vectors are (4,0), (3,0), (2,0), (2,1), (1,0), (1,1), (0,1), (0,2), (0,3). If we had only three items of size 30, then we could not use the (4,0) configuration.
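The configuration vectors can be enumerated mechanically. The following Python sketch (the function name is illustrative) lists every non-empty integer vector a with a ≤ n componentwise and a·s ≤ B:

```python
from itertools import product

def configurations(sizes, multiplicities, B):
    """All non-empty integer vectors a with a <= multiplicities
    (componentwise) and a . sizes <= B."""
    ranges = [range(m + 1) for m in multiplicities]
    return [a for a in product(*ranges)
            if any(a) and sum(ai * si for ai, si in zip(a, sizes)) <= B]

# The example: sizes (30, 40), up to 5 items of each, B = 120.
configs = configurations([30, 40], [5, 5], 120)
print(sorted(configs, reverse=True))  # 10 configurations in total
```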

It is possible to present the problem using the configuration linear program: for each configuration c, there is a variable xc, denoting the number of bins in which c is used. The total number of bins used is simply the sum of xc over all configurations, denoted by 1·x. The total number of items used from each size is the sum of the vectors ac · xc over all configurations c. Then, the problem is to minimize 1·x such that the sum of ac · xc, over all configurations c, is at least n, so that all items are packed.
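In symbols (using the notation above; the constraint is a componentwise vector inequality), the configuration program reads:

$$\min_{x}\ \sum_{c} x_c \quad \text{subject to} \quad \sum_{c} a_c\, x_c \ge n, \qquad x_c \in \{0,1,2,\ldots\} \text{ for every configuration } c.$$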

Basic algorithms
Suppose first that all items are large, that is, every si is at least e·B for some fraction e>0. Then, the number of items in each bin is at most 1/e, so the total number of configurations is at most $$d^{1/e}$$. Each configuration can be used between 0 and n times. Therefore, there are at most $$n^{d^{1/e}}$$ combinations to check. For each combination, we have to check d constraints (one for each size), so the run-time is $$d\cdot n^{d^{1/e}}$$, which is polynomial in n when d and e are constants.
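This exhaustive idea can be sketched as a memoized search over configurations (exponential in general, but polynomial in n when the number of configurations is a constant, matching the bound above; all names below are chosen for illustration):

```python
from functools import lru_cache
from itertools import product

def min_bins(sizes, multiplicities, B):
    """Exact solver: try every feasible configuration for the next bin
    and recurse on the remaining multiplicity vector."""
    configs = [a for a in product(*[range(m + 1) for m in multiplicities])
               if any(a) and sum(x * s for x, s in zip(a, sizes)) <= B]

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not any(remaining):
            return 0  # nothing left to pack
        best = float('inf')
        for a in configs:
            if all(x <= r for x, r in zip(a, remaining)):
                rest = tuple(r - x for r, x in zip(remaining, a))
                best = min(best, 1 + solve(rest))
        return best

    return solve(tuple(multiplicities))

print(min_bins([30, 40], [5, 5], 120))  # the example instance → 3
```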

The main problem with this algorithm (besides the fact that it works only when the items are large) is that its runtime is polynomial in n, whereas the length of the input (in binary representation) is linear in log(V). Since n can be of the order of V, the runtime can be exponential in the input size.

Run-time polynomial in the input size
Filippi and Agnetis presented an algorithm that finds a solution with at most OPT+d-2 bins in time O(poly(log V)). In particular, for d=2 different sizes, their algorithm finds an optimal solution in time O(log V).

Goemans and Rothvoss presented an algorithm for any fixed d, that finds the optimal solution when all numbers are given in binary encoding. Their algorithm solves the following problem: given two d-dimensional polytopes P and Q, find the minimum number of integer points in P whose sum lies in Q. Their algorithm runs in time $$(\log V)^{2^{O(d)}}$$. Their algorithm can be adapted to other problems, such as Identical-machines scheduling and unrelated-machines scheduling with various constraints.

Rounding a general instance to a high-multiplicity instance
Several approximation algorithms for the general bin-packing problem use the following scheme:


 * Separate the items to "small" (smaller than eB, for some fraction e in (0,1)) and "large" (at least eB).
 * Handle the large items first:
    * Round the item sizes in some way, such that the number of different sizes is at most some constant d.
    * Solve the resulting high-multiplicity problem.
 * Allocate the small items greedily, e.g. with next-fit bin packing. If no new bins are created, then we are done. If new bins are created, this means that all bins (except maybe the last one) are full up to at least (1-e)B. Therefore, the number of bins is at most OPT/(1-e)+1 ≤ (1+2e)OPT+1.
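The last step of the scheme can be sketched as follows (a hypothetical helper, written here for illustration; because it sweeps forward over the bins and never returns to an earlier one, every bin it abandons has load greater than B minus the size of a small item, i.e. greater than (1-e)B):

```python
def add_small_items_next_fit(bins, small_items, B):
    """Place each small item into the current bin; when it does not fit,
    move forward (never back), opening a new bin after the last one."""
    bins = [list(b) for b in bins] or [[]]
    current = 0
    for item in small_items:
        while sum(bins[current]) + item > B:
            current += 1
            if current == len(bins):
                bins.append([])  # open a fresh bin at the end
        bins[current].append(item)
    return bins

# Large items already packed for the example instance (B = 120),
# followed by some hypothetical small items.
large_bins = [[30, 30, 30, 30], [40, 40, 40], [30, 40, 40]]
result = add_small_items_next_fit(large_bins, [10, 10, 10, 5], 120)
print(len(result), [sum(b) for b in result])  # → 4 [120, 120, 120, 25]
```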

The algorithms differ in how they round the instance.

Linear rounding
de la Vega and Lueker invented the idea of adaptive input rounding. Order the items by their size, and group them into 1/e^2 groups, each of cardinality n·e^2. In each group, round the sizes upwards to the maximum size in the group. Now, there are only d = 1/e^2 different sizes. The solution of the rounded instance is feasible for the original instance too, but the number of bins may be larger than necessary. To quantify the loss, consider the instance rounded down to the maximum size in the previous group (the first group is rounded down to 0). The rounded-down instance D is almost equal to the rounded-up instance U, except that D contains some n·e^2 zeros where U contains n·e^2 large items instead; each of these items has size at most B, so packing each of them in its own bin shows that U requires at most n·e^2 more bins than D. Since D requires at most as many bins as the optimum (rounding down only makes the items easier to pack), we get that Bins(U) ≤ OPT + n·e^2, that is, we have an additive error that can be made as small as we like by choosing e.

If all items are large (of size at least e·B), then each bin in OPT contains at most 1/e items, so OPT must be at least e·n. Hence n·e^2 ≤ e·OPT, and Bins(U) ≤ (1+e)OPT. After handling the small items, we get at most $$(1+2e)\mathrm{OPT}+1$$ bins.
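The grouping-and-rounding step can be sketched as follows (a minimal illustration; the function name is chosen here, and a ceiling handles the case where n·e^2 is not an integer):

```python
import math

def linear_round_up(items, eps):
    """Adaptive input rounding in the style of de la Vega and Lueker:
    sort the items, cut them into groups of ceil(n * eps^2) consecutive
    items, and round every size up to the maximum of its group."""
    items = sorted(items, reverse=True)
    k = math.ceil(len(items) * eps ** 2)  # group cardinality
    rounded = []
    for i in range(0, len(items), k):
        group = items[i:i + k]
        rounded += [group[0]] * len(group)  # group max = first (sorted desc)
    return rounded

# Nine hypothetical large sizes, eps = 0.5, hence groups of ceil(9/4) = 3.
print(linear_round_up([31, 40, 53, 32, 41, 54, 33, 42, 55], 0.5))
# → [55, 55, 55, 42, 42, 42, 33, 33, 33]
```

After rounding, only three distinct sizes remain (at most 1/e^2 = 4 in general for this e), so the instance can be handed to a high-multiplicity solver.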

Geometric rounding
Karmarkar and Karp presented a more efficient rounding method, which they call geometric rounding (in contrast to the linear rounding of de la Vega and Lueker). Based on these innovations, they presented an algorithm with run-time polynomial in $$n$$ and $$1/\varepsilon$$. Their algorithm finds a solution with size at most $$\mathrm{OPT} + \mathcal{O}(\log^2(\mathrm{OPT}))$$.

Improvements
This technique was later improved by several authors:


 * Rothvoss presented an algorithm that generates a solution with size at most $$\mathrm{OPT} + O(\log(\mathrm{OPT})\cdot \log\log(\mathrm{OPT}))$$.


 * Hoberg and Rothvoss improved this algorithm to generate a solution with size at most $$\mathrm{OPT} + O(\log(\mathrm{OPT}))$$. The algorithm is randomized, and its running-time is polynomial in the total number of items.