Ski rental problem

In computer science, the ski rental problem is a name given to a class of problems in which there is a choice between continuing to pay a repeating cost or paying a one-time cost which eliminates or reduces the repeating cost.

The problem
Many online problems have a sub-problem called the rent-or-buy problem. Given an expensive up front cost, or a less expensive repeating cost, with no knowledge of how the future will play out, at what point is it better to pay the up front cost to avoid a continued repeating cost?

Consider a person who decides to go skiing, but for an undecided number of days. Renting skis costs $1 per day, whereas buying a pair of skis costs $10. If the person knows in advance how many days they want to ski, then the breakeven point is 10 days. Fewer than 10 days, renting is preferable, whereas with more than 10 days, buying is preferable. However, with no advance knowledge of how long one will be skiing, the breakeven point is unclear. A good algorithm will minimize the ratio of the cost when the number of days is known in advance to the cost when the number of days is not known in advance. Ski rental is one example of this class of problem.

The break-even algorithm
The break-even algorithm instructs one to rent for 9 days and buy skis on the morning of day 10 if one is still up for skiing. If one has to stop skiing during the first 9 days, it costs the same as what one would pay if one had known the number of days one would go skiing. If one has to stop skiing after day 10, one's cost is $19 which is 90% more than what one would pay if one had known the number of days one would go skiing in advance. This is the worst case for the break-even algorithm.

The break-even algorithm is known to be the best deterministic algorithm for this problem.

The randomized algorithm
A person can flip a coin. If it comes up heads, she buys skis on day eight; otherwise, she buys skis on day 10. This is an instance of a randomized algorithm. The expected cost is at most 80% more than what the person would pay if she had known the number of days she would go skiing, regardless of how many days she skis. In particular, if the person skis for 10 days, her expected cost is 1/2 [7 +10] + 1/2 [9+10] = 18 dollars, only 80% excess instead of 90%.

A randomized algorithm can be understood as a composition of different algorithms, each one which occurs with a given probability. We define the expected competitive ratio on a given instance i as:

$$ E_i = \sum_{j} P(ALG_j) \cdot ALG_j(i) $$, where $$ ALG_j(i) $$ is the competitive ratio for instance i, given $$ ALG_j $$.

Consequently, the competitive ratio of a randomized algorithm is given by the worst value of $$ E_i $$ over all given instances. In the case of the coin flipping ski-rental, we note that the randomized algorithm has 2 possible branches: If the coin comes up heads, we buy on day 8, otherwise we buy on day 10. We may call the branches $$ ALG_{heads} $$ and $$ ALG_{tails} $$, respectively. $$ E_i = P(ALG_{heads}) \cdot ALG_{heads}(i) + P(ALG_{tails}) \cdot ALG_{tails}(i) =\frac12 \cdot1 + \frac12 \cdot1 = 1 $$, for $$ i < 8$$.

$$ E_8 = P(ALG_{heads}) \cdot ALG_{heads}(8) + P(ALG_{tails}) \cdot ALG_{tails}(8) = \frac12 \cdot \frac{17}{8} + \frac12 \cdot 1 = 1.5625 $$,

$$ E_9 = P(ALG_{heads}) \cdot ALG_{heads}(9) + P(ALG_{tails}) \cdot ALG_{tails}(9) = \frac12 \cdot \frac{17}{9} + \frac12 \cdot 1 = 1.444444$$, and

$$ E_i = P(ALG_{heads}) \cdot ALG_{heads}(i) + P(ALG_{tails}) \cdot ALG_{tails}(i) = \frac12 \cdot \frac{17}{10} + \frac12 \cdot \frac{19}{10} = 1.8 $$, for $$ i \geq 10 $$.

Therefore, the competitive ratio of the randomized ski-rental coin flipping algorithm is 1.8.

The best randomized algorithm against an oblivious adversary is to choose some day i at random according to the following distribution p, rent for i-1 days and buy skis on the morning of day i if one are still up for skiing. Karlin et al. first presented this algorithm with distribution $$ p_i = \left \{ \begin{array}{ll} (\frac{b-1}{b})^{b-i} \frac{1}{b(1-(1-(1/b))^b)} & i \leq b \\ 0 & i > b \end{array} \right. , $$ where buying skis costs $$$b$$ and renting costs $1. Its expected cost is at most e/(e-1) $$\approx$$ 1.58 times what one would pay if one had known the number of days one would go skiing. No randomized algorithm can do better.

Applications

 * Snoopy caching: several caches share the same memory space that is partitioned into blocks. When a cache writes to a block, caches that share the block spend 1 bus cycle to get updated. These caches can invalidate the block to avoid the cost of updating. But there is a penalty of p bus cycles for invalidating a block from a cache that shortly thereafter needs access to it. We can break the write request sequences for several caches into request sequences for two caches. One cache performs a sequence of write operations to the block. The other cache needs to decide whether to get updated by paying 1 bus cycle per operation or invalidate the block by paying p bus cycles for future read request of itself. The two cache, one block snoopy caching problem is just the ski rental problem.


 * TCP acknowledgment: A stream of packets arrive at a destination and are required by the TCP protocol to be acknowledged upon arrival. However, we can use a single acknowledgment packet to simultaneously acknowledge multiple outstanding packets, thereby reducing the overhead of the acknowledgments. On the other hand, delaying acknowledgments too much can interfere with the TCP's congestion control mechanisms, and thus we should not allow the latency between a packet's arrival time and the time at which the acknowledgment is sent to increase too much. Karlin et al. described a one-parameter family of inputs, called the basis inputs, and showed that when restricted to these basis inputs, the TCP acknowledgement problem behaves the same as the ski rental problem.


 * Total completion time scheduling: We wish to schedule jobs with fixed processing times on m identical machines. The processing time of job j is pj. Each job becomes known to the scheduler on its release time rj. The goal is to minimize the sum of completion times over all jobs. A simplified problem is one single machine with the following input: at time 0, a job with processing time 1 arrives; k jobs with processing time 0 arrive at some unknown time. We need to choose a start time for the first job. Waiting incurs a cost of 1 per time unit, yet starting the first job before the later k jobs may incur an extra cost of k in the worst case. This simplified problem may be viewed as a continuous version of the ski rental problem.


 * Refactoring versus working with a poor design: In software development, engineers have to choose between the friction and risk of errors of working with an overly-complex design and reducing the complexity of the design before making a change. The extra cost of each change with the old design is the "rental" cost, the cost of refactoring is the "buy" cost. "How many times does one work with a poor design before cleaning it up?" is a ski rental problem.