Las Vegas algorithm

In computing, a Las Vegas algorithm is a randomized algorithm that always gives correct results; that is, it always produces the correct result or it informs about the failure. However, the runtime of a Las Vegas algorithm differs depending on the input. The usual definition of a Las Vegas algorithm includes the restriction that the expected runtime be finite, where the expectation is carried out over the space of random information, or entropy, used in the algorithm. An alternative definition requires that a Las Vegas algorithm always terminates (is effective), but may output a symbol not part of the solution space to indicate failure in finding a solution. The nature of Las Vegas algorithms makes them suitable in situations where the number of possible solutions is limited, and where verifying the correctness of a candidate solution is relatively easy while finding a solution is complex.

Systematic search methods for computationally hard problems, such as some variants of the Davis–Putnam algorithm for propositional satisfiability (SAT), also utilize non-deterministic decisions, and can thus also be considered Las Vegas algorithms.

History
Las Vegas algorithms were introduced by László Babai in 1979, in the context of the graph isomorphism problem, as a dual to Monte Carlo algorithms. Babai introduced the term "Las Vegas algorithm" alongside an example involving coin flips: the algorithm depends on a series of independent coin flips, and there is a small chance of failure (no result). However, in contrast to Monte Carlo algorithms, the Las Vegas algorithm can guarantee the correctness of any reported result.

Example
As mentioned above, Las Vegas algorithms always return correct results. The code above illustrates this property. A variable k is generated randomly; after k is generated, k is used to index the array A. If this index contains the value 1, then k is returned; otherwise, the algorithm repeats this process until it finds 1. Although this Las Vegas algorithm is guaranteed to find the correct answer, it does not have a fixed runtime; due to the randomization (in line 3 of the above code), it is possible for arbitrarily much time to elapse before the algorithm terminates.

Definition
This section provides the conditions that characterize an algorithm's being of Las Vegas type.

An algorithm A is a Las Vegas algorithm for problem class X, if


 * 1) whenever for a given problem instance x∈X it returns a solution s, s is guaranteed to be a valid solution of x
 * 2) on each given instance x, the run-time of A is a random variable RTA,x

There are three notions of completeness for Las Vegas algorithms:


 * complete Las Vegas algorithms can be guaranteed to solve each solvable problem within run-time t max, where t max is an instance-dependent constant.

Let P(RTA,x ≤ t) denote the probability that A finds a solution for a soluble instance x in time within t, then A is complete exactly if for each x there exists

some t max such that P(RTA,x ≤ tmax) = 1.


 * approximately complete Las Vegas algorithms solve each problem with a probability converging to 1 as the run-time approaches infinity. Thus, A is approximately complete, if for each instance x, limt→∞ P(RTA,x ≤ t) = 1.
 * essentially incomplete Las Vegas algorithms are Las Vegas algorithms that are not approximately complete.

Approximate completeness is primarily of theoretical interest, as the time limits for finding solutions are usually too large to be of practical use.

Application scenarios
Las Vegas algorithms have different criteria for the evaluation based on the problem setting. These criteria are divided into three categories with different time limits since Las Vegas algorithms do not have set time complexity. Here are some possible application scenarios:


 * Type 1: There are no time limits, which means the algorithm runs until it finds the solution.
 * Type 2: There is a time limit tmax for finding the outcome.
 * Type 3: The utility of a solution is determined by the time required to find the solution.

(Type 1 and Type 2 are special cases of Type 3.)

For Type 1 where there is no time limit, the average run-time can represent the run-time behavior. This is not the same case for Type 2.

Here, P(RT ≤ tmax), which is the probability of finding a solution within time, describes its run-time behavior.

In case of Type 3, its run-time behavior can only be represented by the run-time distribution function rtd: R → [0,1] defined as rtd(t) = P(RT ≤ t) or its approximation.

The run-time distribution (RTD) is the distinctive way to describe the run-time behavior of a Las Vegas algorithm.

With this data, we can easily get other criteria such as the mean run-time, standard deviation, median, percentiles, or success probabilities P(RT ≤ t) for arbitrary time-limits t.

Analogy
Las Vegas algorithms arise frequently in search problems. For example, one looking for some information online might search related websites for the desired information. The time complexity thus ranges from getting "lucky" and finding the content immediately, to being "unlucky" and spending large amounts of time. Once the right website is found, then there is no possibility of error.

Randomized quicksort
A simple example is randomized quicksort, where the pivot is chosen randomly, and divides the elements into three partitions: elements less than pivot, elements equal to pivot, and elements greater than pivot. QuickSort always generates the solution, which in this case the sorted array. Unfortunately, the time complexity is not that obvious. It turns out that the runtime depends on which element we pick as a pivot.


 * The worst case Θ(n2) when the pivot is the smallest or the largest element.


 * $$T(n)=T(0)+T(n-1)+\Theta (n)$$
 * $$T(n)=\Theta (1)+T(n-1)+\Theta (n)$$
 * $$T(n)=T(n-1)+\Theta (n)$$
 * $$T(n)=\Theta (n^{2})$$


 * However, through randomization, where the pivot is randomly picked and is exactly a middle value each time, the QuickSort can be done in Θ(nlogn).


 * $$T(n) \leq 2*T(n/2) + \Theta(n)$$
 * $$T(n) = \Theta(n\log(n))$$

The runtime of quicksort depends heavily on how well the pivot is selected. If a value of pivot is either too big or small, the partition will be unbalanced, resulting in a poor runtime efficiency. However, if the value of pivot is near the middle of the array, then the split will be reasonably well balanced, yielding a faster runtime. Since the pivot is randomly picked, the running time will be good most of the time and bad occasionally.

In the case of average case, it is hard to determine since the analysis does not depend on the input distribution but on the random choices that the algorithm makes. The average of quicksort is computed over all possible random choices that the algorithm might make when choosing the pivot.

Although the worst-case runtime is Θ(n2), the average-case runtime is Θ(nlogn). It turns out that the worst-case does not happen often. For large values of n, the runtime is Θ(nlogn) with a high probability.

Note that the probability that the pivot is the middle value element each time is one out of n numbers, which is very rare. However, it is still the same runtime when the split is 10%-90% instead of a 50%–50% because the depth of the recursion tree will still be O(logn) with O(n) times taken each level of recursion.

Randomized greedy algorithm for the Eight queens problem
The eight queens problem is usually solved with a backtracking algorithm. However, a Las Vegas algorithm can be applied; in fact, it is more efficient than backtracking.

Place 8 queens on a chessboard so that no one attacks another. Remember that a queen attacks other pieces on the same row, column and diagonals.

Assume that k rows, 0 ≤ k ≤ 8, are successfully occupied by queens.

If k = 8, then stop with success. Otherwise, proceed to occupy row k + 1.

Calculate all positions on this row not attacked by existing queens. If there are none, then fail. Otherwise, pick one at random, increment k and repeat.

Note that the algorithm simply fails if a queen cannot be placed. But the process can be repeated and every time will generate different arrangement.

Complexity class
The complexity class of decision problems that have Las Vegas algorithms with expected polynomial runtime is ZPP.

It turns out that
 * $$\textsf{ZPP} = \textsf{RP} \cap \textsf{co-RP}$$

which is intimately connected with the way Las Vegas algorithms are sometimes constructed. Namely the class RP consists of all decision problems for which a randomized polynomial-time algorithm exists that always answers correctly when the correct answer is "no", but is allowed to be wrong with a certain probability bounded away from one when the answer is "yes". When such an algorithm exists for both a problem and its complement (with the answers "yes" and "no" swapped), the two algorithms can be run simultaneously and repeatedly: run each for a constant number of steps, taking turns, until one of them returns a definitive answer. This is the standard way to construct a Las Vegas algorithm that runs in expected polynomial time. Note that in general there is no worst case upper bound on the run time of a Las Vegas algorithm.

Optimal Las Vegas algorithm
In order to make a Las Vegas algorithm optimal, the expected run time should be minimized. This can be done by:
 * 1) The Las Vegas algorithm A(x) runs repeatedly for some number t1 steps. If A(x) stops during the run time then A(x) is done; otherwise, repeat the process from the beginning for another t2 steps, and so on.
 * 2) Designing a strategy that is optimal among all strategies for A(x), given the full information about the distribution of TA(x).

The existence of the optimal strategy might be a fascinating theoretical observation. However, it is not practical in real life because it is not easy to find the information of distribution of TA(x). Furthermore, there is no point of running the experiment repeatedly to obtain the information about the distribution since most of the time, the answer is needed only once for any x.

Relation to Monte Carlo algorithms
Las Vegas algorithms can be contrasted with Monte Carlo algorithms, in which the resources used are bounded but the answer may be incorrect with a certain (typically small) probability. A Las Vegas algorithm can be converted into a Monte Carlo algorithm by running it for set time and generating a random answer when it fails to terminate. By an application of Markov's inequality, we can set the bound on the probability that the Las Vegas algorithm would go over the fixed limit.

Here is a table comparing Las Vegas and Monte Carlo algorithms:

If a deterministic way to test for correctness is available, then it is possible to turn a Monte Carlo algorithm into a Las Vegas algorithm. However, it is hard to convert a Monte Carlo algorithm to a Las Vegas algorithm without a way to test the algorithm. On the other hand, changing a Las Vegas algorithm to a Monte Carlo algorithm is easy. This can be done by running a Las Vegas algorithm for a specific period of time given by confidence parameter. If the algorithm finds the solution within the time, then it is success and if not then output can simply be "sorry".

This is an example of Las Vegas and Monte Carlo algorithms for comparison:

Assume that there is an array with the length of even n. Half of the entries in the array are 0's and the remaining half are 1's. The goal here is to find an index that contains a 1.

Since Las Vegas does not end until it finds 1 in the array, it does not gamble with the correctness but run-time. On the other hand, Monte Carlo runs 300 times, which means it is impossible to know that Monte Carlo will find "1" in the array within 300 times of loops until it actually executes the code. It might find the solution or not. Therefore, unlike Las Vegas, Monte Carlo does not gamble with run-time but correctness.