User:MarKroell/sandbox/Complexity of Enumeration Problems

The complexity of an enumeration problem refers to the inherent difficulty of producing the output of an enumeration problem. While decision problems often ask for the existence of a solution to some problem instance, enumeration problems aim at outputting all solutions. To capture the intuition of easy to enumerate problems – despite a possibly exponential number of output values – various notions of tractable and intractable enumeration classes have been proposed over the years.

Enumeration Problem and Algorithm
Let $$\Sigma$$ be a finite alphabet, $$c\in\mathbb{N}$$ be a positive integer and $$R\subseteq \Sigma^*\times\Sigma^*$$ be a binary relation such that for every $$(x,y)\in R$$ we have $$|y|\in\mathcal{O}(|x|^c)$$. Given some $$x\in\Sigma^*$$, the set $$R(x)$$ denotes $$\{y\in\Sigma^*\mid (x,y)\in R\}$$. The enumeration problem $$E_R$$ is the function that maps every $$x\in\Sigma^*$$ to the set $$R(x)$$. An enumeration algorithm $$A$$ for $$E_R$$ is an algorithm that, on every input $$x\in\Sigma^*$$, outputs the set $$R(x)$$ without any duplicates.
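As a toy illustration (not taken from the literature), consider the relation $$R=\{(x,y)\mid y \text{ is a nonempty substring of } x\}$$; here $$|y|\le|x|$$, so the polynomial-size condition holds, and $$R(x)$$ is the set of distinct substrings of $$x$$. A minimal Python sketch of an enumeration algorithm for $$E_R$$:

```python
def enumerate_substrings(x: str):
    """Enumeration algorithm for the toy relation
    R = {(x, y) : y is a nonempty substring of x}.
    Yields every element of R(x) exactly once (no duplicates)."""
    seen = set()
    for i in range(len(x)):
        for j in range(i + 1, len(x) + 1):
            y = x[i:j]
            if y not in seen:  # duplicate suppression, as required
                seen.add(y)
                yield y

# Example: R("aba") = {"a", "ab", "aba", "b", "ba"}
```

The duplicate check is what distinguishes an enumeration algorithm from merely listing all candidate pairs.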

enumeration problem with order?

Computational Model
The runtime and also the space requirements of an enumeration algorithm may be exponential in the size of the input. Therefore, it is common to use the RAM model (rather than Turing machines) as a computational model, because a RAM can access parts of exponential-size data in polynomial time. An example of an enumeration algorithm that depends on the RAM model is the enumeration of all maximal independent sets of a graph in lexicographic order.

Measures of Complexity
The resources that are considered when computing the answers to an enumeration problem are space and time. The space complexity of an enumeration problem $$E_R$$ is given by the total amount of memory that an enumeration algorithm $$A$$ for $$E_R$$ needs to produce all of the answers to $$E_R$$, measured w.r.t. the size of the input.

The time complexity of an enumeration problem is measured either w.r.t. the size of the input (input-sensitive measure) or w.r.t. the size of the output (output-sensitive measure).

Input Sensitive Measures
Measuring the complexity of an enumeration problem by the size of the input is analogous to the usual measure of time complexity for function problems. Given an enumeration problem $$E_R$$, an input $$x\in\Sigma^*$$ and a computable function $$f$$, $$E_R$$ can be solved within time $$\mathcal{O}(f(|x|))$$ if there is an enumeration algorithm that outputs all of the solutions within this time.

cases where this upper bound is not trivial: all minimal dominating sets
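As an illustration (a naive sketch, not a state-of-the-art algorithm), all minimal dominating sets of an $$n$$-vertex graph can trivially be enumerated in input-sensitive time $$\mathcal{O}(2^n\cdot\mathrm{poly}(n))$$ by testing every vertex subset; the interesting question is by how much the $$2^n$$ factor can be improved.

```python
from itertools import combinations

def minimal_dominating_sets(adj):
    """Brute-force input-sensitive enumeration in O(2^n * poly(n)).
    adj: dict mapping each vertex to its set of neighbours."""
    vertices = list(adj)

    def dominates(s):
        s = set(s)
        # every vertex is in s or has a neighbour in s
        return all(v in s or adj[v] & s for v in vertices)

    for k in range(len(vertices) + 1):
        for cand in combinations(vertices, k):
            if dominates(cand):
                # minimal iff removing any single vertex breaks domination
                if all(not dominates(set(cand) - {v}) for v in cand):
                    yield set(cand)

# Example: in the triangle, every single vertex is a minimal dominating set.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```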

Output Sensitive Measures
Given some enumeration problem $$E_R$$ over a relation $$R$$ and an input $$x\in\Sigma^*$$, the set of solutions $$R(x)$$ can in general be exponential in the size of the input. Moreover, depending on the enumeration problem itself, the size of the set of solutions can vary considerably between inputs. Therefore it is common not to measure the complexity of an enumeration problem only by the size of the input, but also by the output. The following are the most common approaches to measure the complexity of a problem w.r.t. the output:

 * Time w.r.t. the size of the complete output: This measure is used when one is interested in producing the full set of solutions $$R(x)$$ to an instance $$x$$. It is used in fields such as biology or chemistry.
 * Time w.r.t. the size of the output produced so far: When the output is large and we do not want to wait for all the solutions, we might be interested in just a few solutions, but want to have a guarantee at any given time. An enumeration problem is tractable in the context of this measure if we can output $$i$$ many solutions in time $$\mathcal{O}(|x|+i)$$. This is equivalent to a continuous output of solutions with an increasing delay, where the delay grows linearly in the size of the output produced so far. Moreover, measuring w.r.t. the output produced at any given point is of special interest if the size of the instance $$x$$ is significantly larger than the size of a single solution. In this case, a valid measure is the time w.r.t. the previous solution that was output.
 * Regularity of the produced solutions: This is in contrast to the measure w.r.t. the size of the output produced so far, where the delay between outputs continuously grows.
 * Amortized Complexity: This measure is concerned with the total time divided by the number of solutions (a kind of regularity). Uno proposed a general method to obtain constant amortized time algorithms, which can be applied, for instance, to find the matchings or the spanning trees of a graph.
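A standard illustration of constant amortized time (not tied to any particular result above) is the enumeration of all $$2^n$$ subsets of $$\{0,\dots,n-1\}$$ in Gray-code order: consecutive subsets differ in exactly one element, so the total work is $$\mathcal{O}(2^n)$$ and the amortized cost per solution is constant if each subset is reported as a one-element change.

```python
def gray_code_subsets(n):
    """Enumerate all 2^n subsets of {0, ..., n-1} in Gray-code order.
    Consecutive subsets differ in a single element, so reporting each
    subset as a one-element change gives constant amortized time."""
    subset = set()
    yield frozenset(subset)
    for k in range(1, 2 ** n):
        # the element to toggle is the position of the lowest set bit of k
        i = (k & -k).bit_length() - 1
        subset ^= {i}
        yield frozenset(subset)
```

Note that the delay between two outputs is not uniformly constant here; only the average over the whole run is, which is exactly the distinction between amortized complexity and delay.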

Complexity Classes
First EnumP: This is the class of enumeration problems $$E_R$$ whose associated Check problem – given $$(x,y)$$, decide whether $$(x,y)\in R$$ – is decidable in polynomial time, i.e., the class of all enumeration problems over NP-relations. Enumeration problems restricted to NP-relations enjoy better properties, for instance stability under union. It is also the case that the vast majority of enumeration problems that appear in the literature have a corresponding Check problem which is decidable in polynomial time.

For input-sensitive measures, do we have a class?

The usual suspects
Let $$\Sigma$$ be an alphabet and let $$E_{R}$$ be an enumeration problem over a relation $$R$$. The following enumeration complexity classes are defined via the existence of an enumeration algorithm $$\mathcal{A}$$ satisfying different properties:

 * 1) OutputP (Output Polynomial Time): The enumeration problem $$E_{R}$$ is in OutputP (also TotalP), if there exists an enumeration algorithm $$\mathcal{A}$$ and a positive integer $$c$$, such that for every input $$x\in\Sigma^*$$, the algorithm $$\mathcal{A}$$ outputs $$R(x)$$ in time $$\mathcal{O}((|x| + |R(x)|)^c)$$. This class is also called Total Polynomial Time (insert citation).
 * 2) The Inc Hierarchy:
 * 3) DelayP (Polynomial Delay): The enumeration problem $$E_{R}$$ is in DelayP, if there exists an enumeration algorithm $$\mathcal{A}$$ and a positive integer $$c$$, such that for every input $$x\in\Sigma^*$$, the algorithm $$\mathcal{A}$$ outputs the elements of $$R(x)$$ at regular intervals (the so-called delay), such that the delay between any two consecutive solutions, as well as the time before outputting the first solution and the time between outputting the last solution and the termination of $$\mathcal{A}$$, is bounded by $$\mathcal{O}(|x|^c)$$.
 * 4) SDelayP (Strong Polynomial Delay): For some problems, the size of the input may be much larger than the size of the generated solutions, which makes polynomial delay an unsatisfactory measure of efficiency. In that case, algorithms whose delay depends on the size of a single solution are naturally more interesting than polynomial delay or output polynomial algorithms.
 * 5) DelayClin (Constant Delay after linear preprocessing): Introduced by Durand and Grandjean.
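A textbook example of a polynomial-delay algorithm (a sketch for illustration, not a definition from the classes above) is the enumeration of all independent sets of a graph by binary partition ("flashlight" search): the algorithm branches on whether the first remaining vertex is excluded or included, and since every branch of the search tree leads to at least one output, the delay between consecutive solutions is polynomial in the input.

```python
def independent_sets(adj, vertices=None, current=None):
    """Polynomial-delay enumeration of all independent sets of a graph
    via binary partition. adj maps each vertex to its set of neighbours.
    Every recursion branch produces at least one solution, so the delay
    between consecutive outputs is polynomial in the input size."""
    if vertices is None:
        vertices = sorted(adj)
    if current is None:
        current = set()
    if not vertices:
        yield frozenset(current)
        return
    v, rest = vertices[0], vertices[1:]
    # Branch 1: exclude v (always extensible to a solution).
    yield from independent_sets(adj, rest, current)
    # Branch 2: include v, but only if no chosen neighbour conflicts.
    if not (adj[v] & current):
        yield from independent_sets(adj, rest, current | {v})

# Example: the triangle graph has independent sets {}, {0}, {1}, {2}.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

The extensibility check in the include-branch is what rules out dead ends; without it the search tree could contain exponentially many fruitless branches, breaking the delay bound.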

Higher Complexity Classes
Hierarchy for higher complexity classes (with motivation?) defined:


 * 1) DelayP Hierarchy
 * 2) IncP Hierarchy

Parameterized Complexity
framework in

Relationship among the classes
Either intersected with EnumP or not

Decision Complexity
the Another Solution Problem; lower bounds (which prove non-membership in OutputP) first appear in, also used in

Counting Complexity
Some basic results for constant delay stuff?

Functional Complexity
cite new results as given in (TFNP = FP if and only if IncP = OutputP)

The importance of EnumP?
The restriction of classes to EnumP, even though it appears throughout the literature, seems somewhat arbitrary. There are benefits (see the definition of EnumP); however, there are enumeration problems (both natural and artificial ones) whose Check problem lies outside of Ptime.

Lower Bounds in Complexity
This is the major open problem here. Despite the multitude of different enumeration classes and hierarchies, there are very few tools to show that a problem is in one class but not in a smaller one. Even though there are some notions for reductions (for fine-grained problems, for problems equivalent to the enumeration of the transversals of a hypergraph, or reductions for hard enumeration problems), complete problems for the lower complexity classes are sorely missing. Therefore, we either need to find a notion of reduction that allows for complete problems in classes that we are interested in, or more generally, a set of tools to show lower bounds.

Transversal Hypergraph
problems equivalent: only in output-subexponential time: currently best approach: