User:Tcshasaposse/Computational complexity theory

Computational complexity theory, as a branch of the theory of computation in computer science, investigates the amount of resources (such as time, memory, randomness) required by algorithms for solving various computational problems.

The origins of computational complexity can be traced back to the early 1970s when it was realized that certain simple problems would take an inordinate amount of time to solve on any computer, even though these problems are in principle solvable. Moreover, the inherent difficulty of these problems does not have anything to do with the computing technology that was available in the 1970s. It appears that nature imposes intrinsic obstacles at performing certain computations, and a central question in complexity theory is to understand why and how these obstacles arise.

Computational problems
Unlike computability theory, which mostly studies problems about computations, much of the focus of computational complexity theory has been on natural problems that are inspired by other scientific disciplines and areas of life (be it mathematics, economics, physics, or sociology). Complexity theory also studies problems arising from computations, but these are often used just as a stepping stone to say something about problems that scientists and engineers from other disciplines might be interested in.

Depending on the type of question asked, computational problems can be classified as decision problems, search problems, optimization problems, or counting problems.

Decision problems
Decision problems can be answered by "yes" or "no". Examples:


 * Perfect Matching: Given a graph, does it contain a perfect matching?


 * Boolean Formula Satisfiability (SAT): Given a boolean formula in conjunctive normal form (CNF), is there an assignment that satisfies the formula?

Search problems
In a search problem we not only want to know if a solution exists, but find a solution as well. Examples:


 * Find a Perfect Matching: Given a graph, find a perfect matching if one exists.


 * Find-SAT: Given a boolean formula, find an assignment that satisfies it, if possible.

Optimization problems
Optimization problems ask for the best possible solution to a problem. A decision or search problem can have several optimization variants. Examples:


 * Maximum matching: Given a graph, find a maximum matching, i.e. the largest perfect matching present in one of its subgraphs.


 * Maximum SAT: Given a boolean formula, find an assignment that satisfies as many of its clauses as possible.


 * Minimum equivalent SAT: Given a boolean formula, find the smallest formula that is equivalent to it, i.e., one that shares the same set of satisfying assignments.

Approximate optimization problems do not ask for the best possible solution, but one that approximates it, for example:


 * 90% Approximate SAT: Given a boolean formula, find an assignment that satisfies 90% of the maximum possible number of clauses that can be satisfied.

Counting problems
Counting problems ask for the number of solutions of a given instance. Examples:


 * #Perfect Matching: Given a graph, count the number of perfect matchings it contains.


 * #SAT: Given a boolean formula, count how many satisfying assignments it has.

Computational devices and resources
To reason rigorously about the complexity of computations we need a model that faithfully captures the resources that we have at our disposal. This appears somewhat difficult as the resources we have available change dramatically over time. In the early 1800s the mathematician Gauss was admired for his prowess at performing difficult calculations. In the early 1900 mechanical calculating devices put the best humans to shame, and by the 1950s electronic computers were already doing calculations that nobody thought possible before. Today any ordinary cell phone can do much much more than a 1950s computer. Given such huge disparities among the devices at our disposal, how can we hope to make any sort of general statement about computation?

Computational complexity theory takes the perspective that the difference in resources between various computational devices is insignificant compared to the vast gap between the tractable and the intractable. For problems like matching, it might matter what kind of computational device we use; perhaps Gauss could solve graphs of size 10, the 1950 machines could do size 100 while today’s computers can solve problems of size 10,000 or more. However, for instances of sat the differences are much less dramatic; maybe Gauss could do size 8, while today we can do problems up to size 50 or so. When trying to separate the tractable from the intractable it does not really matter what model of computation we use, as long as it is a reasonable one, in the sense that it can both simulate and be simulated by a realistic computational device in a relatively efficient way.

The computational devices most commonly studied in computational complexity are Turing Machines and boolean circuits.

In some settings these devices may be endowed with additional resources such as randomness, quantumness, or nondeterminism.

Complexity measures
The difficulty of computational problems is expected to grow as the instances of the problem become larger. For this reason the complexity of a problem is usually measured as a function of the instance size. In general, an algorithm may spend a different amount of resources on different instances of a given size. We have the following two general types of complexity measures depending on how these quantities are aggregated.


 * The worst-case complexity of a Turing Machine or boolean circuit is the maximum amount of resources the device spends over all instances of a given size.


 * The average-case complexity of a Turing Machine or boolean circuit is an average of (some function of) the amount of resources the device spends over instances of a given size.

Here are some of the most commonly studied complexity measures.


 * The time complexity of a Turing Machine is the number of steps in which the machine halts on a given instance.


 * The space complexity of a Turing Machine is the number of cells (memory) that are used before the machine halts on a given instance.


 * The size complexity of a boolean circuit is the number of gates present in the circuit.


 * The randomness complexity of a randomized Turing Machine or boolean circuit is the number of random bits that is used by the device during the computation.


 * The parallel complexity of a boolean circuit is maximum number of gates present on a computation path in the circuit.

Models of computation
Computational complexity is concerned with describing the relationships among various models of computation. Some of these models represent realistic computations performed in nature or in engineering, while others represent imaginary processes ...

Complexity classes
A complexity class is the set of all of the computational problems which can be solved using a certain amount of a certain computational resource.

The complexity class P is the set of decision problems that can be solved by a deterministic machine in polynomial time. This class corresponds to an intuitive idea of the problems which can be effectively solved in the worst cases.

The complexity class NP is the set of decision problems that can be solved by a non-deterministic machine in polynomial time. This class contains many problems that people would like to be able to solve effectively, including the Boolean satisfiability problem, the Hamiltonian path problem and the vertex cover problem. All the problems in this class have the property that their solutions can be checked efficiently.

Many complexity classes can be characterized in terms of the mathematical logic needed to express them – this field is called descriptive complexity.

The P = NP question
The question of whether NP is the same set as P (that is whether problems that can be solved in non-deterministic polynomial time can be solved in deterministic polynomial time) is one of the most important open questions in theoretical computer science due to the wide implications a solution would present. If it were true, many important problems would be shown to have "efficient" solutions. These include various types of integer programming in operations research, many problems in logistics, protein structure prediction in biology, and the ability to find formal proofs of pure mathematics theorems efficiently using computers. The P=NP problem is one of the Millennium Prize Problems proposed by the Clay Mathematics Institute the solution of which is a US$1,000,000 prize for the first person to provide a solution.

Questions like this motivate the concepts of hard and complete. A set of problems X is hard for a set of problems Y if every problem in Y can be transformed "easily" into some problem in X that produces the same solution. The definition of "easily" is different in different contexts. In the particular case of P versus NP, the relevant hard set is NP-hard – a set of problems, that are not necessarily in NP themselves, but to which any NP problem can be reduced in polynomial time.

Set X is complete for Y if it is hard for Y, and is also a subset of Y. Thus, the set of NP-complete  problems contains the most "difficult" problems in NP, in the sense that they are the ones most likely not to be in P. Because the problem of P = NP remains unsolved, being able to reduce a problem to a known NP-complete problem would indicate that there is no known polynomial-time solution for it. Similarly, because all NP problems can be reduced to the set, finding an NP-complete problem that can be solved in polynomial time would mean that P = NP.

Incomplete problems in NP
Incomplete problems are problems in NP that are neither NP-complete nor in P. In other words, incomplete problems can neither be solved in polynomial time nor do they belong to the hardest problems of NP. It has been shown that, if P ≠ NP, then there exist NP-incomplete problems.

An important open problem in this context is, whether the graph isomorphism problem is in P, NP-complete, or NP-incomplete. The answer is not known, but there are strong hints that the problem is at least not NP-complete. The graph isomorphism problem asks whether two given graphs are isomorphic.

NP = co-NP
Where co-NP is the set containing the complement problems (i.e. problems with the yes/no answers reversed) of NP problems. It is believed that the two classes are not equal, however it has not yet been proven. It has been shown that if these two complexity classes are not equal, then it follows that no NP-complete problem can be in co-NP and no co-NP-complete problem can be in NP.

Intractability
Problems that are solvable in theory, but cannot be solved in practice, are called intractable. Naive complexity theory assumes problems that lack polynomial-time solutions are intractable for more than the smallest inputs. Problems that are known to be intractable in this sense include those that are EXPTIME-complete. If NP is not the same as P, then the NP-complete problems are also intractable in this sense. What this means "in practice" is open to debate.

To see why exponential-time solutions might be unusable in practice, consider a problem that requires 2n operations to solve (n is the size of the input). For a relatively small input size of n=100, and assuming a computer that can perform 1012 operations per second, a solution would take about 4×1010 years, much longer than the current age of the universe. On the other hand, a problem that requires n15 operations would be in P, yet a solution would also take about 4×1010 years for n=100. And a problem that required 20.0000001*n operations would not be in P, but would be solvable for quite large cases.

Finally, saying that a problem is not in P does not imply that all large cases of the problem are hard or even that most of them are. For example the decision problem in Presburger arithmetic has been shown not to be in P, yet algorithms have been written that solve the problem in reasonable times in most cases.

History
The foundations of computational complexity theory were laid by Andrey Kolmogorov in the 1950s. A notable early discovery was the Karatsuba algorithm in 1960, for the multiplication of two numbers. This algorithm disproved Kolmogorov's 1956 conjecture that the fastest multiplication algorithm must be $$\Omega(n^2)$$, and thus helped launch the study of algorithms in earnest. The field was subsequently expanded by many researchers, including:


 * Manuel Blum, who developed an axiomatic complexity theory based on his Blum axioms
 * Allan Borodin
 * Stephen Cook
 * Michael R. Garey
 * Oded Goldreich
 * Juris Hartmanis
 * David S. Johnson
 * Richard Karp
 * Marek Karpinski
 * Donald Knuth
 * Leonid Levin
 * Christos H. Papadimitriou
 * Alexander Razborov
 * Richard Stearns
 * Leslie Valiant
 * Andrew Yao