Diagram of complexity classes provided that P ≠ NP. The existence of problems in NP that lie outside both P and NP-complete in this case was established by Ladner.[1]
The relationship between the complexity classes P and NP is an unsolved question in theoretical computer science. It is considered to be the most important problem in the field;[citation needed] the Clay Mathematics Institute has offered a $1 million US prize for the first correct proof.[2]
In essence, the question P = NP? asks: if 'yes'-answers to a 'yes'-or-'no'-question can be verified "quickly" (in polynomial time), can the answers themselves also be computed quickly?
Consider, for instance, the subset-sum problem, an example of a problem which is "easy" to verify, but whose answer is believed (but not proven) to be "difficult" to compute. Given a set of integers, does some nonempty subset of them sum to 0? For instance, does a subset of the set $\{-2,-3,15,14,7,-10\}$ add up to 0? The answer "yes, because $\{-2,-3,-10,15\}$ adds up to zero" can be quickly verified with a few additions. However, finding such a subset in the first place could take much longer. The information needed to verify a positive answer is also called a certificate. Given the right certificates, "yes" answers to our problem can be verified in polynomial time, so this problem is in NP.
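The gap between verifying and finding can be made concrete with a small sketch. The function names `verify_certificate` and `find_certificate` are illustrative choices, not part of any standard library, and the search shown is plain brute force rather than the best known method.

```python
from itertools import combinations

def verify_certificate(numbers, certificate):
    """Check a claimed 'yes' certificate: a nonempty subset that sums to 0."""
    return (
        len(certificate) > 0
        and len(set(certificate)) == len(certificate)  # no repeated elements
        and set(certificate) <= set(numbers)           # drawn from the input set
        and sum(certificate) == 0
    )

def find_certificate(numbers):
    """Brute-force search: may try all 2^n - 1 nonempty subsets."""
    items = sorted(set(numbers))
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            if sum(subset) == 0:
                return list(subset)
    return None  # no nonempty subset sums to 0

numbers = [-2, -3, 15, 14, 7, -10]
print(verify_certificate(numbers, [-2, -3, -10, 15]))
print(find_certificate(numbers))
```

Verification takes a handful of set operations and additions, while the search loop grows exponentially with the number of integers in the worst case.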
An answer to the P = NP question would determine whether problems like the subset-sum problem are as "easy" to compute as to verify. If it turned out P does not equal NP, it would mean that some NP problems are substantially "harder" to compute than to verify.
The restriction to yes/no problems is unimportant; the resulting problem when more complicated answers are allowed (whether FP = FNP) is equivalent.[3]
Context of the problem
The relation between the complexity classes P and NP is studied in computational complexity theory, the part of the theory of computation dealing with the resources required during computation to solve a given problem. The most common resources are time (how many steps it takes to solve a problem) and space (how much memory it takes to solve a problem).[citation needed]
Such analysis requires a model of the computer whose running time is to be analyzed. Typically, such models assume that the computer is deterministic (given the computer's present state and any inputs, there is only one possible action that the computer might take) and sequential (it performs actions one after the other). As of 2008, these assumptions are satisfied by all practical computers yet devised, even those featuring parallel computing.[citation needed]
In this theory, the class P consists of all those decision problems (defined below) that can be solved on a deterministic sequential machine in an amount of time that is polynomial in the size of the input; the class NP consists of all those decision problems whose positive solutions can be verified in polynomial time given the right information, or equivalently, whose solution can be found in polynomial time on a non-deterministic machine.[citation needed] Arguably, the biggest open question in theoretical computer science concerns the relationship between those two classes:
- Is P equal to NP?
In a 2002 poll of 100 researchers, 61 believed the answer is no, 9 believed the answer is yes, 22 were unsure, and 8 believed the question may be independent of the currently accepted axioms, and so impossible to prove or disprove.[4]
Formal definitions for P and NP
Conceptually, a decision problem is a problem that takes as input some string, and outputs "yes" or "no". If there is an algorithm (say a Turing machine, or a Lisp or Pascal program with unbounded memory) which is able to produce the correct answer for any input string of length $n$ in at most $c \cdot n^k$ steps, where $k$ and $c$ are constants independent of the input string, then we say that the problem can be solved in polynomial time and we place it in the class P. Formally, P is defined as the set of all languages which can be decided by a deterministic polynomial-time Turing machine. That is,
P = $\{ L : L=L(M) \text{ for some deterministic polynomial-time Turing machine } M \}$
where $L(M) = \{ w\in\Sigma^{*}: M \text{ accepts } w \}$ and a deterministic polynomial-time Turing machine is a deterministic Turing machine $M$ which satisfies the following two conditions:
- $M$ halts on all inputs $w$, and
- there exists $k \in \mathbb{N}$ such that $T_{M}(n)\in O(n^{k})$, where $T_{M}(n) = \max\{ t_{M}(w) : w\in\Sigma^{*}, \left|w\right| = n \}$ and $t_{M}(w)$ is the number of steps $M$ takes to halt on input $w$.
NP can be defined similarly using nondeterministic Turing machines (the traditional way). However, a modern approach to define NP is to use the concept of certificate and verifier. Formally, NP is defined as the set of languages over a finite alphabet that have a verifier that runs in polynomial time, where the notion of "verifier" is defined as follows.
Let $L$ be a language over a finite alphabet, $\Sigma$.
$L\in\mathbf{NP}$ if, and only if, there exists a binary relation $R\subset\Sigma^{*}\times\Sigma^{*}$ and a positive integer $k$ such that the following two conditions are satisfied:
- For all $x\in\Sigma^{*}$, $x\in L \Leftrightarrow\exists y\in\Sigma^{*}$ such that $(x,y)\in R$ and $\left|y\right|\in O(\left|x\right|^{k})$, and
- the language $L_{R} = \{ x\# y:(x,y)\in R\}$ over $\Sigma\cup\{\#\}$ is decidable by a deterministic Turing machine in polynomial time.
A Turing machine that decides $L_{R}$ is called a verifier for $L$ and a $y$ such that $(x,y)\in R$ is called a certificate of membership of $x$ in $L$.
In general, a verifier does not have to be polynomial-time. However, for $L$ to be in NP, there must be a verifier that runs in polynomial time.
Example
Let $\mathit{COMPOSITE} = \{x\in\mathbb{N}:x=pq \;\text{for integers}\; p, q > 1 \}$ and $R = \{(x,y)\in\mathbb{N}\times\mathbb{N}: 1<y\leq \sqrt x \;\text{and}\; y \;\text{divides}\; x\}$.
Clearly, the question of whether a given $x$ is composite is equivalent to the question of whether $x$ is a member of $\mathit{COMPOSITE}$. It can be shown that $\mathit{COMPOSITE}\in\mathbf{NP}$ by verifying that $\mathit{COMPOSITE}$ satisfies the above definition.
$\mathit{COMPOSITE}$ also happens to be in P.[5][6]
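A verifier for this example can be sketched in a few lines. The sketch assumes non-negative integer inputs and takes a certificate to be a divisor $y$ with $1 < y$ and $y^2 \leq x$; the names `verify_composite` and `is_composite_by_search` are hypothetical.

```python
from math import isqrt

def verify_composite(x: int, y: int) -> bool:
    """Accept iff y is a valid certificate that x is composite."""
    return 1 < y <= isqrt(x) and x % y == 0

def is_composite_by_search(x: int) -> bool:
    """Decide COMPOSITE naively by trying every candidate certificate."""
    return any(verify_composite(x, y) for y in range(2, isqrt(x) + 1))

print(verify_composite(91, 7))      # 7 divides 91 and 7 * 7 <= 91
print(is_composite_by_search(97))   # 97 is prime, so no certificate exists
```

Checking one certificate costs a single division, which is polynomial in the number of digits of $x$. The naive search tries about $\sqrt{x}$ candidates, which is exponential in the number of digits, so it illustrates the definition of NP rather than the polynomial-time algorithm that places COMPOSITE in P.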
NP-complete
To attack the P = NP question, the concept of NP-completeness is very useful. Informally, the NP-complete problems are the "toughest" problems in NP in the sense that they are the ones most likely not to be in P. NP-complete problems are those NP-hard problems which are in NP, where NP-hard problems are those to which any problem in NP can be reduced in polynomial time. For instance, the decision problem version of the traveling salesman problem is NP-complete, so any instance of any problem in NP can be transformed mechanically into an instance of the traveling salesman problem, in polynomial time. The traveling salesman problem is one of many such NP-complete problems. If any NP-complete problem is in P, then it would follow that P = NP. Unfortunately, many important problems have been shown to be NP-complete and as of 2008, not a single fast algorithm for any of them is known.[citation needed]
Based on the definition alone, it's not obvious that NP-complete problems exist. A trivial and contrived NP-complete problem can be formulated as: given a description of a Turing machine M guaranteed to halt in polynomial time, does there exist a polynomial-size input that M will accept?[7] It is in NP because, given an input, it is simple to check whether or not M accepts the input by simulating M; it is NP-hard because the verifier for any particular instance of a problem in NP can be encoded as a polynomial-time machine M that takes the solution to be verified as input. Then the question of whether the instance is a yes or no instance is determined by whether a valid input exists.
The first natural problem proven to be NP-complete was the Boolean satisfiability problem. This result came to be known as the Cook–Levin theorem; its proof that satisfiability is NP-complete contains technical details about Turing machines as they relate to the definition of NP. However, after this problem was proved to be NP-complete, proof by reduction provided a simpler way to show that many other problems are in this class. Thus, a vast class of seemingly unrelated problems are all reducible to one another, and are in a sense the "same problem".[citation needed]
Formal definition for NP-completeness
Although there are many equivalent ways of describing NP-completeness, in the context of the P vs NP question, it is best to define NP-complete problems in terms of NP problems.[citation needed]
Let $L$ be a language over a finite alphabet $\Sigma$.
$L$ is NP-complete if, and only if, the following two conditions are satisfied:
- $L\in\mathbf{NP}$, and
- any $L'\in\mathbf{NP}$ is polynomial-time reducible to $L$ (written as $L'\leq_{p} L$), where $L'\leq_{p} L$ if, and only if, the following two conditions are satisfied:
  - there exists $f : \Sigma^{*}\rightarrow\Sigma^{*}$ such that $\forall w\in\Sigma^{*}\,(w\in L'\Leftrightarrow f(w)\in L)$, and
  - there exists a polynomial-time Turing machine which halts with $f(w)$ on its tape on any input $w$.
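As a concrete instance of such a function $f$, the sketch below reduces the independent-set problem to the clique problem by complementing the graph. These two problems are not discussed above and are used here only as a familiar textbook pair. Graphs are assumed to be given as a vertex count `n` with vertices `0..n-1` and a list of undirected edges, and all function names are illustrative. The two brute-force checkers run in exponential time and exist only to test the equivalence $w\in L'\Leftrightarrow f(w)\in L$ on small inputs.

```python
from itertools import combinations

def reduce_independent_set_to_clique(n, edges, k):
    """The reduction f: complement the edge set, keep n and k. Runs in O(n^2)."""
    present = {frozenset(e) for e in edges}
    complement = [
        (u, v) for u, v in combinations(range(n), 2)
        if frozenset((u, v)) not in present
    ]
    return n, complement, k

def has_independent_set(n, edges, k):
    """Exponential-time check: k vertices with no edge between any two."""
    present = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) not in present for pair in combinations(group, 2))
        for group in combinations(range(n), k)
    )

def has_clique(n, edges, k):
    """Exponential-time check: k vertices with an edge between every two."""
    present = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in present for pair in combinations(group, 2))
        for group in combinations(range(n), k)
    )

path = [(0, 1), (1, 2), (2, 3)]  # the path 0 - 1 - 2 - 3
for k in range(5):
    before = has_independent_set(4, path, k)
    after = has_clique(*reduce_independent_set_to_clique(4, path, k))
    print(k, before, after)
```

Only the reduction itself needs to be fast; the definition places no bound on how hard the source or target problem is to decide.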
Still harder problems
Although it is unknown whether P = NP, problems outside of P are known. A number of succinct problems, that is, problems which operate not on normal input but on a computational description of the input, are known to be EXPTIME-complete. Because it can be shown that P $\subsetneq$ EXPTIME, these problems are outside P, and so require more than polynomial time. In fact, by the time hierarchy theorem, they cannot be solved in significantly less than exponential time.[citation needed]
The problem of deciding the truth of a statement in Presburger arithmetic requires even more time. Fischer and Rabin proved in 1974 that every algorithm which decides the truth of Presburger statements has a runtime of at least $2^{2^{cn}}$ for some constant $c$. Here, $n$ is the length of the Presburger statement. Hence, the problem is known to need more than exponential run time. Even more difficult are the undecidable problems, such as the halting problem. They cannot be completely solved by any algorithm, in the sense that for any particular algorithm there is at least one input for which that algorithm will not produce the right answer, but will either produce the wrong answer or run forever without producing an answer.
Is P really practical?
All of the above discussion has assumed that P means "easy" and "not in P" means "hard". While this is a common and reasonably accurate assumption in complexity theory, it is not always true in practice, for several reasons:[citation needed]
- It ignores constant factors. A problem that takes time $10^{100}n$ is in P (it is linear time), but is completely impractical.
- It ignores the size of the exponents. A problem with time $n^{100}$ is in P, yet impractical even for $n = 2$. Problems have been proven to exist in P that require arbitrarily large exponents (see time hierarchy theorem). A problem with time $1.004^{n}$ is not in P, yet is practical for $n$ up into the low thousands.
- It only considers worst-case times. There might be a problem where most of the time, it can be solved in time $n$, but on very rare occasions it takes time $2^{n}$. This problem might have an average time that is polynomial, but a worst case time that is exponential, so the problem would not be in P. The simplex algorithm is an example of a practical worst-case exponential algorithm.
- It only considers deterministic solutions. There might be a problem that can be solved quickly if a tiny error probability is acceptable, but is much harder to solve exactly. The problem might not belong to P even though in practice it can be solved quickly. This is a common approach to attack problems in NP not known to be in P (see RP, BPP). Even if P = BPP, as many researchers believe, it is often considerably easier to find probabilistic algorithms.
- New computing models such as quantum computers may be able to quickly solve some problems not known to be in P, though quantum algorithms have not to date solved any NP-hard problem in polynomial time. However, the definitions of P and NP are in terms of classical computing models like Turing machines. Therefore, even if a quantum computer algorithm were discovered to efficiently solve an NP-hard problem, we would only have a way of physically solving difficult problems quickly, not a proof that the mathematical classes P and NP are equal.
- Advances in technology may make exponential-time algorithms efficient for practical ranges of problem sizes.
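The points about constant factors and exponents can be checked numerically. The sketch below compares the base-10 logarithm of the step counts for the running times mentioned in the list; the sample values of `n` are arbitrary choices for illustration.

```python
import math

def log10_steps(n):
    """Return log10 of the step count for four hypothetical running times."""
    return {
        "10^100 * n": math.log10(10**100 * n),
        "n^100": 100 * math.log10(n),
        "1.004^n": n * math.log10(1.004),
        "2^n": n * math.log10(2),
    }

for n in (2, 100, 1000, 3000):
    row = ", ".join(
        f"{name} is about 10^{value:.1f}" for name, value in log10_steps(n).items()
    )
    print(f"n = {n}: {row}")
```

For $n = 2$ the polynomial $n^{100}$ already exceeds $10^{30}$ steps, while the exponential $1.004^{n}$ stays below $10^{6}$ steps even at $n = 3000$.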
Why do many computer scientists think P ≠ NP?
Most computer scientists believe that P≠NP. A key reason for this belief is that after decades of studying these problems, no one has been able to find a polynomial-time algorithm for any NP-hard problem. These algorithms were sought long before the concept of NP-completeness was even known (Karp's 21 NP-complete problems, among the first found, were all well-known existing problems at the time they were shown to be NP-complete). Furthermore, the result P = NP would imply many other startling results that are currently believed to be false, such as NP = co-NP and P = PH.[citation needed]
It is also intuitively argued that the existence of problems that are hard to solve but for which the solutions are easy to verify matches real-world experience.[citation needed]
On the other hand, some researchers believe that we are overconfident in P ≠ NP and should explore proofs of P = NP as well. For example, in 2002 these statements were made:[4]
The main argument in favour of P≠NP is the total lack of fundamental progress in the area of exhaustive search. This is, in my opinion, a very weak argument. The space of algorithms is very large and we are only at the beginning of its exploration. [. . .] The resolution of Fermat's Last Theorem also shows that very simply [sic] questions may be settled only by very deep theories.
Being attached to a speculation is not a good guide to research planning. One should always try both directions of every problem. Prejudice has caused famous mathematicians to fail to solve famous problems whose solution was opposite to their expectations, even though they had developed all the methods required.
Consequences of proof
One of the reasons the problem attracts so much attention is the consequences of the answer.[citation needed]
A proof of P = NP could have stunning practical consequences, if the proof leads to efficient methods for solving some of the important problems in NP. Various NP-complete problems are fundamental in many fields. There are enormous positive consequences that would follow from rendering tractable many currently mathematically intractable problems. For instance, many problems in operations research are NP-complete, such as some types of integer programming, and the travelling salesman problem, to name two of the most famous examples. Efficient solutions to these problems would have enormous implications for logistics. Many other important problems, such as some problems in protein structure prediction, are also NP-complete;[8] if these problems were solvable efficiently it could spur considerable advances in biology.
But such changes may pale in significance compared to the revolution an efficient method for solving NP-complete problems would cause in mathematics itself. According to Stephen Cook,[9]
...it would transform mathematics by allowing a computer to find a formal proof of any theorem which has a proof of a reasonable length, since formal proofs can easily be recognized in polynomial time. Example problems may well include all of the CMI prize problems.
Research mathematicians spend their careers trying to prove theorems, and some proofs have taken decades or even centuries to find after problems have been stated; for instance, Fermat's Last Theorem took over three centuries to prove. A method that is guaranteed to find a proof of a theorem, should one of "reasonable" size exist, would essentially end this struggle.[citation needed]
A proof that showed that P ≠ NP, while lacking the practical computational benefits of a proof that P = NP, would also represent a massive advance in computational complexity theory and provide guidance for future research. It would allow one to show in a formal way that many common problems cannot be solved efficiently, so that the attention of researchers can be focused on partial solutions or solutions to other problems. Due to widespread belief in P ≠ NP, much of this focusing of research has already taken place.[citation needed]
Results about difficulty of proof
A million-dollar prize and a huge amount of dedicated research with no substantial results suggest that the problem is difficult.[citation needed] There have also been some formal results demonstrating why the problem might be difficult to solve.
One of the most frequently-cited is a result involving oracles. Suppose there is a magical machine called an oracle that can solve only one problem, such as determining if a given number is prime, but can solve it in constant time. Our new question is now, if there is no limit on the number of times that the oracle can be used, are there problems that can be verified in polynomial time that cannot be solved in polynomial time? Depending on the problem that the oracle solves, with certain oracles P = NP, while for other oracles P ≠ NP. The practical consequence of this is that any proof which can be modified to account for the existence of these oracles cannot solve the problem. Unfortunately, most known methods and nearly all classical methods can be modified in such a way (we say they are relativizing).[citation needed]
Furthermore, a 1993 result by Alexander Razborov and Steven Rudich showed that, given a certain credible assumption, proofs that are "natural" in a certain sense cannot solve the P = NP problem (see natural proof). This demonstrated that some of the most seemingly-promising methods of the time were also unlikely to succeed. As more theorems of this kind are proved, a potential proof of the theorem has more and more traps to avoid.[citation needed]
This is actually another reason why NP-complete problems are useful: if a polynomial-time algorithm can be demonstrated for an NP-complete problem, this would solve the P = NP problem in a way which is not excluded by the above results.[citation needed]
Polynomial-time algorithms
No one knows whether polynomial-time algorithms exist for NP-complete languages. But if such algorithms do exist, some of them are already known.[citation needed] For example, the following algorithm (due to Levin) correctly accepts an NP-complete language, but as of 2008, it is unknown how long it takes in general.
// Algorithm that accepts the NP-complete language SUBSET-SUM.
//
// This is a polynomial-time algorithm if and only if P=NP.
//
// "Polynomial-time" means it returns "yes" in polynomial time when
// the answer should be "yes", and runs forever when it is "no".
//
// Input: S = a finite set of integers
// Output: "yes" if any nonempty subset of S adds up to 0.
//         Runs forever with no output otherwise.
// Note: "Program number P" is the program obtained by
//       writing the integer P in binary, then
//       considering that string of bits to be a
//       program. Every possible program can be
//       generated this way, though most do nothing
//       because of syntax errors.
FOR N = 1...infinity
    FOR P = 1...N
        Run program number P for N steps with input S
        IF the program outputs a nonempty list of distinct integers
           AND the integers are all in S
           AND the integers sum to 0
        THEN
            OUTPUT "yes" and HALT
If, and only if, P = NP, then this is a polynomial-time algorithm accepting an NP-complete language. "Accepting" means it gives "yes" answers in polynomial time, but is allowed to run forever when the answer is "no".[citation needed]
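The scheduling pattern of the algorithm above (run program P for N steps, for growing N) can be imitated in a toy setting. The sketch below is only an illustration under strong assumptions: instead of enumerating every possible program, it uses a short hypothetical list `PROGRAMS` of Python generators, each yielding `None` once per simulated step and then its output. It therefore says nothing about running time in the real algorithm, and the `max_rounds` parameter is an addition that lets the demonstration stop on "no" instances.

```python
from itertools import combinations, count

def is_valid_certificate(candidate, s):
    """The IF test: a nonempty list of distinct integers from s summing to 0."""
    return (
        isinstance(candidate, list)
        and len(candidate) > 0
        and len(set(candidate)) == len(candidate)
        and set(candidate) <= s
        and sum(candidate) == 0
    )

# Hypothetical stand-ins for "program number P".
def program_loops_forever(s):
    while True:
        yield None

def program_outputs_garbage(s):
    yield None
    yield [1, 2, 3]

def program_brute_force(s):
    items = sorted(s)
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield None  # count one step per subset examined
            if sum(subset) == 0:
                yield list(subset)
                return

PROGRAMS = [program_loops_forever, program_outputs_garbage, program_brute_force]

def run_for_steps(program, s, steps):
    """Restart program on s; return its output if one appears within `steps` steps."""
    execution = program(s)
    for _ in range(steps):
        result = next(execution, None)
        if result is not None:
            return result
    return None

def universal_search(s, max_rounds=None):
    """Dovetail over PROGRAMS; loops forever on 'no' instances unless max_rounds is set."""
    rounds = count(1) if max_rounds is None else range(1, max_rounds + 1)
    for n in rounds:
        for p in range(min(n, len(PROGRAMS))):
            if is_valid_certificate(run_for_steps(PROGRAMS[p], s, n), s):
                return "yes"
    return None  # reached only when max_rounds is given

print(universal_search({-2, -3, 15, 14, 7, -10}))
print(universal_search({1, 2, 3}, max_rounds=50))
```

A program that never halts or that prints a wrong answer cannot mislead the search, because every output is checked by the verifier before "yes" is reported.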
Perhaps we want to "solve" the SUBSET-SUM problem, rather than just "accept" the SUBSET-SUM language. That means we want the algorithm to always halt and return a "yes" or "no" answer. As of 2008, it is unknown whether an algorithm exists that can do this in polynomial time. But if such algorithms do exist, then we already know some of them; for example, the IF statement in the above algorithm can be replaced with this:[citation needed]
IF the program outputs a complete math proof
   AND each step of the proof is legal
   AND the conclusion is that S does (or does not) have a subset summing to 0
THEN
    OUTPUT "yes" (or "no") and HALT
Logical characterizations
The P = NP problem can be restated in terms of the expressibility of certain classes of logical statements. All languages in P can be expressed in first-order logic with the addition of a least fixed point operator and an order relation (effectively, this allows the definition of recursive functions). Similarly, NP is the set of languages expressible in existential second-order logic; that is, second-order logic restricted to exclude universal quantification over relations, functions, and subsets. The languages in the polynomial hierarchy, PH, correspond to all of second-order logic. Thus, the question "is P a proper subset of NP" can be reformulated as "is existential second-order logic able to describe languages that first-order logic with least fixed point cannot?"[citation needed]
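As an illustration of the existential second-order characterization, graph 3-colorability (a standard NP-complete problem, not discussed above) can be written as a single such sentence over graphs with edge relation $E$. The relation symbols $R$, $G$ and $B$ are names chosen here for the three color classes, and the graph is assumed to have no self-loops.

```latex
\exists R\,\exists G\,\exists B\;\Bigl[
  \forall x\,\bigl(R(x)\lor G(x)\lor B(x)\bigr)
  \;\land\;
  \forall x\,\forall y\,\Bigl(E(x,y)\rightarrow
     \neg\bigl(R(x)\land R(y)\bigr)\land
     \neg\bigl(G(x)\land G(y)\bigr)\land
     \neg\bigl(B(x)\land B(y)\bigr)\Bigr)
\Bigr]
```

The existentially quantified relations play the role of the certificate, and the first-order part inside the brackets plays the role of the verifier.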
Further reading
- A. S. Fraenkel and D. Lichtenstein, Computing a perfect strategy for n*n chess requires time exponential in n, Proc. 8th Int. Coll. Automata, Languages, and Programming, Springer LNCS 115 (1981) 278-293 and J. Comb. Th. A 31 (1981) 199-214.
- E. Berlekamp and D. Wolfe, Mathematical Go: Chilling Gets the Last Point, A. K. Peters, 1994. D. Wolfe, Go endgames are hard, MSRI Combinatorial Game Theory Research Worksh., 2000.
- Neil Immerman. Languages Which Capture Complexity Classes. 15th ACM STOC Symposium, pp.347-354. 1983.
- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein (2001). "Chapter 34: NP-Completeness", Introduction to Algorithms, Second Edition, MIT Press and McGraw-Hill, pp.966–1021. ISBN 0-262-03293-7.
- Christos Papadimitriou (1993). "Chapter 14: On P vs. NP", Computational Complexity, 1st edition, Addison Wesley, pp.329–356. ISBN 0-201-53082-1.