The knapsack problem is a problem in combinatorial optimization. It derives its name from the following maximization problem: choosing the best set of essentials that can fit into one bag to be carried on a trip. Given a set of items, each with a cost and a value, determine the number of each item to include in a collection so that the total cost does not exceed a given limit and the total value is as large as possible.
The decision problem form of the knapsack problem is the question "can a value of at least V be achieved without exceeding the cost C?"
The 0-1 knapsack problem restricts the number of each kind of item to zero or one.
The bounded knapsack problem restricts the number of each item to a maximum value.
The unbounded knapsack problem places no bounds on the number of each item.
Of particular interest is the special case of the problem with these properties:
- it is a decision problem,
- it is a 0-1 problem,
- for each item, the cost equals the value.
Notice that in this special case, the problem is equivalent to this: given a set of integers, does any subset of it add up to exactly C? Or, if negative costs are allowed and C is chosen to be zero, the problem is: given a set of integers, does any nonempty subset add up to exactly 0? This special case is called the subset sum problem. In the field of cryptography the term knapsack problem is often used to refer specifically to the subset sum problem.
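For concreteness, the following minimal Python sketch (function name and example data are illustrative, not taken from any standard library) states the subset sum problem and notes how it reads as such a knapsack instance: each integer is an item whose cost and value are both that integer, and the question is whether value C is achievable without exceeding cost C. The brute-force search shown here is only for illustration and takes exponential time.

from itertools import combinations

def subset_sum_bruteforce(numbers, target):
    """Return True if some nonempty subset of `numbers` sums to exactly `target`.

    Read as a knapsack instance: each number is an item whose cost and
    value are both that number, the cost limit and the value goal are
    both `target`, and we ask whether the goal is reachable without
    exceeding the limit.
    """
    for r in range(1, len(numbers) + 1):
        for combo in combinations(numbers, r):
            if sum(combo) == target:
                return True
    return False

print(subset_sum_bruteforce([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5)
print(subset_sum_bruteforce([3, 34, 4, 12, 5, 2], 30))  # False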
The knapsack problem is often solved using dynamic programming, though no polynomial-time algorithm is known for the general problem. Both the general knapsack problem and the subset sum problem are NP-hard, and this has led to attempts to use subset sum as the basis for public key cryptography systems, such as Merkle-Hellman. These attempts typically used some group other than the integers. Merkle-Hellman and several similar algorithms were later broken, because the particular subset sum problems they produced were in fact solvable by polynomial-time algorithms.
The decision version of the knapsack problem described above ("can a value of at least V be achieved without exceeding the cost C?") is NP-complete. The subset-sum version of the knapsack problem is commonly known as one of Karp's 21 NP-complete problems.
The knapsack problem is considered one of the easiest NP-complete problems to solve. Indeed, the empirical complexity is of the order of O((log n)^2), and very large problems can be solved very quickly; e.g., in 2003 the average time required to solve instances with n = 10,000 was below 14 milliseconds using commodity personal computers.
If all costs are positive integers, the knapsack problem can be solved in pseudo-polynomial time using dynamic programming. The following solution applies to the unbounded case, in which each item may be used any number of times. Let the costs be c1, ..., cn and the corresponding values v1, ..., vn. We wish to maximize total value subject to the constraint that total cost is less than or equal to C. Then for each i ≤ C, define A(i) to be the maximum value that can be attained with total cost less than or equal to i. A(C) is then the solution to the problem.
Observe that A(i) has the following properties:
A(0) = 0
A(i) = max { vj + A(i − cj) : 1 ≤ j ≤ n, cj ≤ i }   for i > 0
Here the maximum of the empty set is taken to be zero. Tabulating the results from A(0) up through A(C) gives the solution. Since the calculation of each A(i) involves examining n items (and all earlier values A(0), ..., A(i − 1) have already been computed), and there are C values of A(i) to calculate, the running time of the dynamic programming solution is O(nC). Dividing the costs c1, ..., cn by their greatest common divisor and adjusting C accordingly is an obvious optimization; a similar argument applies to the values v1, ..., vn when the dynamic program is indexed by value instead of cost.
The O(nC) complexity does not contradict the fact that the knapsack problem is NP-complete, since C, unlike n, is not polynomial in the length of the input to the problem. The length of the input to the problem is proportional to the number of bits in C, not to C itself.
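As an illustration, here is a minimal Python sketch of the dynamic program described above for the unbounded knapsack problem, assuming positive integer costs (the function name and the example data are illustrative assumptions, not part of the original description):

def unbounded_knapsack(costs, values, C):
    """Return A(C): the maximum value attainable with total cost <= C,
    where each item may be used any number of times."""
    A = [0] * (C + 1)                       # A(0) = 0
    for i in range(1, C + 1):
        # A(i) = max{ vj + A(i - cj) : cj <= i }, with the empty max taken as 0
        A[i] = max(
            (values[j] + A[i - costs[j]]
             for j in range(len(costs)) if costs[j] <= i),
            default=0,
        )
    return A[C]

# Illustrative example: best is two items of cost 2 and one of cost 3 (value 11).
print(unbounded_knapsack([2, 3, 4], [3, 5, 6], 7))  # 11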
A similar dynamic programming solution for the 0-1 knapsack problem also runs in pseudo-polynomial time. As above, let the costs be c1, ..., cn and the corresponding values v1, ..., vn. We wish to maximize total value subject to the constraint that total cost is less than or equal to C. Define A(i, j) to be the maximum value that can be attained with total cost less than or equal to j using only the first i items.
We can define A(i, j) recursively as follows:
A(0, j) = 0
A(i, j) = A(i − 1, j)   if ci > j
A(i, j) = max( A(i − 1, j), vi + A(i − 1, j − ci) )   if ci ≤ j
The solution can then be found by calculating A(n, C). To do this efficiently we can use a table to store previous computations. This solution will therefore run in O(nC) time and O(nC) space, though with some slight modifications we can reduce the space complexity to O(C).
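A minimal Python sketch of this 0-1 dynamic program, using the space-reduced O(C) table mentioned above, might look as follows (the function name and example data are illustrative):

def knapsack_01(costs, values, C):
    """Return A(n, C): the maximum value attainable with total cost <= C,
    using each item at most once."""
    A = [0] * (C + 1)                       # holds row A(i - 1, .) of the table
    for ci, vi in zip(costs, values):
        # Iterate j downwards so that A[j - ci] still refers to row i - 1,
        # which is what ensures each item is used at most once.
        for j in range(C, ci - 1, -1):
            A[j] = max(A[j], vi + A[j - ci])
    return A[C]

# Illustrative example: best is the items of cost 3 and cost 4 (value 11).
print(knapsack_01([2, 3, 4], [3, 5, 6], 7))  # 11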