
# Knapsack problem

The knapsack problem is a problem in combinatorial optimization. It derives its name from the problem of choosing the most valuable set of essentials that can fit into one bag to be carried on a trip. Given a set of items, each with a cost and a value, determine the number of each item to include in a collection so that the total cost is less than or equal to a given limit and the total value is as large as possible.

A similar problem often appears in business, combinatorics, complexity theory, cryptography and applied mathematics.

The decision problem form of the knapsack problem is the question "can a value of at least V be achieved without exceeding the cost C?"

## Definition

In the following, we have $n$ kinds of items, 1 through $n$. Each item $j$ has a value $p_j$ and a weight $w_j$. The maximum weight that we can carry in the bag is $c$.

The 0-1 knapsack problem restricts the number of each kind of item, $x_j$, to zero or one.

Mathematically the 0-1 knapsack problem can be formulated as:
maximize $\sum_{j=1}^n p_j x_j$
subject to $\sum_{j=1}^n w_j x_j \le c, \quad x_j = 0 \text{ or } 1, \quad j = 1, \dots, n.$
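
To make the formulation concrete, the following is a minimal brute-force sketch in Python that enumerates every assignment of the $x_j$ and keeps the best one that satisfies the weight constraint. The function name `knapsack_01_brute_force` and the sample data are illustrative assumptions, not part of the original text, and the approach takes time exponential in $n$, so it is only suitable for very small instances.

```python
from itertools import product

def knapsack_01_brute_force(values, weights, capacity):
    """Check every x in {0,1}^n and return (best value, best x).

    Hypothetical helper for illustration only: exponential in n.
    """
    best_value = 0
    best_x = tuple(0 for _ in values)  # taking nothing is always feasible
    for x in product((0, 1), repeat=len(values)):
        total_weight = sum(w * xj for w, xj in zip(weights, x))
        if total_weight > capacity:
            continue  # violates sum(w_j * x_j) <= c
        total_value = sum(p * xj for p, xj in zip(values, x))
        if total_value > best_value:
            best_value, best_x = total_value, x
    return best_value, best_x

# Made-up instance: p = (60, 100, 120), w = (10, 20, 30), c = 50.
print(knapsack_01_brute_force([60, 100, 120], [10, 20, 30], 50))
# → (220, (0, 1, 1))
```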

The bounded knapsack problem restricts the number of each item to a maximum value.

Mathematically the bounded knapsack problem can be formulated as:
maximize $\sum_{j=1}^n p_j x_j$
subject to $\sum_{j=1}^n w_j x_j \le c, \quad 0 \le x_j \le b_j, \quad j = 1, \dots, n.$

The unbounded knapsack problem places no bounds on the number of each item.

Of particular interest is the special case of the problem with these properties:

• It is a decision problem
• It is a 0/1 problem
• For each item, the weight equals the value: $w_j = p_j$.

Notice that in this special case, the problem is equivalent to this: given a set of integers, does any subset of it add up to exactly C? Or, if negative costs are allowed and C is chosen to be zero, the problem is: given a set of integers, does any subset add up to exactly 0? This special case is called the subset sum problem. In the field of cryptography the term knapsack problem is often used to refer specifically to the subset sum problem.
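
As an illustration of the subset sum problem, here is a small Python sketch that tracks every sum reachable by a non-empty subset. The function name `subset_sum` and the sample inputs are assumptions made for this example. The set of reachable sums can grow exponentially with the number of integers, so this is a demonstration of the question being asked rather than an efficient method.

```python
def subset_sum(numbers, target):
    """Return True if some non-empty subset of `numbers` sums to `target`.

    Hypothetical helper for illustration; negative integers are allowed.
    """
    reachable = set()  # sums of non-empty subsets of the numbers seen so far
    for x in numbers:
        # Either start a new subset with x, or extend an existing one by x.
        reachable |= {s + x for s in reachable} | {x}
    return target in reachable

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # → True  (4 + 5)
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # → False
print(subset_sum([-7, -3, -2, 5, 8], 0))     # → True  (-3 - 2 + 5)
```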

The knapsack problem is often solved using dynamic programming, though no polynomial-time algorithm is known for the general problem. Both the general knapsack problem and the subset sum problem are NP-hard, and this has led to attempts to use subset sum as the basis for public key cryptography systems, such as Merkle-Hellman. These attempts typically used some group other than the integers. Merkle-Hellman and several similar algorithms were later broken, because the particular subset sum problems they produced were in fact solvable by polynomial-time algorithms.

The decision version of the knapsack problem described above ("can a value of at least V be achieved without exceeding the cost C?") is NP-complete. The subset-sum version of the knapsack problem is commonly known as one of Karp's 21 NP-complete problems.

The knapsack problem is considered one of the easiest NP-complete problems to solve. Indeed, empirical complexity is of the order of $O((\log n)^2)$ and very large problems can be solved very quickly; for example, in 2003 the average time required to solve instances with $n = 10{,}000$ was below 14 milliseconds using commodity personal computers.

## Dynamic programming solution

The knapsack problem can be solved in pseudo-polynomial time using dynamic programming. The following depicts a dynamic programming solution for the unbounded knapsack problem.

Let the costs be $c_1, \dots, c_n$ and the corresponding values $v_1, \dots, v_n$. We wish to maximize total value subject to the constraint that total cost is less than or equal to $C$. Then for each $i \le C$, define $A(i)$ to be the maximum value that can be attained with total cost less than or equal to $i$. $A(C)$ is then the solution to the problem.

Observe that A(i) has the following properties:

• $A(0) = 0$
• $A(i) = \max \{ v_j + A(i - c_j) \mid c_j \le i \}$.

Here the maximum of the empty set is taken to be zero. Tabulating the results from $A(0)$ up through $A(C)$ gives the solution. Since the calculation of each $A(i)$ involves examining $n$ items (for which the required earlier entries $A(i - c_j)$ have already been computed), and there are $C$ values of $A(i)$ to calculate, the running time of the dynamic programming solution is $O(nC)$. Dividing the costs $c_1, \dots, c_n$ by their greatest common divisor and adjusting $C$ accordingly is an obvious optimization; a similar argument applies to the values $v_1, \dots, v_n$.
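
The tabulation described above can be sketched in Python as follows. This is a minimal illustration that assumes the costs and the limit are positive integers; the function name `unbounded_knapsack` and the sample data are not from the original text.

```python
def unbounded_knapsack(costs, values, capacity):
    """Maximum total value with total cost <= capacity, items reusable.

    Assumes positive integer costs (an assumption of this sketch).
    """
    # best[i] plays the role of A(i); best[0] = A(0) = 0.
    best = [0] * (capacity + 1)
    for i in range(1, capacity + 1):
        for c, v in zip(costs, values):
            if c <= i:  # item fits within cost budget i
                best[i] = max(best[i], v + best[i - c])
        # If no item fits, best[i] stays 0 (maximum of the empty set).
    return best[capacity]

# Made-up instance: costs (1, 3, 4, 5), values (10, 40, 50, 70), limit 8.
print(unbounded_knapsack([1, 3, 4, 5], [10, 40, 50, 70], 8))  # → 110
```

The two nested loops make the $O(nC)$ running time visible: the outer loop runs $C$ times and the inner loop examines each of the $n$ items once.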

The O(nC) complexity does not contradict the fact that the knapsack problem is NP-complete, since C, unlike n, is not polynomial in the length of the input to the problem. The length of the input to the problem is proportional to the number of bits in C, not to C itself.
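
As a rough worked calculation of this point, assume $C \ge 1$ is written in binary using $b$ bits (the symbol $b$ is introduced here only for illustration):

$$b = \lfloor \log_2 C \rfloor + 1 \quad\Longrightarrow\quad 2^{b-1} \le C < 2^{b}, \qquad\text{so}\qquad n \cdot 2^{b-1} \le nC < n \cdot 2^{b}.$$

Adding one bit to the encoding of $C$ can therefore roughly double the size of the table, so a running time of $O(nC)$ is exponential in the bit length of $C$ even though it is linear in the numeric value of $C$.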

A similar dynamic programming solution for the 0-1 knapsack problem also runs in pseudo-polynomial time. As above, let the costs be $c_1, \dots, c_n$ and the corresponding values $v_1, \dots, v_n$. We wish to maximize total value subject to the constraint that total cost is less than or equal to $C$. Define $A(i, j)$ to be the maximum value that can be attained with cost less than or equal to $j$ using only items $1$ through $i$.

We can define A(i,j) recursively as follows:

• $A(0, j) = 0$
• $A(i, 0) = 0$
• $A(i, j) = A(i - 1, j)$ if $c_i > j$
• $A(i, j) = \max(A(i - 1, j),\ v_i + A(i - 1, j - c_i))$ if $c_i \le j$.

The solution can then be found by calculating A(n, C). To do this efficiently we can use a table to store previous computations. This solution will therefore run in O(nC) time and O(nC) space, though with some slight modifications we can reduce the space complexity to O(C).
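
One possible form of the $O(C)$-space variant is sketched below in Python. It is an illustration under the assumption of non-negative integer costs, and the function name `knapsack_01` and the sample data are hypothetical. A single row is reused for every item, and the budget is scanned from high to low so that each entry still holds the previous item's value when it is read.

```python
def knapsack_01(costs, values, capacity):
    """Maximum total value with total cost <= capacity, each item used at most once.

    Assumes non-negative integer costs (an assumption of this sketch).
    """
    # Before item i is processed, best[j] holds A(i - 1, j); afterwards A(i, j).
    best = [0] * (capacity + 1)
    for c, v in zip(costs, values):
        # Scan downwards so best[j - c] has not yet been updated for this
        # item, which prevents the item from being counted twice.
        for j in range(capacity, c - 1, -1):
            best[j] = max(best[j], v + best[j - c])
    return best[capacity]

# Made-up instance: costs (10, 20, 30), values (60, 100, 120), limit 50.
print(knapsack_01([10, 20, 30], [60, 100, 120], 50))  # → 220
```

Scanning the budget upwards instead would allow an item to be reused, which is the unbounded variant rather than the 0-1 one.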

## Greedy approximation algorithm

George Dantzig (1957) proposed a greedy approximation algorithm to solve the unbounded knapsack problem. His version sorts the items in decreasing order of value per unit of weight, $p_j / w_j$. It then proceeds to insert them into the sack, starting with as many as possible of the first item (the one with the greatest ratio), then as many as possible of the next, and so on until there is no longer space in the sack for more. Provided that any number of each item is available, if $k$ is the maximum value of items that fit into the sack, then the greedy algorithm is guaranteed to achieve a value of at least $k/2$. For the bounded problem, however, where only a given set of items is available, the algorithm may be much further from optimal.
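
The greedy procedure can be sketched in Python as follows. This is an illustrative reading of the description above rather than Dantzig's original formulation; it assumes positive integer weights, and the function name `greedy_unbounded` and the sample data are hypothetical.

```python
def greedy_unbounded(values, weights, capacity):
    """Greedy heuristic for the unbounded knapsack problem.

    Returns (total value, copies taken of each item).
    Assumes positive integer weights (an assumption of this sketch).
    """
    # Item indices in decreasing order of value per unit of weight.
    order = sorted(range(len(values)),
                   key=lambda j: values[j] / weights[j],
                   reverse=True)
    counts = [0] * len(values)
    total_value = 0
    remaining = capacity
    for j in order:
        take = remaining // weights[j]  # as many copies as still fit
        counts[j] = take
        total_value += take * values[j]
        remaining -= take * weights[j]
    return total_value, counts

# Made-up instance where greedy is not optimal:
# values (9, 5), weights (6, 4), capacity 8.
print(greedy_unbounded([9, 5], [6, 4], 8))  # → (9, [1, 0])
```

In this instance the optimum is 10 (two copies of the second item), so the greedy value of 9 falls short of optimal but stays within the $k/2$ guarantee.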