Humans usually use this algorithm in base 10, computers in base 2 (where multiplying by the single digit of the multiplier reduces to a simple series of logical AND operations). Humans will write down all the products and then add them together; computers (and abacus operators) will sum the products as soon as each one is computed. Many computers do not have a "multiply" instruction -- on those computers, programmers implement the long multiplication algorithm in software, using shift and add instructions. Some chips implement this algorithm for various integer and floating-point sizes in hardware or in microcode.
To multiply two numbers with n digits using this method, one needs about n² operations. More formally: using a natural size metric of number of digits, the time complexity of multiplying two n-digit numbers using long multiplication is Θ(n²).
When implemented in software, long multiplication algorithms have to deal with overflow during additions, which can be expensive. For this reason, a typical approach is to represent the number in a small base b such that, for example, 8b² is a representable machine integer (Richard Brent used this approach in his Fortran package MP); we can then perform several additions before having to deal with overflow. When a digit becomes too large, we add part of it to the result or carry it into the next digit, and map the remaining part back to a number less than b; this process is called normalization.
        23958233
            5830 ×
    ------------
        00000000    (= 23,958,233 ×     0)
       71874699     (= 23,958,233 ×    30)
      191665864     (= 23,958,233 ×   800)
     119791165      (= 23,958,233 × 5,000)
    ------------
    139676498390    (= 139,676,498,390)
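The procedure above, including the idea of summing products as soon as they are computed and normalizing only afterwards, can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `long_multiply` and its `base` parameter are assumptions made here, and because Python integers never overflow, the deferred normalization only mimics what a fixed-width implementation would need.

```python
def long_multiply(x: int, y: int, base: int = 10) -> int:
    """Schoolbook multiplication of non-negative integers (illustrative sketch)."""

    def to_digits(v: int) -> list[int]:
        # Little-endian digit list: index 0 is the least significant digit.
        digits = []
        while v:
            v, d = divmod(v, base)
            digits.append(d)
        return digits or [0]

    a, b = to_digits(x), to_digits(y)
    columns = [0] * (len(a) + len(b))   # column sums, not yet normalized

    for j, bj in enumerate(b):          # one partial product per multiplier digit
        for i, ai in enumerate(a):
            columns[i + j] += ai * bj   # add into the running total immediately

    # Normalization: keep the part below `base`, carry the rest to the next column.
    carry = 0
    for k in range(len(columns)):
        carry, columns[k] = divmod(columns[k] + carry, base)

    return sum(d * base**k for k, d in enumerate(columns))


print(long_multiply(23958233, 5830))
```

With two n-digit operands the nested loops perform n² digit multiplications, which matches the Θ(n²) bound stated earlier.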
Let n be the total number of bits in the two input numbers. Long multiplication has the advantage that it can easily be formulated as a log space algorithm; that is, an algorithm that only needs working space proportional to the logarithm of the number of digits in the input (Θ(log n)). This is the double logarithm of the numbers being multiplied themselves (log log N). We don't include the input or output bits in this measurement, since that would trivially make the space requirement linear; instead we make the input bits read-only and the output bits write-only. (In other words, only memory that can be both read and written counts toward the space bound.)
The method is simple: we add the columns right-to-left, keeping track of the carry as we go. We don't have to store the columns to do this. To show this, let the ith bit from the right of the first and second operands be denoted a_i and b_i respectively, both starting at i = 0, and let r_i be the ith bit from the right of the result. Then:

$$r_i = \Bigl(c + \sum_{j+k=i} a_j b_k\Bigr) \bmod 2$$

where c is the carry from the previous column. Provided neither c nor the total sum exceeds log space, we can implement this formula in log space, since the indexes j and k each have O(log n) bits.
A simple inductive argument shows that the carry can never exceed n and the total sum for r_i can never exceed 2n: the carry into the first column is zero, and for all other columns, there are at most n bits in the column, and a carry of at most n coming in from the previous column (by the induction hypothesis). Their sum is at most 2n, and the carry to the next column is at most half of this, or n. Thus both these values can be stored in O(log n) bits.
In pseudocode, the log-space algorithm is:
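multiply(a[0..n-1], b[0..n-1])    // arrays holding the bits, least significant first
    x ← 0                          // running column sum, including the carry
    for i from 0 to 2n-1
        for j from max(0, i-n+1) to min(i, n-1)    // keeps both j and k within 0..n-1
            k ← i - j
            x ← x + (a[j] × b[k])
        result[i] ← x mod 2
        x ← floor(x/2)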
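The pseudocode can be turned into runnable Python fairly directly. The sketch below is an illustration under stated assumptions: the name `multiply_bits` is chosen here, both inputs are taken to be bit lists of equal length n with the least significant bit first, and the inner loop is restricted so that both indexes stay inside the arrays. Note that Python does not enforce the log-space bound; only the variable `x` and the loop indexes play the role of working memory.

```python
def multiply_bits(a: list[int], b: list[int]) -> list[int]:
    """Column-by-column binary multiplication (bits are least significant first)."""
    n = len(a)
    assert len(b) == n, "this sketch assumes operands of equal length"
    result = [0] * (2 * n)
    x = 0  # column sum plus incoming carry; never exceeds 2n
    for i in range(2 * n):
        # Only pairs (j, k) with j + k == i and both indexes in 0..n-1 contribute.
        for j in range(max(0, i - n + 1), min(i, n - 1) + 1):
            k = i - j
            x += a[j] * b[k]
        result[i] = x % 2   # output bit for this column
        x //= 2             # carry into the next column
    return result


# 11 = 1011b and 3 = 0011b, written least significant bit first.
print(multiply_bits([1, 1, 0, 1], [1, 1, 0, 0]))
```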
Lattice, or sieve, multiplication is algorithmically equivalent to long multiplication. It requires the preparation of a lattice (a grid drawn on paper) which guides the calculation and separates all the multiplications from the additions. It was introduced to Europe in 1202 in Fibonacci's Liber Abaci. Fibonacci (Leonardo of Pisa) described the operation as mental, using his right and left hands to carry the intermediate calculations. Napier's bones, or Napier's rods, also used this method, as published by Napier in 1617, the year of his death.
As shown in the example, the multiplicand and multiplier are written above and to the right of a lattice, or a sieve. The method is found in Muhammad ibn Musa al-Khwarizmi's "Arithmetic", one of Leonardo's sources mentioned by Sigler, author of "Fibonacci's Liber Abaci" (2002).
     2   3   9   5   8   2   3   3
   +---+---+---+---+---+---+---+---+-
   |1 /|1 /|4 /|2 /|4 /|1 /|1 /|1 /|
   | / | / | / | / | / | / | / | / | 5
 01|/ 0|/ 5|/ 5|/ 5|/ 0|/ 0|/ 5|/ 5|
   +---+---+---+---+---+---+---+---+-
   |1 /|2 /|7 /|4 /|6 /|1 /|2 /|2 /|
   | / | / | / | / | / | / | / | / | 8
 02|/ 6|/ 4|/ 2|/ 0|/ 4|/ 6|/ 4|/ 4|
   +---+---+---+---+---+---+---+---+-
   |0 /|0 /|2 /|1 /|2 /|0 /|0 /|0 /|
   | / | / | / | / | / | / | / | / | 3
 17|/ 6|/ 9|/ 7|/ 5|/ 4|/ 6|/ 9|/ 9|
   +---+---+---+---+---+---+---+---+-
   |0 /|0 /|0 /|0 /|0 /|0 /|0 /|0 /|
   | / | / | / | / | / | / | / | / | 0
 24|/ 0|/ 0|/ 0|/ 0|/ 0|/ 0|/ 0|/ 0|
   +---+---+---+---+---+---+---+---+-
     26  15  13  18  17  13  09  00

 01
 002
 0017
 00024
 000026
 0000015
 00000013
 000000018
 0000000017
 00000000013
 000000000009
 0000000000000
 =============
  139676498390

 = 139,676,498,390
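The lattice procedure can be sketched in Python as follows. The function names `lattice_diagonals` and `lattice_multiply` are assumptions made for this illustration. Each grid cell holds the two-digit product of one multiplicand digit and one multiplier digit; the units digit falls on one diagonal and the tens digit on the next diagonal to the left, and the diagonal sums are then added with the appropriate shifts, as in the diagram.

```python
def lattice_diagonals(x: int, y: int) -> list[int]:
    """Return the diagonal sums of the lattice, least significant diagonal first."""
    top = [int(d) for d in str(x)]    # multiplicand digits, written above the grid
    side = [int(d) for d in str(y)]   # multiplier digits, written down the right side
    diagonals = [0] * (len(top) + len(side))
    for r, s in enumerate(side):
        for c, t in enumerate(top):
            tens, units = divmod(t * s, 10)   # the two halves of one cell
            # Place value of this cell, counted from the bottom-right corner.
            p = (len(top) - 1 - c) + (len(side) - 1 - r)
            diagonals[p] += units
            diagonals[p + 1] += tens
    return diagonals


def lattice_multiply(x: int, y: int) -> int:
    # Adding the shifted diagonal sums performs all the carrying at once.
    return sum(s * 10**p for p, s in enumerate(lattice_diagonals(x, y)))


print(lattice_diagonals(23958233, 5830))
print(lattice_multiply(23958233, 5830))
```

Read from right to left, the first list reproduces the diagonal sums written around the lattice in the example.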
In base 2, long multiplication reduces to a nearly trivial operation. For each '1' bit in the multiplier, shift the multiplicand an appropriate amount and then sum the shifted values. Depending on computer processor architecture and choice of multiplier, it may be faster to code this algorithm using hardware bit shifts and adds rather than depend on multiplication instructions, when the multiplier is fixed and the number of adds required is small.
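A minimal Python sketch of this shift-and-add scheme is shown below. The function name `shift_add_multiply` is an assumption made for illustration, and the operands are taken to be non-negative integers.

```python
def shift_add_multiply(multiplicand: int, multiplier: int) -> int:
    """Multiply non-negative integers using only shifts, bit tests and additions."""
    product = 0
    shift = 0
    while multiplier >> shift:                   # bits of the multiplier remain
        if (multiplier >> shift) & 1:            # this bit is a '1'
            product += multiplicand << shift     # add the shifted multiplicand
        shift += 1
    return product


print(shift_add_multiply(23958233, 5830))
```

The number of additions equals the number of '1' bits in the multiplier, which is why the approach pays off mainly for fixed multipliers with few set bits.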
This algorithm is also known as Peasant multiplication, because it has been widely used among those who are unschooled and thus have not memorized the multiplication tables required by long multiplication. The algorithm was also in use in ancient Egypt.
On paper, write down in one column the numbers you get when you repeatedly halve the multiplier, ignoring the remainder; in a column beside it repeatedly double the multiplicand. Cross out each row in which the last digit of the first number is even, and add the remaining numbers in the second column to obtain the product.
The main advantages of this method are that it can be taught quickly, no memorization is required, and it can be performed using tokens such as poker chips if paper and pencil are not available. It does however take more steps than long multiplication so it can be unwieldy when large numbers are involved.
    11    3
     5    6
     2   12   (crossed out)
     1   24
        ---
         33
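Describing the steps explicitly:
11 and 3 are written at the top.
11 is halved (5.5) and 3 is doubled (6). The fractional part is discarded (5.5 becomes 5).
5 is halved (2.5) and 6 is doubled (12). The fractional part is discarded (2.5 becomes 2). The figure in the left column (2) is even, so the figure in the right column (12) is crossed out.
2 is halved (1) and 12 is doubled (24).
The values that are not crossed out are summed: 3 + 6 + 24 = 33.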
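The method works because multiplication is distributive, so:

$$3 \times 11 = 3 \times (1 \cdot 2^0 + 1 \cdot 2^1 + 0 \cdot 2^2 + 1 \cdot 2^3) = 3 \times (1 + 2 + 8) = 3 + 6 + 24 = 33$$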
A more complicated example, using the figures from the earlier examples (23,958,233 and 5,830):
    5830      23958233   (crossed out)
    2915      47916466
    1457      95832932
     728     191665864   (crossed out)
     364     383331728   (crossed out)
     182     766663456   (crossed out)
      91    1533326912
      45    3066653824
      22    6133307648   (crossed out)
      11   12266615296
       5   24533230592
       2   49066461184   (crossed out)
       1   98132922368
          ------------
          139676498390
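The halving and doubling table can be generated mechanically. The sketch below is an illustration only; the name `peasant_multiply` and the shape of the returned rows are assumptions made here. It returns both the product and the table, with a flag marking which rows are kept (odd left-hand value) rather than crossed out.

```python
def peasant_multiply(multiplier: int, multiplicand: int):
    """Return (product, rows); each row is (halved value, doubled value, kept?)."""
    rows = []
    total = 0
    while multiplier >= 1:
        kept = multiplier % 2 == 1          # rows with an even left value are crossed out
        rows.append((multiplier, multiplicand, kept))
        if kept:
            total += multiplicand
        multiplier //= 2                    # halve, ignoring the remainder
        multiplicand *= 2                   # double
    return total, rows


product, rows = peasant_multiply(11, 3)
for left, right, kept in rows:
    print(f"{left:>4} {right:>4}" + ("" if kept else "   (crossed out)"))
print(product)
```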
Bigger numbers x1x2 can be split into two parts x1 and x2. Then the method works analogously. To compute these three products of m-digit numbers, we can employ the same trick again, effectively using recursion. Once the numbers are computed, we need to add them together (step 5.), which takes about n operations.
Karatsuba multiplication has a time complexity of Θ(n^(log₂ 3)). The exponent log₂ 3 is approximately 1.585, so this method is significantly faster than long multiplication. Because of the overhead of recursion, Karatsuba's multiplication is slower than long multiplication for small values of n; typical implementations therefore switch to long multiplication if n is below some threshold.
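The recursion can be sketched in Python using the standard Karatsuba identity: if x = x1·2^m + x2 and y = y1·2^m + y2, then xy = x1y1·2^(2m) + ((x1 + x2)(y1 + y2) − x1y1 − x2y2)·2^m + x2y2, which needs only three half-size products. The function name `karatsuba`, the binary split point, and the `threshold` parameter are assumptions made for this illustration, and the built-in `*` stands in for long multiplication below the threshold.

```python
def karatsuba(x: int, y: int, threshold: int = 2**64) -> int:
    """Multiply non-negative integers with three recursive half-size products."""
    if x < threshold or y < threshold:
        return x * y                         # small case: stand-in for long multiplication
    m = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << m) - 1
    x1, x2 = x >> m, x & mask                # x = x1 * 2**m + x2
    y1, y2 = y >> m, y & mask                # y = y1 * 2**m + y2
    high = karatsuba(x1, y1, threshold)      # x1 * y1
    low = karatsuba(x2, y2, threshold)       # x2 * y2
    # (x1 + x2)(y1 + y2) - high - low equals x1*y2 + x2*y1.
    middle = karatsuba(x1 + x2, y1 + y2, threshold) - high - low
    return (high << (2 * m)) + (middle << m) + low


print(karatsuba(23958233, 5830, threshold=16))
```

The small threshold in the usage line only forces the recursion to run on a short example; a practical cutoff would be tuned by measurement.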
The Karatsuba method was later called ‘divide and conquer’; other names for it in current use are ‘binary splitting’ and the ‘dichotomy principle’.
The appearance of the ‘divide and conquer’ method was the starting point of the theory of fast multiplication. A number of authors (among them Toom, Cook and Schönhage) continued to look for a multiplication algorithm with complexity close to the optimal one, and 1971 saw the construction of the Schönhage-Strassen algorithm, which for several decades had the best known upper bound for M(n), the number of operations needed to multiply two n-digit numbers.
The Karatsuba ‘divide and conquer’ approach is the most fundamental and general fast method. Hundreds of different algorithms have been built on it. The best known among them are the algorithms based on the Fast Fourier Transform (FFT) and fast matrix multiplication.
Although using more and more parts can reduce the time spent on recursive multiplications further, the overhead from additions and digit management also grows. For this reason, the method of Fourier transforms is typically faster for numbers with several thousand digits, and asymptotically faster for even larger numbers.
Computers can quickly use bit-wise shifts to multiply (shift left) and divide (shift right) by powers of two. A simple extension multiplies by 10 by first multiplying by five (one two-bit left shift and one addition) and then applying another one-bit left shift. The technique can also be extended to floating point numbers with proper adjustment of the exponent and mantissa. It was used by spreadsheet programs to speed up the formatting of decimal numbers on early personal computers.
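As a small illustration of the multiply-by-10 trick for integers, here is a Python sketch; the function name `times_ten` is an assumption made for this example.

```python
def times_ten(x: int) -> int:
    """Multiply an integer by 10 using two shifts and one addition."""
    times_five = (x << 2) + x    # 4x + x: one two-bit left shift and one addition
    return times_five << 1       # 2 * 5x: one more one-bit left shift


print(times_ten(7))
```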
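Writing the two operands digit by digit in some base B as a = Σ a_i B^i and b = Σ b_j B^j (the symbol B is used here only for illustration), we can then say that

$$ab = \sum_{i=0}^{m} \sum_{j=0}^{m} a_i b_j B^{i+j} = \sum_{k=0}^{2m} c_k B^k, \qquad c_k = \sum_{i+j=k} a_i b_j,$$

by setting b_j = 0 and a_i = 0 for j, i > m, k = i + j and {c_k} as the convolution of {a_i} and {b_j}. Using the convolution theorem, ab can be computed by taking the Fourier transforms of {a_i} and {b_j}, multiplying the two transforms entry by entry, applying the inverse transform to obtain {c_k}, and finally carrying any part of c_k that exceeds the base into c_{k+1}.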
For decades the fastest known method based on this idea was the one described in 1971 by Schönhage and Strassen (the Schönhage-Strassen algorithm), which has a time complexity of Θ(n ln(n) ln(ln(n))).
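The convolution idea can be illustrated with a short floating-point FFT in Python. This is a plain radix-2 sketch for small inputs, not the Schönhage-Strassen algorithm itself; the names `fft` and `fft_multiply`, and the choice of decimal digits as coefficients, are assumptions made here. Rounding the inverse transform is only safe while floating-point error stays well below 0.5, which holds for operands of modest size.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_multiply(x: int, y: int) -> int:
    a = [int(d) for d in reversed(str(x))]   # decimal digits, least significant first
    b = [int(d) for d in reversed(str(y))]
    size = 1
    while size < len(a) + len(b):            # room for every coefficient of the product
        size *= 2
    fa = fft(a + [0] * (size - len(a)))      # zero-pad, then transform
    fb = fft(b + [0] * (size - len(b)))
    pointwise = [p * q for p, q in zip(fa, fb)]
    conv = fft(pointwise, invert=True)
    coeffs = [round(c.real / size) for c in conv]   # the convolution {c_k}
    # Summing c_k * 10**k performs the carry propagation.
    return sum(c * 10**k for k, c in enumerate(coeffs))


print(fft_multiply(23958233, 5830))
```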
Applications of this algorithm include the Great Internet Mersenne Prime Search (GIMPS).
Using number-theoretic transforms instead of discrete Fourier transforms avoids rounding error problems by using modular arithmetic instead of complex numbers.
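The same convolution can be carried out exactly with a number-theoretic transform. The sketch below assumes the commonly used prime 998244353 = 119·2^23 + 1 with primitive root 3; these constants and the names `ntt` and `ntt_multiply` are choices made for this illustration. The result is only valid while every convolution coefficient stays below the modulus and the transform length is at most 2^23.

```python
MOD = 998244353   # prime of the form 119 * 2**23 + 1 (an assumption of this sketch)
ROOT = 3          # a primitive root modulo MOD


def ntt(values, invert=False):
    """Recursive radix-2 transform over the integers modulo MOD."""
    n = len(values)               # must be a power of two dividing 2**23
    if n == 1:
        return list(values)
    even = ntt(values[0::2], invert)
    odd = ntt(values[1::2], invert)
    w_n = pow(ROOT, (MOD - 1) // n, MOD)      # element of order n: a modular "root of unity"
    if invert:
        w_n = pow(w_n, MOD - 2, MOD)          # modular inverse via Fermat's little theorem
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % MOD
        out[k] = (even[k] + t) % MOD
        out[k + n // 2] = (even[k] - t) % MOD
        w = w * w_n % MOD
    return out


def ntt_multiply(x: int, y: int) -> int:
    a = [int(d) for d in reversed(str(x))]
    b = [int(d) for d in reversed(str(y))]
    size = 1
    while size < len(a) + len(b):
        size *= 2
    fa = ntt(a + [0] * (size - len(a)))
    fb = ntt(b + [0] * (size - len(b)))
    conv = ntt([p * q % MOD for p, q in zip(fa, fb)], invert=True)
    inv_size = pow(size, MOD - 2, MOD)
    coeffs = [c * inv_size % MOD for c in conv]   # exact integers, no rounding needed
    return sum(c * 10**k for k, c in enumerate(coeffs))


print(ntt_multiply(23958233, 5830))
```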
Advanced algorithms: