Definitions

Arithmetic logic unit

[ar-ith-met-ik loj-ik]

In computing, an arithmetic logic unit (ALU) is a digital circuit that performs arithmetic and logical operations. The ALU is a fundamental building block of the central processing unit (CPU) of a computer, and even the simplest microprocessors contain one for purposes such as maintaining timers. The processors found inside modern CPUs and graphics processing units (GPUs) have inside them very powerful and very complex ALUs; a single component may contain a number of ALUs.

Mathematician John von Neumann proposed the ALU concept in 1945, when he wrote a report on the foundations for a new computer called the EDVAC.

Early development

In 1946, von Neumann worked with his colleagues in designing a computer for the Princeton Institute of Advanced Studies (IAS). The IAS computer became the prototype for many later computers. In the proposal, von Neumann outlined what he believed would be needed in his machine, including an ALU.

Von Neumann stated that an ALU is a necessity for a computer because it is guaranteed that a computer will have to compute basic mathematical operations, including addition, subtraction, multiplication, and division. He therefore believed it was "reasonable that [the computer] should contain specialized organs for these operations.

Numerical systems

An ALU must process numbers using the same format as the rest of the digital circuit. For modern processors, that almost always is the two's complement binary number representation. Early computers used a wide variety of number systems, including one's complement, sign-magnitude format, and even true decimal systems, with ten tubes per digit.

ALUs for each one of these numeric systems had different designs, and that influenced the current preference for two's complement, as this is the representation that makes it easier for the ALUs to calculate additions and subtractions.

Practical overview

Most of a processor's operations are performed by one or more ALUs. An ALU loads data from input registers, executes, and stores the result into an output register. A Control Unit tells the ALU what operation to perform on the data. Other mechanisms move data between these registers and memory.

Simple operations

Most ALUs can perform the following operations:

Complex operations

An engineer can design an ALU to calculate any operation, however complicated it is; the problem is that the more complex the operation, the more expensive the ALU is, the more space it uses in the processor, and the more power it dissipates, etc.

Therefore, engineers always calculate a compromise, to provide for the processor (or other circuits) an ALU powerful enough to make the processor fast, but yet not so complex as to become prohibitive. Imagine that you need to calculate the square root of a number; the digital engineer will examine the following options to implement this operation:

1. Design an extraordinarily complex ALU that calculates the square root of any number in a single step. This is called calculation in a single clock.
2. Design a very complex ALU that calculates the square root of any number in several steps. But--and here's the trick--the intermediate results go through a series of circuits that are arranged in a line, like a factory production line. That makes the ALU capable of accepting new numbers to calculate even before finished calculating the previous ones. That makes the ALU able to produce numbers as fast as a single-clock ALU, although the results start to flow out of the ALU only after an initial delay. This is called calculation pipeline.
3. Design a complex ALU that calculates the square root through several steps. This is called interactive calculation, and usually relies on control from a complex control unit with built-in microcode.
4. Design a simple ALU in the processor, and sell a separate specialized and costly processor that the customer can install just beside this one, and implements one of the options above. This is called the co-processor.
5. Tell the programmers that there is no co-processor and there is no emulation, so they will have to write their own algorithms to calculate square roots by software. This is performed by software libraries.
6. Emulate the existence of the co-processor, that is, whenever a program attempts to perform the square root calculation, make the processor check if there is a co-processor present and use it if there is one; if there isn't one, interrupt the processing of the program and invoke the operating system to perform the square root calculation through some software algorithm. This is called software emulation.

The options above go from the fastest and most expensive one to the slowest and least expensive one. Therefore, while even the simplest computer can calculate the most complicated formula, the simplest computers will usually take a long time doing that because of the several steps for calculating the formula.

Powerful processors like the Intel Core and AMD64 implement option #1 for several simple operations, #2 for the most common complex operations and #3 for the extremely complex operations. That is possible by the ability of building very complex ALUs in these processors.

Inputs and outputs

The inputs to the ALU are the data to be operated on (called operands) and a code from the control unit indicating which operation to perform. Its output is the result of the computation.

In many designs the ALU also takes or generates as inputs or outputs a set of condition codes from or to a status register. These codes are used to indicate cases such as carry-in or carry-out, overflow, divide-by-zero, etc.

ALUs vs. FPUs

A Floating Point Unit also performs arithmetic operations between two values, but they do so for numbers in floating point representation, which is much more complicated than the two's complement representation used in a typical ALU. In order to do these calculations, a FPU has several complex circuits built-in, including some internal ALUs.

Usually engineers call an ALU the circuit that performs arithmetic operations in integer formats (like two's complement and BCD), while the circuits that calculate on more complex formats like floating point, complex numbers, etc. usually receive a more illustrious name.