'64-bit' CPUs have existed in supercomputers since the 1960s and in RISC-based workstations and servers since the early 1990s. In 2003 they were introduced to the (previously 32-bit) mainstream personal computer arena, in the form of the x86-64 and 64-bit PowerPC processor architectures.
A CPU that is 64-bit internally might have external data buses or address buses with a different size, either larger or smaller; the term "64-bit" is often used to describe the size of these buses as well. For instance, many machines with 32-bit processors use 64-bit buses (e.g. the original Pentium and later CPUs), and may occasionally be referred to as "64-bit" for this reason. Likewise, some 16-bit processors (for instance, the MC68000) were referred to as 16-/32-bit processors, as they had 16-bit buses but some internal 32-bit capabilities. The term may also refer to the size of an instruction in the computer's instruction set or to any other datum (e.g. 64-bit double-precision floating-point quantities are common). Without further qualification, a "64-bit" computer architecture generally has integer registers that are 64 bits wide, which allows it to support (both internally and externally) 64-bit "chunks" of integer data.
Nearly all common general-purpose processors (with the notable exception of most ARM and 32-bit MIPS implementations) have integrated floating-point hardware, which may or may not use 64-bit registers to hold data for processing. For example, the x86 architecture includes the x87 floating-point instructions, which use eight 80-bit registers in a stack configuration; later revisions of x86 also include SSE instructions, which use eight 128-bit wide registers. By contrast, the 64-bit Alpha family of processors defines thirty-two 64-bit wide floating-point registers in addition to its thirty-two 64-bit wide integer registers.
However, by the early 1990s, the continual reductions in the cost of memory led to installations with quantities of RAM approaching 4 GB, and the use of virtual memory spaces exceeding the 4-gigabyte ceiling became desirable for handling certain types of problems. In response, a number of companies began releasing new families of chips with 64-bit architectures, initially for supercomputers and high-end workstation and server machines. 64-bit computing has gradually drifted down to the personal computer desktop, with some models in Apple's Macintosh line switching to PowerPC 970 processors (termed "G5" by Apple) in 2003, with 64-bit x86-64 processors reaching PCs the same year (with the launch of the AMD Athlon 64), and with x86-64 processors since becoming common in high-end PCs.
The emergence of the 64-bit architecture effectively increases the memory ceiling to 2^64 addresses, equivalent to approximately 17.2 billion gigabytes, 16.8 million terabytes, or 16 exabytes of RAM. To put this in perspective, in the days when 4 MB of main memory was commonplace, the maximum memory ceiling of 2^32 addresses was about 1,000 times larger than typical memory configurations. Today, when over 2 GB of main memory is common, the ceiling of 2^64 addresses is about ten billion times larger, i.e. ten million times more headroom than the 2^32 case.
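The ratios quoted above can be checked with a little power-of-two arithmetic. The following is a minimal sketch; the function name headroom_ratio is hypothetical, and memory sizes are assumed to be exact powers of two (4 MB = 2^22 bytes, 2 GB = 2^31 bytes).

```c
#include <stdint.h>

/* Ratio between an address-space ceiling of 2^address_bits bytes and an
   installed memory size of 2^installed_bits bytes.
   Precondition: 0 <= address_bits - installed_bits <= 63. */
uint64_t headroom_ratio(unsigned address_bits, unsigned installed_bits)
{
    /* 2^a / 2^b = 2^(a - b) */
    return (uint64_t)1 << (address_bits - installed_bits);
}
```

With these assumptions, headroom_ratio(32, 22) gives 1,024 (the "about 1,000 times" figure) and headroom_ratio(64, 31) gives 8,589,934,592 (the "about ten billion times" figure); the quotient of the two is 2^23, or roughly eight million.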
Most 64-bit microprocessors on the market today have an artificial limit on the amount of memory they can address, because physical constraints make it highly unlikely that one will need support for the full 16.8 million terabyte capacity. For example, the AMD Athlon X2 has a 40-bit address bus and recognizes only 48 bits of the 64-bit virtual address. The newer Barcelona X4 supports 48 bits of physical address and 48 bits of the 64-bit virtual address.
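On x86-64, a processor that recognizes only 48 bits of the virtual address requires the unused upper bits to be copies of bit 47 (the so-called canonical form). The check can be sketched as follows; the helper name is_canonical is hypothetical and the function is an illustration, not code from any particular operating system.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if only the low `bits` bits of `addr` are significant, i.e. every
   bit from position (bits - 1) upward has the same value.
   Precondition: 1 <= bits <= 64. */
bool is_canonical(uint64_t addr, unsigned bits)
{
    uint64_t upper    = addr >> (bits - 1);        /* bit (bits-1) and above */
    uint64_t all_ones = UINT64_MAX >> (bits - 1);  /* same field, all set    */
    return upper == 0 || upper == all_ones;
}
```

Under this rule, with bits set to 48, the addresses 0x00007FFFFFFFFFFF and 0xFFFF800000000000 are accepted, while 0x0000800000000000 is rejected.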
64-bit processors can perform certain tasks (such as computing factorials of large numbers) up to about twice as fast as the same work in a 32-bit environment. (This example is derived from a comparison between the 32-bit and 64-bit Windows Calculator; the difference is noticeable for the factorial of, say, 100,000.) This gives a rough indication of the theoretical potential of applications optimized for 64-bit operation.
One significant exception to this is the AS/400, whose software runs on a virtual ISA, called TIMI (Technology Independent Machine Interface) which is translated to native machine code by low-level software before being executed. The low-level software is all that has to be rewritten to move the entire OS and all software to a new platform, such as when IBM transitioned their line from the older 32/48-bit "IMPI" instruction set to 64-bit PowerPC (IMPI wasn't anything like 32-bit PowerPC, so this was an even bigger transition than from a 32-bit version of an instruction set to a 64-bit version of the same instruction set).
While 64-bit architectures indisputably make working with large data sets in applications such as digital video, scientific computing, and large databases easier, there has been considerable debate as to whether they or their 32-bit compatibility modes will be faster than comparably priced 32-bit systems for other tasks. On the x86-64 architecture (AMD64), the majority of 32-bit operating systems and applications run smoothly on 64-bit hardware.
Sun's 64-bit Java virtual machines are slower to start up than their 32-bit virtual machines because Sun has only implemented the "server" JIT compiler (C2) for 64-bit platforms. The "client" JIT compiler (C1), which produces less efficient code but compiles much faster, is unavailable on 64-bit platforms.
Speed is not the only factor to consider in a comparison of 32-bit and 64-bit processors. Applications such as multi-tasking, stress testing, and clustering for high-performance computing (HPC) may be better suited to a 64-bit architecture, given the correct deployment. For this reason, 64-bit clusters have been widely deployed in large organizations such as IBM, HP and Microsoft.
The main disadvantage of 64-bit architectures is that relative to 32-bit architectures the same data occupies more space in memory (due to swollen pointers and possibly other types and alignment padding). This increases the memory requirements of a given process and can have implications for efficient processor cache utilization. Maintaining a partial 32-bit model is one way to handle this and is in general reasonably effective. In fact, the highly performance-oriented z/OS operating system takes this approach currently, requiring program code to reside in any number of 32-bit address spaces while data objects can (optionally) reside in 64-bit regions.
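The effect of wider pointers on data size can be illustrated with a pointer-heavy structure. This is a minimal sketch; struct node and node_overhead are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical doubly linked list node: two pointers and a small payload. */
struct node {
    struct node *next;
    struct node *prev;
    int value;
};

/* Bytes per node that are not payload: the two pointers plus any
   alignment padding the compiler inserts after `value`. */
size_t node_overhead(void)
{
    return sizeof(struct node) - sizeof(int);
}
```

On a typical 32-bit target, sizeof(struct node) is 12 bytes; on a typical LP64 target it is 24 bytes (two 8-byte pointers, a 4-byte int, and 4 bytes of padding), so the same list occupies twice as much memory and cache.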
Currently, most commercial x86 software is written in 32-bit code, not 64-bit code, so it does not take advantage of the larger 64-bit address space, the wider 64-bit registers and data paths on x86 processors, or the additional registers available in 64-bit mode. However, users of most RISC platforms, and users of free or open source operating systems, have been able to use exclusively 64-bit computing environments for years. Not all such applications require a large address space or manipulate 64-bit data items, so not all of them would benefit from the larger address space or the wider registers and data paths. The main advantage of 64-bit versions of such applications is the ability to access more registers in the x86-64 architecture.
Because device drivers in operating systems with monolithic kernels, and in many operating systems with hybrid kernels, execute within the operating system kernel, it is possible to run the kernel as a 32-bit process while still supporting 64-bit user processes. This provides the memory and performance benefits of 64-bit for users without breaking binary compatibility with existing 32-bit device drivers, at the cost of some additional overhead within the kernel. This is the mechanism by which Mac OS X enables 64-bit processes while still supporting 32-bit device drivers.
To avoid this mistake in C and C++, the sizeof operator can be used to determine the size of these primitive types if decisions based on their size need to be made, both at compile time and at run time. The <limits.h> header in the C99 standard and the numeric_limits class in the <limits> header in the C++ standard give more detailed information, since sizeof only returns the size in chars. This used to be misleading, because the standards leave the definition of the CHAR_BIT macro, and therefore the number of bits in a char, to the implementation. However, except for compilers targeting DSPs, "64 bits == 8 chars of 8 bits each" has become the norm.
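Combining sizeof with CHAR_BIT gives the size of a type in bits rather than in chars. The macro and function names below are hypothetical; note that the result counts every bit of the object representation, including any padding bits an unusual implementation might have.

```c
#include <limits.h>
#include <stddef.h>

/* Size in bits of the object representation of a type or expression. */
#define BITS_OF(x) (sizeof(x) * CHAR_BIT)

/* Convenience wrappers for the two types most often assumed equal. */
size_t pointer_bits(void) { return BITS_OF(void *); }
size_t long_bits(void)    { return BITS_OF(long); }
```

A program that needs to store pointers in a long can compare pointer_bits() with long_bits() instead of assuming that they match.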
One needs to be careful to use the ptrdiff_t type (in the standard header <stddef.h>) for the result of subtracting two pointers; too much code incorrectly uses "int" or "long" instead. To represent a pointer (rather than a pointer difference) as an integer, use uintptr_t where available (it is only defined in C99, but some compilers otherwise conforming to an earlier version of the standard offer it as an extension).
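A short sketch of both recommendations follows. The function names are hypothetical, and uintptr_t is assumed to be provided by <stdint.h>, as on C99 implementations that define it.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of elements between two pointers into the same array.
   Pointer subtraction yields ptrdiff_t, so the result is not truncated
   on platforms where pointers are wider than int. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    return last - first;
}

/* Round-trip a pointer through an integer type wide enough to hold it. */
void *round_trip(void *p)
{
    uintptr_t as_integer = (uintptr_t)p;
    return (void *)as_integer;
}
```

Replacing ptrdiff_t with int, or uintptr_t with unsigned long, would still compile on most systems, but it could silently lose the upper bits under a data model where those types are narrower than a pointer.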
Neither C nor C++ defines the length of a pointer, int, or long to be a specific number of bits. C99, however, defines several dedicated integer types with an exact number of bits.
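As an illustration, the exact-width types from the C99 header <stdint.h> let code state the widths it relies on. The function combine is a hypothetical example, not part of any standard library.

```c
#include <stdint.h>

/* Build a 64-bit value from two 32-bit halves. The widths are fixed by
   the types themselves, so the result is the same under any data model. */
uint64_t combine(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}
```

Writing the same function with unsigned long would give different results depending on whether long is 32 or 64 bits wide.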
In most programming environments on 32-bit machines, pointers, "int" types, and "long" types are all 32 bits wide.
However, in many programming environments on 64-bit machines, "int" variables are still 32 bits wide, but "long"s and pointers are 64 bits wide. These are described as having an LP64 data model. One alternative is the ILP64 data model, in which all three data types are 64 bits wide, or even SILP64, where "short" variables are also 64 bits wide. Another alternative is the LLP64 model, which maintains compatibility with 32-bit code by leaving both int and long as 32-bit. "LL" refers to the "long long" type, which is at least 64 bits on all platforms, including 32-bit environments. In most cases the modifications required to move code to a 64-bit data model are relatively minor and straightforward, and many well-written programs can simply be recompiled for the new environment without changes.
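The data model in effect can be inferred from the widths of the basic types. The sketch below uses a hypothetical function name and assumes the type has no padding bits, so that sizeof multiplied by CHAR_BIT equals its width.

```c
#include <limits.h>
#include <stddef.h>

/* Name of the data model the program was compiled for, judged from the
   widths of short, int, long, and pointers. */
const char *data_model_name(void)
{
    size_t s = sizeof(short)  * CHAR_BIT;
    size_t i = sizeof(int)    * CHAR_BIT;
    size_t l = sizeof(long)   * CHAR_BIT;
    size_t p = sizeof(void *) * CHAR_BIT;

    if (i == 32 && l == 32 && p == 32) return "ILP32";
    if (i == 32 && l == 32 && p == 64) return "LLP64";
    if (i == 32 && l == 64 && p == 64) return "LP64";
    if (i == 64 && l == 64 && p == 64) return s == 64 ? "SILP64" : "ILP64";
    return "other";
}
```

Portable code should rarely need such a check; it is mainly useful for diagnostics when porting.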
Many 64-bit compilers today use the LP64 model (including Solaris, AIX, HP, Linux, Mac OS X, FreeBSD, and IBM z/OS native compilers). Microsoft's VC++ compiler uses the LLP64 model. The disadvantage of the LP64 model is that storing a long into an int may overflow. On the other hand, casting a pointer to a long will work. In the LLP64 model, the reverse is true. These problems do not affect fully standard-compliant code, but code is often written with implicit assumptions about the widths of integer types.
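Code that must store a long into an int can test the value first instead of assuming the two types have the same range. A minimal sketch, with a hypothetical function name:

```c
#include <limits.h>
#include <stdbool.h>

/* True if `value` can be stored in an int without overflow. Under models
   where int and long have the same range (for example LLP64) this always
   holds; under LP64 it fails for values outside the int range. */
bool fits_in_int(long value)
{
    return value >= INT_MIN && value <= INT_MAX;
}
```

The caller decides what to do when the check fails, for example reporting an error instead of silently truncating.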
Note that a programming model is a choice made on a per-compiler basis, and several can coexist on the same OS. Typically, however, the programming model chosen as the primary model by the OS API dominates.
Another consideration is the data model used for drivers. Drivers make up the majority of the operating system code in most modern operating systems (although many may not be loaded when the operating system is running). Many drivers use pointers heavily to manipulate data, and in some cases have to load pointers of a certain size into the hardware they support for DMA. As an example, a driver for a 32-bit PCI device asking the device to DMA data into upper areas of a 64-bit machine's memory could not satisfy requests from the operating system to load data from the device to memory above the 4 gigabyte barrier, because the pointers for those addresses would not fit into the DMA registers of the device. This problem is solved by having the OS take the memory restrictions of the device into account when generating requests to drivers for DMA, or by using an IOMMU.
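The restriction described above amounts to a range check on the physical address of the buffer. The sketch below is hypothetical and does not correspond to any real kernel API; address_bits stands for the number of address bits the device can drive (32 for the PCI device in the example).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a device limited to `address_bits` address bits can reach every
   byte of the buffer [phys, phys + len). */
bool dma_reachable(uint64_t phys, uint64_t len, unsigned address_bits)
{
    uint64_t limit = address_bits >= 64 ? UINT64_MAX
                                        : ((uint64_t)1 << address_bits) - 1;
    uint64_t last = len ? phys + (len - 1) : phys;  /* last byte touched */

    if (last < phys)       /* the range wrapped around 2^64 */
        return false;
    return last <= limit;
}
```

When the check fails, the operating system has to place the buffer in lower memory or remap it through an IOMMU before handing the request to the driver.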
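|Data model||short||int||long||long long||pointers|
|LLP64||16||32||32||64||64|
|LP64||16||32||64||64||64|
|ILP64||16||64||64||64||64|
|SILP64||64||64||64||64||64|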
Most 64-bit processor architectures can execute code for the 32-bit version of the architecture natively without any performance penalty. This kind of support is commonly called biarch support or more generally multi-arch support.