SMP systems allow any processor to work on any task no matter where the data for that task are located in memory; with proper operating system support, SMP systems can easily move tasks between processors to balance the workload efficiently.
SMP represents one of the earliest styles of multiprocessor machine achitectures, typically used for building smaller computers with up to 8 processors. Larger computer systems might use newer architectures such as NUMA (Non-Uniform Memory Access), which dedicates different memory banks to different processors. In a NUMA architecture, processors may access local memory quickly and remote memory more slowly. This can dramatically improve memory throughput as long as the data is localized to specific processes (and thus processors). On the downside, NUMA makes the cost of moving data from one processor to another, as in workload balancing, more expensive. The benefits of NUMA are limited to particular workloads, notably on servers where the data is often associated strongly with certain tasks or users.
Other systems include asymmetric multiprocessing (ASMP), which uses separate specialized processors for specific tasks, and computer clustered multiprocessing (such as Beowulf), in which not all memory is available to all processors.
Examples of ASMP include many media processor chips that are a relatively slow base processor assisted by a number of hardware accelerator cores. High-powered 3D chipsets in modern videocards could be considered a form of asymmetric multiprocessing. Clustering techniques are used fairly extensively to build very large supercomputers. In this discussion a single processor is denoted as a uni processor (UN).
SMP has many uses in science, industry, and business which often use custom-programmed software for multithreaded processing. However, most consumer products such as word processors and computer games are written in such a manner that they cannot gain large benefits from SMP systems. For games this is usually because writing a program to increase performance on SMP systems can produce a performance loss on uniprocessor systems. Recently, however, multi-core chips have become the norm in new computers, and the balance between installed uni- and multi-core computers may change in the coming years.
The nature of the different programming methods would generally require two separate code-trees to support both uniprocessor and SMP systems with maximum performance. Programs running on SMP systems may experience a performance increase even when they have been written for uniprocessor systems. This is because hardware interrupts that usually suspend program execution while the kernel handles them can run on an idle processor instead. The effect in most applications (e.g. games) is not so much a performance increase as the appearance that the program is running much more smoothly. In some applications, particularly compilers and some distributed computing projects, one will see an improvement by a factor of (nearly) the number of additional processors.
In situations where more than one program runs at the same time, an SMP system will have considerably better performance than a uni-processor because different programs can run on different CPUs simultaneously.
Systems programmers must build support for SMP into the operating system: otherwise, the additional processors remain idle and the system functions as a uniprocessor system.
In cases where an SMP environment processes many jobs, administrators often experience a loss of hardware efficiency. Software programs have been developed to schedule jobs so that the processor utilization reaches its maximum potential. Good software packages can achieve this maximum potential by scheduling each CPU separately, as well as being able to integrate multiple SMP machines and clusters.
Access to RAM is serialized; this and cache coherency issues causes performance to lag slightly behind the number of additional processors in the system.
Before about 2006, entry-level servers and workstations with two processors dominated the SMP market. With the introduction of dual-core devices, SMP is found in most new desktop machines and in many laptop machines. The most popular entry-level SMP systems use the x86 instruction set architecture and are based on Intel’s Xeon, Pentium D, Core Duo, and Core 2 Duo based processors or AMD’s Athlon64 X2, Quad FX or Opteron 200 and 2000 series processors. Servers use those processors and other readily available non-x86 processor choices including the Sun Microsystems UltraSPARC, Fujitsu SPARC64, SGI MIPS, SGI f1200, Intel Itanium, Hewlett Packard PA-RISC, Hewlett-Packard (merged with Compaq which acquired first Digital Equipment Corporation) DEC Alpha, IBM POWER and Apple Computer PowerPC (specifically G4 and G5 series, as well as earlier PowerPC 604 and 604e series) processors. In all cases, these systems are available in uniprocessor versions as well.
Earlier SMP systems used motherboards that have two or more CPU sockets. More recently, microprocessor manufacturers introduced CPU devices with two or more processors in one device, for example, the POWER, UltraSPARC, Opteron, Athlon, Core 2, and Xeon all have multi-core variants. Athlon and Core 2 Duo multiprocessors are socket-compatible with uniprocessor variants, so an expensive dual socket motherboard is no longer needed to implement an entry-level SMP machine. It should also be noted that dual socket Opteron designs are technically ccNUMA designs, though they can be programmed as SMP for a slight loss in performance.
The Burroughs B5500 first implemented SMP in 1961. It was implemented later on other mainframes. Mid-level servers, using between four and eight processors, can be found using the Intel Xeon MP, AMD Opteron 800 and 8000 series and the above-mentioned UltraSPARC, SPARC64, MIPS, Itanium, PA-RISC, Alpha and POWER processors. High-end systems, with sixteen or more processors, are also available with all of the above processors.
Sequent Computer Systems built large SMP machines using Intel 80386 (and later 80486) processors. Some smaller 80486 systems existed, but the major x86 SMP market began with the Intel Pentium technology supporting up to two processors. The Intel Pentium Pro expanded SMP support with up to four processors natively, and systems with as many as two thousand Pentium Pro processors were built. (ASCI Red had 4,510.) Later, special versions of the Intel Pentium II, and Intel Pentium III processors allowed dual CPU systems. In 2001 AMD released their Athlon MP, or MultiProcessor CPU, together with the 760MP motherboard chipset as their first offering in the dual processor marketplace. This was followed by the Intel Pentium II Xeon and Intel Pentium III Xeon processors which could be used with up to four processors in a system natively. Although several much larger systems were built, they were all limited by the physical memory addressing limitation of 64 GiB. With the introduction of 64-bit memory addressing on the AMD64 Opteron in 2003 and EM64T Xeon in 2005, systems are able to address much larger amounts of memory; their addressable limitation of 16 EiB is not expected to be reached in the foreseeable future.