In computer science, a stack machine is a model of computation in which the computer's memory takes the form of one or more stacks. The term also refers to an actual computer implementing or simulating the idealized stack machine.
The term can also refer to a real or simulated machine with a "0-operand" instruction set. In such a machine, most instructions implicitly operate on values at the top of the stack and replace those values with the result. Typically such machines also have "load" and "store" instructions that read from and write to arbitrary RAM locations. (Like all other instructions, the "load" and "store" instructions in a typical stack machine need no operands: they always take the RAM address from the top of the stack.)
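The behaviour described above can be illustrated with a minimal sketch of a toy interpreter. The instruction names, the use of bare integers as literal pushes, and the operand order of "store" (address on top, value beneath it) are assumptions made for this example and are not taken from any particular machine.

```python
def run(program, ram):
    """Run a toy 0-operand program; returns the final stack (illustrative only)."""
    stack = []
    for instr in program:
        if isinstance(instr, int):
            # Assumed convention: a bare integer is a literal push.
            stack.append(instr)
        elif instr == "load":
            # Pop an address, push the RAM word stored there.
            addr = stack.pop()
            stack.append(ram[addr])
        elif instr == "store":
            # Pop an address, then a value, and write the value to RAM.
            addr = stack.pop()
            value = stack.pop()
            ram[addr] = value
        elif instr == "add":
            # Replace the top two values with their sum.
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown instruction: {instr!r}")
    return stack


ram = [0] * 8
ram[0], ram[1] = 3, 4
# Computes ram[2] = ram[0] + ram[1]; no instruction names a register or address operand.
program = [0, "load", 1, "load", "add", 2, "store"]
final_stack = run(program, ram)
```

In this sketch only the literal pushes carry data in the instruction stream; "load", "add" and "store" find everything they need on the stack.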
The advantage of stack machines ("0-operand instruction set") over accumulator machines ("1-operand instruction set") and register machines ("2-operand instruction set" or a "3-operand instruction set") is that programs written for a "0-operand" instruction set generally have higher code density than equivalent programs written for other instruction sets.
A stack machine with 1 stack is a very weak model of computation. For example, it can be shown via pumping arguments that no 1-stack stack machine can recognize the simple language 0^n1^n (a number of 0s followed by the same number of 1s). The computational power of 1-stack stack machines is strictly greater than that of finite automata, but strictly less than that of deterministic pushdown automata.
A stack machine with multiple stacks, on the other hand, is equivalent to a Turing machine. For example, a 2-stack machine can emulate a TM by using one stack for the tape portion to the left of the TM's current head position and the other stack for the portion to the right.
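The two-stack tape construction can be sketched as follows. The class name, the blank symbol "_", and the choice to keep the cell under the head on top of the right-hand stack are assumptions made for this illustration; only the tape is modelled, not a full Turing machine with states and transitions.

```python
class TwoStackTape:
    """A Turing machine tape built from two stacks (illustrative sketch)."""

    BLANK = "_"  # assumed blank symbol

    def __init__(self, contents=""):
        # Cells to the left of the head; the cell nearest the head is on top.
        self.left = []
        # The head cell and everything to its right; the head cell is on top.
        self.right = list(reversed(contents)) or [self.BLANK]

    def read(self):
        # Peek at the cell under the head.
        return self.right[-1]

    def write(self, symbol):
        # Overwrite the head cell using only pop and push.
        self.right.pop()
        self.right.append(symbol)

    def move_right(self):
        # The head cell becomes the nearest cell on the left.
        self.left.append(self.right.pop())
        if not self.right:
            self.right.append(self.BLANK)  # extend the tape with a blank

    def move_left(self):
        # The nearest left cell becomes the head cell.
        self.right.append(self.left.pop() if self.left else self.BLANK)

    def contents(self):
        # Debugging helper: the visited tape, read left to right.
        return "".join(self.left) + "".join(reversed(self.right))


tape = TwoStackTape("abc")
tape.move_right()
tape.write("X")
tape.move_right()
tape.move_right()  # steps past the original input onto a blank cell
tape.move_left()
```

Every head movement is a single pop from one stack and a push onto the other, which is why two stacks suffice to represent an unbounded tape.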
A machine using processor registers for operands can easily simulate a stack machine. Such a simulation is sometimes called a virtual stack machine. The advantage of a (more or less) stack-based instruction set in hardware over a register-based architecture is shorter instructions, since fewer operand addresses have to be specified. This translates into better code density and smaller compiled programs.
Commercial implementations of stack machines generally include a small set of special-purpose registers for addressing enclosing contexts, i.e. stack frames that are not the topmost stack frame (dynamic and lexical scoping are two different ways of using and accessing enclosing contexts). Practical stack machines are thus not identical to the stack machines of automata theory, but these registers allow a stack-based CPU to be entirely suitable for general-purpose computing.
Examples of commercial use of a stack machine include processors such as the RTX2000, the RTX2010, the Sh-Boom, the F21 and the PSC1000.
Note that the Burroughs architecture combines a stack machine with tagged memory (a few bits in every memory word to describe the data type of the operands). Tagged memory requires fewer opcodes, e.g., a single "add" instruction works for any combination of integer and floating point operands. Requiring fewer opcodes means that the entire instruction set can fit into smaller opcodes, reducing the total instruction width.
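The idea of a single "add" instruction serving every operand type can be sketched as follows. The `Word` type, the tag names "int" and "float", and the promotion rule are assumptions made for this illustration; they do not reproduce the actual Burroughs tag encoding.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    """A memory word carrying its own type tag (hypothetical layout)."""
    tag: str                    # "int" or "float"; stands in for the hardware tag bits
    value: Union[int, float]


def add(stack):
    """One opcode for all operand types: the tags, not the opcode, select the arithmetic."""
    b = stack.pop()
    a = stack.pop()
    if a.tag == "int" and b.tag == "int":
        stack.append(Word("int", a.value + b.value))
    else:
        # Assumed rule: any floating point operand makes the result floating point.
        stack.append(Word("float", float(a.value) + float(b.value)))


ints = [Word("int", 2), Word("int", 3)]
add(ints)

mixed = [Word("int", 2), Word("float", 0.5)]
add(mixed)
```

Without tags, an instruction set would typically need a separate opcode for each operand type combination, or explicit conversion instructions, to express the same two additions.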