Leonard Meyer's Emotion and Meaning in Music (1956) is the classic text on musical expectation. Meyer's starting point is the belief that a listener's experience of music derives from the emotions and feelings the music arouses, which are themselves a function of relationships within the music. Meyer writes that listeners bring with them a vast body of musical experience that conditions their response to a piece as it unfolds. He argued that music's evocative power derives from its capacity to generate, suspend, prolong, or violate these expectations.
Meyer models listener expectation on two levels. On the perceptual level, Meyer draws on Gestalt psychology to explain how listeners build mental representations of auditory phenomena. Above this raw perceptual level, Meyer argues, learning shapes (and reshapes) one's expectations over time.
The implication-realization (I-R) model, developed by Eugene Narmour, includes two primary factors: proximity and direction. Lerdahl (2001) extended the system by developing a tonal pitch space and adding a stability factor (based on Lerdahl's prior work) and a mobility factor.
Margulis's 2005 model further extends the I-R model. First, Margulis added a melodic attraction factor, drawn from Lerdahl's work. Second, while the I-R model relies on a single (local) interval to establish an implication (an expectation), Margulis attempts to model intervallic (local) expectation as well as more deeply schematic (global) expectation. For this, Margulis relies on Lerdahl and Jackendoff's Generative Theory of Tonal Music (GTTM, 1983) to provide a time-span reduction. Margulis applies her model at each hierarchical level of the reduction, with each level corresponding to a different time scale. These separate levels of analysis are combined through averaging, with each level weighted according to values derived from the time-span reduction. Finally, Margulis's model is explicit and realizable, and yields quantitative output. The output, melodic expectation at each time instant, is a single function of time.
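The combination step described above, a weighted average across hierarchical levels, can be illustrated with a minimal sketch. This is not Margulis's implementation. The function names, the example values, and the weights are hypothetical, and the sketch assumes that each level's expectancy values have already been computed and aligned to the same time instants.

```python
from typing import List, Sequence


def combine_levels(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-level expectancy values at one time instant."""
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per hierarchical level")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def expectation_curve(levels: Sequence[Sequence[float]],
                      weights: Sequence[float]) -> List[float]:
    """Collapse several levels into a single function of time.

    levels[k][t] is the (hypothetical) expectancy value at hierarchical
    level k and time instant t; weights[k] is the weight given to level k.
    Assumption: all levels are sampled at the same time instants.
    """
    n_times = len(levels[0])
    if any(len(level) != n_times for level in levels):
        raise ValueError("all levels must cover the same time instants")
    return [
        combine_levels([level[t] for level in levels], weights)
        for t in range(n_times)
    ]


# Illustrative values only: two levels, two time instants.
surface_level = [1.0, 2.0]
deeper_level = [3.0, 4.0]
curve = expectation_curve([surface_level, deeper_level], weights=[1.0, 3.0])
```

In the model itself the weights are derived from the time-span reduction; here they are fixed by hand only to show how several levels collapse into one value per time instant.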
Margulis's model describes three distinct types of listener reactions, each derived from listener-experienced tension: