In psychology, game theory, statistics, and machine learning, Win-Stay, Lose-Switch (also Win-Stay, Lose-Shift) is a strategy used to model learning in decision situations. It was first introduced as an improvement over randomization in bandit problems, and was later applied to the prisoner's dilemma to model the evolution of altruism.
The learning rule bases its decision only on the outcome of the previous play. Outcomes are divided into successes (wins) and failures (losses). If the play on the previous round resulted in a success, the agent plays the same action on the next round. If it resulted in a failure, the agent switches to another action.
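The rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `wsls_next_action` and `run_bandit`, the Bernoulli (win/lose) bandit setup, and the choice to pick uniformly among the other actions after a failure are all assumptions made for the example. The rule itself says only that the agent switches to another action.

```python
import random


def wsls_next_action(prev_action, prev_win, n_actions, rng):
    """Return the next action under win-stay, lose-switch.

    Requires n_actions >= 2, since a switch needs an alternative.
    """
    if prev_win:
        # Win: repeat the action that just succeeded.
        return prev_action
    # Lose: switch. Choosing uniformly among the remaining actions
    # is an assumption of this sketch.
    others = [a for a in range(n_actions) if a != prev_action]
    return rng.choice(others)


def run_bandit(success_probs, n_rounds, seed=0):
    """Play a hypothetical Bernoulli bandit with win-stay, lose-switch.

    success_probs[a] is the probability that action a yields a win.
    Returns a list of (action, win) pairs, one per round.
    """
    rng = random.Random(seed)
    n_actions = len(success_probs)
    action = rng.randrange(n_actions)  # the first action is arbitrary
    history = []
    for _ in range(n_rounds):
        win = rng.random() < success_probs[action]
        history.append((action, win))
        # The decision depends only on the previous play's outcome.
        action = wsls_next_action(action, win, n_actions, rng)
    return history
```

For example, `run_bandit([0.8, 0.3], n_rounds=1000)` simulates a two-armed bandit in which the first arm wins more often. Because the agent leaves an arm only after a failure, it tends to spend more rounds on the arm that fails less often.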