The Rescorla-Wagner Model
In the game you just played, two slot machines were presented repeatedly, while their payout rates were changing over time.
Thus, you had to learn the value of each slot machine, in other words the probability that it would pay out.
In this tutorial, we will use a very simple reinforcement learning model,
known as the Rescorla-Wagner model (Rescorla & Wagner 1972).
This is a prediction-error based learning model, in which stimuli acquire value
when there is a mismatch between prediction and outcome:
Update Equation (Rescorla-Wagner):
$$V_{s,t} = V_{s,t-1} + \alpha \, (r_{t-1} - V_{s,t-1})$$
$V_{s,t}$ is the value of stimulus $s$ at trial $t$, which reflects the expectation of a reward
$r_{t-1}$ is the reward received on trial $t-1$
$\alpha$ is the learning rate
Thus, the value is updated based on the prediction error, or the difference between the
received reward r and the expectation V.
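The learning rate determines how strongly this prediction error is weighted: with a learning rate close to 0 the value changes only slowly, while with a learning rate close to 1 the value closely tracks the most recent outcome.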
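To make the update concrete, here is a minimal Python sketch of the rule described above. The function names (rescorla_wagner_update, learn_values), the starting value of 0.5, and the choice to update only the slot machine that was played on each trial are assumptions made for illustration; they are not taken from the tutorial itself.

```python
def rescorla_wagner_update(value, reward, alpha):
    """Return the new value after one trial of the Rescorla-Wagner rule."""
    prediction_error = reward - value  # mismatch between outcome and expectation
    return value + alpha * prediction_error


def learn_values(choices, rewards, alpha, n_stimuli=2, initial_value=0.5):
    """Track the value of each stimulus across trials.

    Assumptions (illustrative only): every stimulus starts at `initial_value`,
    and only the chosen stimulus is updated on each trial.
    Returns a list with the values before the first trial and after each trial.
    """
    values = [initial_value] * n_stimuli
    history = [list(values)]
    for choice, reward in zip(choices, rewards):
        values[choice] = rescorla_wagner_update(values[choice], reward, alpha)
        history.append(list(values))
    return history


# Hypothetical data: which machine was played (0 or 1) and whether it paid out.
choices = [0, 0, 1, 0]
rewards = [1, 1, 0, 0]

for trial, values in enumerate(learn_values(choices, rewards, alpha=0.5)):
    print(trial, values)
# Final values: [0.4375, 0.25]
```

With a learning rate of 0.5, each update moves the value halfway toward the reward just received: the value of machine 0 rises from 0.5 to 0.75 and then to 0.875 after two payouts, and falls to 0.4375 after a single loss.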