The Rescorla-Wagner Model
In the game you just played, two slot machines were presented repeatedly, while their payout rates were changing over time.
Thus, you had to learn the value of each slot machine, that is, the probability that it would pay out.
In this tutorial, we will use a very simple reinforcement learning model,
known as the Rescorla-Wagner model (Rescorla & Wagner 1972).
This is a prediction-error-based learning model, in which stimuli acquire value
when there is a mismatch between prediction and outcome:
Update Equation (Rescorla-Wagner)
V_{s,t} = V_{s,t-1} + α (r_{t-1} - V_{s,t-1})
Where

V_{s,t} is the value of stimulus s at trial t, which reflects the expectation of a reward

r_{t-1} is the reward received on trial t-1

α is the learning rate
Thus, the value is updated based on the prediction error, or the difference between the
received reward r and the expectation V.
The learning rate determines how much this prediction error is weighted.
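To make the update concrete, here is a minimal Python sketch for a single slot machine. The function names, the starting value of 0.5, the learning rate of 0.5, and the reward sequence are all assumptions chosen for illustration; they are not taken from the game or the original model code.

```python
def rescorla_wagner_update(value, reward, alpha):
    """Return the new value after observing one reward.

    value:  current expectation V (before the update)
    reward: reward r received on this trial (here 0 or 1)
    alpha:  learning rate, between 0 and 1
    """
    prediction_error = reward - value  # mismatch between outcome and expectation
    return value + alpha * prediction_error


def simulate_values(rewards, alpha, initial_value=0.5):
    """Track the value of one stimulus across a sequence of rewards.

    initial_value is an assumed starting expectation, not part of the model itself.
    """
    values = [initial_value]
    for reward in rewards:
        values.append(rescorla_wagner_update(values[-1], reward, alpha))
    return values


# Hypothetical reward sequence: payout, payout, no payout, payout
rewards = [1, 1, 0, 1]
values = simulate_values(rewards, alpha=0.5, initial_value=0.5)
print(values)  # [0.5, 0.75, 0.875, 0.4375, 0.71875]
```

With a learning rate of 0.5, each update moves the value halfway towards the reward that was just received. A smaller learning rate would produce smaller steps, so the value would change more slowly and depend less on the most recent outcome.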