The dataset includes 975 participants who completed an online version of the 4-arm bandit task in 2014. All participants gave their consent to carry out the experiment. The experiment was approved by the UCL Research Ethics Committee (project 4223/001). The dataset is anonymised and does not include information about the participants' identity. The task followed the 4-arm bandit paradigm described in Daw et al. (2006). In this task, participants were asked to choose between four options over multiple trials. On each trial they chose one option and were then told the reward obtained by that choice. The reward of each option drifted over time, a design known as a restless bandit, so participants had to keep exploring the different options to obtain the maximum reward. The rewards followed one of three predefined drift schedules, see below. The experiment lasted 150 trials. Participants who failed to respond within 4 seconds missed the trial and moved on to the next one with no reward.
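
The trial structure described above can be illustrated with a minimal sketch. It is not the code used to run the experiment, and it does not reproduce the three predefined drift schedules: a plain bounded Gaussian random walk is swapped in as a stand-in. The function names (`make_drifting_rewards`, `run_trial`), the 0 to 100 reward range, the step size and the seed are all assumptions made for illustration. Only the four options, the 150 trials, the 4-second deadline and the zero reward on missed trials come from the description.

```python
import random

N_ARMS = 4          # four options (from the task description)
N_TRIALS = 150      # experiment length (from the task description)
DEADLINE_S = 4.0    # response deadline in seconds (from the task description)


def make_drifting_rewards(n_trials=N_TRIALS, n_arms=N_ARMS,
                          step_sd=3.0, lo=0.0, hi=100.0, seed=0):
    """Return a list of per-trial reward lists, one reward per arm.

    Assumption: rewards follow a bounded Gaussian random walk. The real
    dataset uses three predefined schedules, which this does not reproduce.
    """
    rng = random.Random(seed)
    means = [rng.uniform(lo, hi) for _ in range(n_arms)]
    schedule = []
    for _ in range(n_trials):
        schedule.append(list(means))
        # Each arm drifts independently and is clipped to [lo, hi].
        means = [min(hi, max(lo, m + rng.gauss(0.0, step_sd))) for m in means]
    return schedule


def run_trial(schedule, trial, choice, response_time_s):
    """Return the reward for one trial; a missed trial earns nothing."""
    if choice is None or response_time_s > DEADLINE_S:
        return 0.0
    return schedule[trial][choice]


if __name__ == "__main__":
    schedule = make_drifting_rewards()
    rng = random.Random(1)
    # A hypothetical agent that picks at random and always answers in time.
    total = sum(run_trial(schedule, t, rng.randrange(N_ARMS), 1.0)
                for t in range(N_TRIALS))
    print(f"Total reward of a random agent: {total:.1f}")
```

Because the rewards drift, an agent that settles on a single option eventually loses track of the others, which is why the task rewards continued exploration.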