Representative design in psychological assessment: A case study using the Balloon Analogue Risk Task (BART).
Representative design refers to the idea that experimental stimuli should be sampled or designed such that they represent the environments to which measured constructs are supposed to generalize. In this article we investigate the role of representative design in achieving valid and reliable psychological assessments, by focusing on a widely used behavioral measure of risk taking, the Balloon Analogue Risk Task (BART). Specifically, we demonstrate that the typical implementation of this task violates the principle of representative design, thus conflicting with the...
Modeling changes in probabilistic reinforcement learning during adolescence
In the real world, many relationships between events are uncertain and probabilistic. Uncertainty is also likely to be a more common feature of daily experience for youth because they have less experience to draw from than adults. Some studies suggest probabilistic learning may be inefficient in youths compared to adults, while others suggest it may be more efficient in youths in mid adolescence. Here we used a probabilistic reinforcement learning task to test how youth age 8-17 (N = 187) and adults age 18-30 (N = 110) learn about stable probabilistic...
Preference uncertainty accounts for developmental effects on susceptibility to peer influence in adolescence
Adolescents are prone to social influence from peers, with implications for development, both adaptive and maladaptive. Here, using a computer-based paradigm, we replicate a cross-sectional effect of more susceptibility to peer influence in a large dataset of adolescents 14 to 24 years old. Crucially, we extend this finding by adopting a longitudinal perspective, showing that a within-person susceptibility to social influence decreases over a 1.5 year follow-up time period. Exploiting this longitudinal design, we show that susceptibility to social influences at baseline...
The role of anticipated regret in choosing for others
In everyday life, people sometimes find themselves making decisions on behalf of others, taking risks on another’s behalf, accepting the responsibility for these choices and possibly suffering regret for what they could have done differently. Previous research has extensively studied how people deal with risk when making decisions for others or when being observed by others. Here, we asked whether making decisions for present others is affected by regret avoidance. We studied value-based decision making under uncertainty, manipulating both whether decisions benefited...
Using large-scale experiments and machine learning to discover theories of human decision-making
Predicting and understanding how people make decisions has been a long-standing goal in many fields, with quantitative models of human decision-making informing research in both the social sciences and engineering. We show how progress toward this goal can be accelerated by using large datasets to power machine-learning algorithms that are constrained to produce interpretable psychological theories. Conducting the largest experiment on risky choice to date and analyzing the results using gradient-based optimization of differentiable decision theories implemented through...
Defensive freezing and its relation to approach-avoidance decision-making under threat
Successful responding to acutely threatening situations requires adequate approach-avoidance decisions. However, it is unclear how threat-induced states, like freezing-related bradycardia, impact the weighing of the potential outcomes of such value-based decisions. Insight into the underlying computations is essential, not only to improve our models of decision-making but also to improve interventions for maladaptive decisions, for instance in anxiety patients and first-responders who frequently have to make decisions under acute threat. Forty-two participants made...
Punishment insensitivity in humans is due to failures in instrumental contingency learning
Punishment maximises the probability of our individual survival by reducing behaviours that cause us harm, and also sustains trust and fairness in groups essential for social cohesion. However, some individuals are more sensitive to punishment than others and these differences in punishment sensitivity have been linked to a variety of decision-making deficits and psychopathologies. The mechanisms for why individuals differ in punishment sensitivity are poorly understood, although recent studies of conditioned punishment in rodents highlight a key role for punishment...
Analogous computations in working memory input, output and motor gating: Electrophysiological and computational modeling evidence
Adaptive cognitive-control involves a hierarchical cortico-striatal gating system that supports selective updating, maintenance, and retrieval of useful cognitive and motor information. Here, we developed a task that independently manipulates selective gating operations into working-memory (input gating), from working-memory (output gating), and of responses (motor gating) and tested the neural dynamics and computational principles that support them. Increases in gating demands, captured by gate switches, were expressed by distinct EEG correlates at each gating level...
Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making
Classic reinforcement learning (RL) theories cannot explain human behavior in the absence of external reward or when the environment changes. Here, we employ a deep sequential decision-making paradigm with sparse reward and abrupt environmental changes. To explain the behavior of human participants in these environments, we show that RL theories need to include surprise and novelty, each with a distinct role. While novelty drives exploration before the first encounter of a reward, surprise increases the rate of learning of a world-model as well as of model-free...
A unified online test battery for cognitive impulsivity reveals relationships with real-world impulsive behaviours
Impulsive behaviours are a major contributor to the global burden of disease, but existing measures of cognitive impulsivity have suboptimal reliability and validity. Here, we introduce the Cognitive Impulsivity Suite, comprising three computerized/online tasks using a gamified interface. We conceptualize rapid-response impulsive behaviours (disinhibition) as arising from the failure of three distinct cognitive mechanisms: attentional control, information gathering and monitoring/shifting. We demonstrate the construct and criterion validity of the Cognitive...
Temporal discounting when outcomes are experienced in the moment: Validation of a novel paradigm and comparison with a classic hypothetical intertemporal choice task
When faced with intertemporal choices, people typically devalue rewards available in the future compared to rewards more immediately available, a phenomenon known as temporal discounting. Decisions involving intertemporal choices arise daily, with critical impact on health and financial wellbeing. Although many such decisions are “experiential” in that they involve delays and rewards that are experienced in real-time and can inform subsequent choices, most studies have focused on intertemporal choices with hypothetical outcomes (or outcomes delivered after all decisions...
Moral labels increase cooperation and costly punishment in a Prisoner's Dilemma game with punishment option
To determine the role of moral norms in cooperation and punishment, we examined the effects of a moral-framing manipulation in a Prisoner’s Dilemma game with a costly punishment option. In each round of the game, participants decided whether to cooperate or to defect. The Prisoner’s Dilemma game was identical for all participants with the exception that the behavioral options were paired with moral labels (“I cooperate” and “I cheat”) in the moral-framing condition and with neutral labels (“A” and “B”) in the neutral-framing condition. After each round of the Prisoner’s...
Adaptation to recent outcomes attenuates the lasting effect of initial experience on risky decisions
Both primarily and recently encountered information have been shown to influence experience-based risky decision making. The primacy effect predicts that initial experience will influence later choices even if outcome probabilities change and reward is ultimately more or less sparse than primarily experienced. However, it has not been investigated whether extended initial experience would induce a more profound primacy effect upon risky choices than brief experience. Therefore, the present study tested in two experiments whether young adults adjusted their risk-taking...
Computational phenotyping of brain-behavior dynamics underlying approach-avoidance conflict in major depressive disorder
Adaptive behavior requires balancing approach and avoidance based on the rewarding and aversive consequences of actions. Imbalances in this evaluation are thought to characterize mood disorders such as major depressive disorder (MDD). We present a novel application of the drift diffusion model (DDM) suited to quantify how offers of reward and aversiveness, and neural correlates thereof, are dynamically integrated to form decisions, and how such processes are altered in MDD. Hierarchical parameter estimation from the DDM demonstrated that the MDD group differed in three...
Effects of subclinical depression on prefrontal-striatal model-based and model-free learning
Depression is characterized by deficits in the reinforcement learning (RL) process. Although many computational and neural studies have extended our knowledge of the impact of depression on RL, most focus on habitual control (model-free RL), yielding a relatively poor understanding of goal-directed control (model-based RL) and arbitration control to find a balance between the two. We investigated the effects of subclinical depression on model-based and model-free learning in the prefrontal-striatal circuitry. First, we found that subclinical depression is associated...
Aging increases prosocial motivation for effort
Social cohesion relies on prosociality in increasingly aging populations. Helping other people requires effort, yet how willing people are to exert effort to benefit themselves and others, and whether such behaviors shift across the life span, is poorly understood. Using computational modeling, we tested the willingness of 95 younger adults (18-36 years old) and 92 older adults (55-84 years old) to put physical effort into self- and other-benefiting acts. Participants chose whether to work and exert force (30%-70% of maximum grip strength) for rewards (2-10 credits)...
Encoding context determines risky choice
Both memory and choice are influenced by context: Memory is enhanced when encoding and retrieval contexts match, and choice is swayed by available options. Here, we assessed how context influences risky choice in an experience-based task in two main experiments (119 and 98 participants retained, respectively) and two additional experiments reported in the Supplemental Material available online (152 and 106 participants retained, respectively). Within a single session, we created two separate contexts by presenting blocks of trials in distinct backgrounds. Risky choices...
Information Seeking on the Horizons Task Does Not Predict Anxious Symptomatology
Excessive information seeking, or exploratory behavior to minimize the uncertainty of unknown options, is a feature of anxiety disorders. The horizons task (Wilson et al. 2014) is a popular task for measuring information-seeking behavior, recently used to identify under-exploration in psychosis (Waltz et al. 2020). The horizons task has not yet been evaluated as a tool for measuring information seeking behavior in anxious individuals. We recruited 100 participants to complete an online version of the horizons task. Anxiety was measured with the Penn State Worry...
Open database for inhibition tasks
Database of cognitive control task data (e.g., Stroop, Flanker tasks).
Interactive effects of incentive value and valence on the performance of discrete action sequences
Incentives can be used to increase motivation, leading to better learning and performance on skilled motor tasks. Prior work has shown that monetary punishments enhance on-line performance while equivalent monetary rewards enhance off-line skill retention. However, a large body of literature on loss aversion has shown that losses are treated as larger than equivalent gains. The divergence between the effects of punishments and reward on motor learning could be due to perceived differences in incentive value rather than valence per se. We test this hypothesis by...
Thalamocortical excitability modulation guides human perception under uncertainty
Knowledge about the relevance of environmental features can guide stimulus processing. However, it remains unclear how processing is adjusted when feature relevance is uncertain. We hypothesized that (a) heightened uncertainty would shift cortical networks from a rhythmic, selective processing-oriented state toward an asynchronous (“excited”) state that boosts sensitivity to all stimulus features, and that (b) the thalamus provides a subcortical nexus for such uncertainty-related shifts. Here, we had young adults attend to varying numbers of task-relevant features...
Interacting with volatile environments stabilizes hidden-state inference and its brain signatures
Making accurate decisions in uncertain environments requires identifying the generative cause of sensory cues, but also the expected outcomes of possible actions. Although both cognitive processes can be formalized as Bayesian inference, they are commonly studied using different experimental frameworks, making their formal comparison difficult. Here, by framing a reversal learning task either as cue-based or outcome-based inference, we found that humans perceive the same volatile environment as more stable when inferring its hidden state by interaction with uncertain...
Two sides of the same coin: Beneficial and detrimental consequences of range adaptation in human reinforcement learning
Evidence suggests that economic values are rescaled as a function of the range of the available options. Although locally adaptive, range adaptation has been shown to lead to suboptimal choices, particularly notable in reinforcement learning (RL) situations when options are extrapolated from their original context to a new one. Range adaptation can be seen as the result of an adaptive coding process aiming at increasing the signal-to-noise ratio. However, this hypothesis leads to a counterintuitive prediction: Decreasing task difficulty should increase range adaptation...
Anxious and obsessive-compulsive traits are independently associated with valuation of noninstrumental information
Aversion to uncertainty about the future has been proposed as a transdiagnostic trait underlying psychiatric diagnoses including obsessive-compulsive disorder and generalized anxiety. This association might explain the frequency of pathological information-seeking behaviors such as compulsive checking and reassurance-seeking in these disorders. Here we tested the behavioral predictions of this model using a noninstrumental information-seeking task that measured preferences for unusable information about future outcomes in different payout domains (gain, loss, and mixed...
Recovering Reliable Idiographic Biological Parameters from Noisy Behavioral Data: the Case of Basal Ganglia Indices in the Probabilistic Selection Task
Behavioral data, despite being a common index of cognitive activity, is under scrutiny for having poor reliability as a result of noise or lacking replications of reliable effects. Here, we argue that cognitive modeling can be used to enhance the test-retest reliability of the behavioral measures by recovering individual-level parameters from behavioral data. We tested this empirically with the Probabilistic Stimulus Selection (PSS) task, which is used to measure a participant's sensitivity to positive or negative reinforcement. An analysis of 400,000 simulations from an...
Determining the effects of training duration on the behavioral expression of habitual control in humans: a multi-laboratory investigation
It has been suggested that there are two distinct and parallel mechanisms for controlling instrumental behavior in mammals: goal-directed actions and habits. To gain an understanding of how these two systems interact to control behavior, it is essential to characterize the mechanisms by which the balance between these systems is influenced by experience. Studies in rodents have shown that the amount of training governs the relative expression of these two systems: Behavior is goal-directed following moderate training, but the more extensively an instrumental action is...
Attenuated Directed Exploration during Reinforcement Learning in Gambling Disorder
Gambling disorder (GD) is a behavioral addiction associated with impairments in value-based decision-making and behavioral flexibility and might be linked to changes in the dopamine system. Maximizing long-term rewards requires a flexible trade-off between the exploitation of known options and the exploration of novel options for information gain. This exploration-exploitation trade-off is thought to depend on dopamine neurotransmission. We hypothesized that human gamblers would show a reduction in directed (uncertainty-based) exploration, accompanied by changes in...
Impact of ambient sound on risk perception in humans: neuroeconomic investigations
Research in the field of multisensory perception shows that what we hear can influence what we see in a wide range of perceptual tasks. It is however unknown whether this extends to the visual perception of risk, despite the importance of the question in many applied domains where properly assessing risk is crucial, starting with financial trading. To fill this knowledge gap, we ran interviews with professional traders and conducted three laboratory studies using judgments of financial asset risk as a testbed. We provide evidence that the presence of ambient sound...
Biased evaluations emerge from inferring hidden causes
How do we evaluate a group of people after a few negative experiences with some members but mostly positive experiences otherwise? How do rare experiences influence our overall impression? We show that rare events may be overweighted due to normative inference of the hidden causes that are believed to generate the observed events. We propose a Bayesian inference model that organizes environmental statistics by combining similar events and separating outlying observations. Relying on the model's inferred latent causes for group evaluation overweights rare or variable...
Signed and unsigned reward prediction errors dynamically enhance learning and memory
Memory helps guide behavior, but which experiences from the past are prioritized? Classic models of learning posit that events associated with unpredictable outcomes as well as, paradoxically, predictable outcomes, deploy more attention and learning for those events. Here, we test reinforcement learning and subsequent memory for those events, and treat signed and unsigned reward prediction errors (RPEs), experienced at the reward-predictive cue or reward outcome, as drivers of these two seemingly contradictory signals. By fitting reinforcement learning models to...
Variation in the "coefficient of variation": Rethinking the violation of the scalar property in time-duration judgments
The coefficient of variation (CV), also known as relative standard deviation, has been used to measure the constancy of the Weber fraction, a key signature of efficient neural coding in time perception. It has long been debated whether or not duration judgments follow Weber’s law, with arguments based on examinations of the CV. However, what has been largely ignored in this debate is that the observed CVs may be modulated by temporal context and decision uncertainty, thus questioning conclusions based on this measure. Here, we used a temporal reproduction paradigm to...
Expectations of reward and efficacy guide cognitive control allocation
The amount of mental effort we invest in a task is influenced by the reward we can expect if we perform that task well. However, some of the rewards that have the greatest potential for driving these efforts are partly determined by factors beyond one’s control. In such cases, effort has more limited efficacy for obtaining rewards. According to the Expected Value of Control theory, people integrate information about the expected reward and efficacy of task performance to determine the expected value of control, and then adjust their control allocation (i.e., mental...
Dissociation between asymmetric value updating and perseverance in human reinforcement learning
The learning rate is a key parameter in reinforcement learning that determines the extent to which novel information (outcome) is incorporated in guiding subsequent actions. Numerous studies have reported that the magnitude of the learning rate in human reinforcement learning is biased depending on the sign of the reward prediction error. However, this asymmetry can be observed as a statistical bias if the fitted model ignores the choice autocorrelation (perseverance), which is independent of the outcomes. Therefore, to investigate the genuine process underlying human...
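The role of a sign-dependent learning rate described above can be illustrated with a minimal sketch. This is not the authors' model: the function name `update_value` and the parameter values `alpha_pos` and `alpha_neg` are hypothetical, and the perseverance term discussed in the abstract is deliberately left out.

```python
def update_value(q, reward, alpha_pos=0.3, alpha_neg=0.1):
    """Update a value estimate with a learning rate that depends on the
    sign of the reward prediction error (illustrative values only)."""
    delta = reward - q  # reward prediction error
    # Hypothetical asymmetry: positive errors are weighted more than negative ones.
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta
```

With these assumed values, an equally sized positive and negative prediction error move the estimate by different amounts, which is the asymmetry the abstract refers to.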
Dissociable roles of cortical excitation-inhibition balance during patch-leaving versus value-guided decisions
In a dynamic world, it is essential to decide when to leave an exploited resource. Such patch-leaving decisions involve balancing the cost of moving against the gain expected from the alternative patch. This contrasts with value-guided decisions that typically involve maximizing reward by selecting the current best option. Patterns of neuronal activity pertaining to patch-leaving decisions have been reported in dorsal anterior cingulate cortex (dACC), whereas competition via mutual inhibition in ventromedial prefrontal cortex (vmPFC) is thought to underlie value-guided...
The dynamics of explore-exploit decisions reveal a signal-to-noise mechanism for random exploration
Growing evidence suggests that behavioral variability plays a critical role in how humans manage the tradeoff between exploration and exploitation. In these decisions a little variability can help us to overcome the desire to exploit known rewards by encouraging us to randomly explore something else. Here we investigate how such ‘random exploration’ could be controlled using a drift-diffusion model of the explore-exploit choice. In this model, variability is controlled by either the signal-to-noise ratio with which reward is encoded (the ‘drift rate’), or the amount of...
Modeling the influence of working memory, reinforcement, and action uncertainty on reaction time and choice during instrumental learning
What determines the speed of our decisions? Various models of decision-making have focused on perceptual evidence, past experience, and task complexity as important factors determining the degree of deliberation needed for a decision. Here, we build on a sequential sampling decision-making framework to develop a new model that captures a range of reaction time (RT) effects by accounting for both working memory and instrumental learning processes. The model captures choices and RTs at various stages of learning, and in learning environments with varying complexity....
The ERP, frequency, and time-frequency correlates of feedback processing: Insights from a large sample study
Human learning, at least in part, appears to be dependent on the evaluation of how outcomes of our actions align with our expectations. Over the past 23 years, electroencephalography (EEG) has been used to probe the neural signatures of feedback processing. Seminal work demonstrated a difference in the human event-related potential (ERP) dependent on whether people were processing correct or incorrect feedback. Since then, these feedback evoked ERPs have been associated with reinforcement learning and conflict monitoring, tied to subsequent behavioral adaptations, and...
Multi-task reinforcement learning in humans
The ability to transfer knowledge across tasks and generalize to novel ones is an important hallmark of human intelligence. Yet not much is known about human multitask reinforcement learning. We study participants' behaviour in a two-step decision-making task with multiple features and changing reward functions. We compare their behaviour with two algorithms for multitask reinforcement learning, one that maps previous policies and encountered features to new reward functions and one that approximates value functions across tasks, as well as to standard model-based and...
Risky decision and happiness task: The Great Brain Experiment smartphone app
The subjective well-being or happiness of individuals is an important metric for societies. Although happiness is influenced by life circumstances and population demographics such as wealth, we know little about how the cumulative influence of daily life events is aggregated into subjective feelings. Using computational modeling, we show that emotional reactivity in the form of momentary happiness in response to outcomes of a probabilistic reward task is explained not by current task earnings, but by the combined influence of recent reward expectations and prediction...
Three armed bandit gambling task
Healthy control college students. 23 subjects completed the 3-armed bandit task with oscillating probabilities. For example, the ‘blue’ stimulus would slowly move from 20% reinforcing to 90%, then back to 20%, over many trials. The other ‘red’ and ‘green’ stimuli would move similarly, but in a different phase. See Fig 1 of the paper. This makes the task well suited for investigating reward processing and reward prediction error in the service of novel task set generation.
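The oscillating schedule described above can be sketched as follows. This is only an illustration under stated assumptions: the description gives the 20% and 90% bounds and says the stimuli differ in phase, but the sinusoidal shape, the 200-trial period, the one-third-cycle phase offsets, and the function name `reinforcement_probabilities` are all hypothetical.

```python
import math

def reinforcement_probabilities(trial, period=200, low=0.20, high=0.90):
    """Return the reward probability of each stimulus on a given trial.

    Assumed for illustration: a sinusoidal drift with a hypothetical
    period, and stimuli offset from each other by a third of a cycle.
    """
    midpoint = (high + low) / 2
    amplitude = (high - low) / 2
    probs = {}
    for i, name in enumerate(("blue", "red", "green")):
        phase = 2 * math.pi * i / 3  # assumed phase offset per stimulus
        # With zero phase, -cos starts at the low bound and peaks mid-period.
        probs[name] = midpoint - amplitude * math.cos(
            2 * math.pi * trial / period + phase
        )
    return probs
```

Under these assumptions the ‘blue’ stimulus starts at the 20% bound, reaches the 90% bound halfway through the period, and returns to 20% at the end of it, while ‘red’ and ‘green’ follow the same curve shifted in phase.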
Visual continuity during blinks and alterations in time perception
Eye blinks strongly attenuate visual input, yet we perceive the world as continuous. How this visual continuity is achieved remains a fundamental and unsolved problem. A decrease in luminance sensitivity has been proposed as a mechanism but is insufficient to mask the even larger decrease in luminance because of blinks. Here we put forward a different hypothesis: visual continuity can be achieved through shortening of perceived durations of the sensory consequences of blinks. Here we probed the perceived durations of the blackouts caused by blinks and visual stimuli...
The role of Weber's law in human time perception
Weber’s law predicts that stimulus sensitivity will increase proportionally with increases in stimulus intensity. Does this hold for the stimulus of time - specifically, duration in the milliseconds to seconds range? There is conflicting evidence on the relationship between temporal sensitivity and duration. Weber’s law predicts a linear relationship between sensitivity and duration on interval timing tasks, while two alternative models predict a reverse J-shaped and a U-shaped relationship. Based on previous research, we hypothesised that temporal sensitivity in humans...
State Anxiety Biases Estimates of Uncertainty in Volatile Environments and Impairs Reward Learning
Clinical and subclinical (trait) anxiety impairs decision making and interferes with learning. Less understood are the effects of temporary anxious states on learning and decision making in healthy populations, and whether these can serve as a model for clinical anxiety. Here we test whether anxious states in healthy individuals elicit a pattern of aberrant behavioural, neural, and physiological responses comparable with those found in anxiety disorders, particularly when processing uncertainty in unstable environments. In our study, both a state anxious and a control...
Model based planners reflect on their model-free propensities
Dual-reinforcement learning theory proposes behaviour is under the tutelage of a retrospective, value-caching, model-free (MF) system and a prospective-planning, model-based (MB) system. This architecture raises a question as to the degree to which, when devising a plan, a MB controller takes account of influences from its MF counterpart. We present evidence that such a sophisticated self-reflective MB planner incorporates an anticipation of the influences its own MF-proclivities exert on the execution of its planned future actions. Using a novel bandit task, wherein...
A sensory integration account for time perception
The connection between stimulus perception and time perception remains unknown. The present study combines human and rat psychophysics with sensory cortical neuronal firing to construct a computational model for the percept of elapsed time embedded within sense of touch. When subjects judged the duration of a vibration applied to the fingertip (human) or whiskers (rat), increasing stimulus intensity led to increasing perceived duration. Symmetrically, increasing vibration duration led to increasing perceived intensity. We modeled real spike trains recorded from...
Impaired adaptation of learning to contingency volatility in internalizing psychopathology
Using a contingency volatility manipulation, we tested the hypothesis that difficulty adapting probabilistic decision-making to second-order uncertainty might reflect a core deficit that cuts across anxiety and depression and holds regardless of whether outcomes are aversive or involve reward gain or loss. We used bifactor modeling of internalizing symptoms to separate symptom variance common to both anxiety and depression from that unique to each. Across two experiments, we modeled performance on a probabilistic decision-making under volatility task using a...
The actions of others act as a pseudo-reward to drive imitation in the context of social reinforcement learning
While there is no doubt that social signals affect human reinforcement learning, there is still no consensus about how this process is computationally implemented. To address this issue, we compared three psychologically plausible hypotheses about the algorithmic implementation of imitation in reinforcement learning. The first hypothesis, decision biasing (DB), postulates that imitation consists in transiently biasing the learner's action selection without affecting their value function. According to the second hypothesis, model-based imitation (MB), the learner infers...
Confidence in subjective pain is predicted by reaction time during decision making
Self-report is the gold standard for measuring pain. However, decisions about pain can vary substantially within and between individuals. We measured whether self-reported pain is accompanied by metacognition and variations in confidence, similar to perceptual decision-making in other modalities. Eighty healthy volunteers underwent acute thermal pain and provided pain ratings followed by confidence judgments on continuous visual analogue scales. We investigated whether eye fixations and reaction time during pain rating might serve as implicit markers of confidence....
Acute stress enhances tolerance of uncertainty during decision-making
Acute stress has been shown to influence reward sensitivity, feedback learning, and risk-taking during decision-making, primarily through activation of the hypothalamic pituitary axis (HPA). However, it is unclear how acute stress affects decision-making among choices that vary in their degree of uncertainty. To address this question, we conducted two experiments in which participants repeatedly chose between two options: a high-uncertainty option that offered highly variable rewards but was advantageous in the long-term, and a low-uncertainty option that offered smaller...
Waiting in intertemporal choice tasks affects discounting and subjective time perception
The literature on human delay discounting behavior is dominated by experimental paradigms, which do not impose actual delays. Given that waiting may be aversive even on short timescales, we present a novel delay discounting paradigm to study differences in delay discounting behavior either when real waiting is involved, or not. This paradigm retains the fundamental trade-off between rewards received versus their immediacy. We used hierarchical Bayesian modeling to decompose and test models that separate discounting and subjective time perception mechanisms. We report 2...