Distinct but correlated latent factors support the regulation of learned conflict-control and task-switching
Cognitive control is guided by learning, as people adjust control to meet changing task demands. The two best-studied instances of control-learning are the enhancement of attentional task focus in response to increased frequencies of incongruent distracter stimuli, reflected in the list-wide proportion congruent (LWPC) effect, and the enhancement of switch-readiness in response to increased frequencies of task switches, reflected in the list-wide proportion switch (LWPS) effect. However, the latent architecture underpinning these adaptations in cognitive stability and...
Age-related differences in visual confidence are driven by individual differences in cognitive control capacities
Visual perception is not only shaped by sensitivity but also by confidence, i.e., the ability to estimate the accuracy of a visual decision. Younger observers have been reported to have access to a reliable measure of their own uncertainty when making visual decisions. This metacognitive ability might be challenged during ageing due to increasing sensory noise and decreasing cognitive control resources. We investigated age effects on visual confidence using a visual contrast discrimination task and a confidence forced-choice paradigm. Younger adults (19-38 years) showed...
Modeling variation in empathic sensitivity using go/no-go social reinforcement learning
Empathic experiences shape social behaviors and display considerable individual variation. Recent advances in computational behavioral modeling can help rigorously quantify individual differences, but remain understudied in the context of empathy and antisocial behavior. We adapted a go/no-go reinforcement learning task across social and non-social contexts such that monetary gains and losses explicitly impacted the subject, a study partner, or no one. Empathy was operationalized as sensitivity to others’ rewards, sensitivity to others’ losses, and as the Pavlovian...
Children's value-based decision making
To effectively navigate their environments, infants and children learn how to recognize events that predict salient outcomes, such as rewards or punishments. Relatively little is known about how children acquire this ability to attach value to the stimuli they encounter. Studies often examine children's ability to learn about rewards and threats using either classical conditioning or behavioral choice paradigms. Here, we assess both approaches and find that they yield different outcomes in terms of which individuals had efficiently learned the value of information presented...
Better, worse, or different than expected: on the role of value and identity prediction errors in fear memory reactivation
Although reconsolidation-based interventions constitute a promising new avenue for treating fear and anxiety disorders, the success of the intervention is not guaranteed. The initiation of memory reconsolidation is dependent on whether a mismatch between the experienced and predicted outcome, a prediction error (PE), occurs during fear memory reactivation. It remains, however, elusive whether any type of PE renders fear memories susceptible to reconsolidation disruption. Here, we investigated whether a value PE, elicited by an outcome that is better or worse than...
Test-retest reliability of affective bias tasks
Affective biases are commonly seen in disorders such as depression and anxiety, where individuals may show attention towards and more rapid processing of negative or threatening stimuli. Affective biases have been shown to change with effective intervention: randomized controlled trials into these biases and the mechanisms that underpin them may allow greater understanding of how interventions can be improved and their success be maximized. For trials to be informative, we must have reliable ways of measuring affective bias over time, so we can detect how interventions...
Measuring human context fear conditioning and retention after consolidation
Fear conditioning is a laboratory paradigm commonly used to investigate aversive learning and memory. In context fear conditioning, a configuration of elemental cues (conditioned stimulus, CS) predicts an aversive event (unconditioned stimulus, US). To quantify context fear acquisition in humans, previous work has used startle eye-blink responses (SEBR), skin conductance responses (SCR) and verbal reports, but different quantification methods have rarely been compared. Moreover, it is unclear how to induce and measure context fear memory retention over several days....
Evidence integration and decision confidence are modulated by stimulus consistency
Evidence integration is a normative algorithm for choosing between alternatives with noisy evidence, which has been successful in accounting for vast amounts of behavioural and neural data. However, this mechanism has been challenged by non-integration heuristics, and tracking decision boundaries has proven elusive. Here we first show that the decision boundaries can be extracted using a model-free behavioural method termed decision classification boundary, which optimizes choice classification based on the accumulated evidence. Using this method, we provide direct...
Shuffle the decks: Children are sensitive to incidental nonrandom structure in a sequential-choice task
As children age, they can learn increasingly complex features of environmental structure, a key prerequisite for adaptive decision-making. Yet when we tested children (N = 304, 4-13 years old) in the Children’s Gambling Task, an age-appropriate variant of the Iowa Gambling Task, we found that age was negatively associated with performance. However, this paradoxical effect of age was found only in children who exhibited a maladaptive deplete-replenish bias, a tendency to shift choices after positive outcomes and repeat choices after negative outcomes. We found that...
The effect of apathy and compulsivity on planning and stopping in sequential decision-making
Real-life decision-making often comprises sequences of successive decisions about whether to take opportunities as they are encountered or keep searching for better ones instead. We investigated individual differences related to such sequential decision-making and linked them especially to apathy and compulsivity in a large online sample (discovery sample: n = 449 and confirmation sample: n = 756). Our cognitive model revealed distinct changes in the way participants evaluated their environments and planned their own future behaviour. Apathy was linked to...
Motivation improves working memory by two processes: Prioritisation and retrieval thresholds
Motivation can improve performance when the potential rewards outweigh the cost of effort expended. In working memory (WM), people can prioritise rewarded items at the expense of unrewarded items, suggesting a fixed memory capacity. But can capacity itself change with motivation? Across four experiments (N = 30-34) we demonstrate motivational improvements in WM even when all items were rewarded. However, this was not due to better memory precision, but rather better selection of the probed item within memory. Motivational improvements operated independently of...
Distinct Neural Profiles of Frontoparietal Networks in Boys with ADHD and Boys with Persistent Depressive Disorder
Working memory deficits are common in attention-deficit/hyperactivity disorder (ADHD) and depression, two common neurodevelopmental disorders with overlapping cognitive profiles but distinct clinical presentation. Multivariate techniques have previously been utilized to understand working memory processes in functional brain networks in healthy adults but have not yet been applied to investigate how working memory processes within the same networks differ between typically and atypically developing populations. We used multivariate pattern analysis (MVPA) to identify...
Learning About the Self: Motives for coherence and positivity constrain learning from self-relevant feedback
People learn about themselves from social feedback, but desires for coherence and positivity constrain how feedback is incorporated into the self-concept. We developed a network-based model of the self-concept and embedded it in a reinforcement-learning framework to provide a computational account of how motivations shape self-learning from feedback. Participants (N = 46 adult university students) received feedback while evaluating themselves on traits drawn from a causal network of trait semantics. Network-defined communities were assigned different likelihoods of...
The importance of linguistic information in human reinforcement learning
How does the nature of a stimulus affect our ability to learn appropriate response associations? In typical laboratory experiments learning is investigated under somewhat ideal circumstances, where stimuli are easily discriminable visually and linguistically. This is not representative of most real-life learning, where visually or linguistically overlapping stimuli can result in different rewards (e.g., you may learn over time that you can pet one specific dog that is friendly, but that you should avoid a very similar looking one that isn’t). With two experiments, we...
Information about task progress modulates cognitive demand avoidance
People tend to avoid engaging in cognitively demanding tasks unless it is worth their while, that is, if the benefits outweigh the costs of effortful action. Yet, people seemingly partake in a variety of effortful mental activities (e.g. playing chess, completing Sudoku puzzles) because they impart a sense of progress. Here, we examine the possibility that information about progress (specifically, the number of trials completed of a demanding cognitive control task, relative to the total number of trials to be completed) reduces individuals' aversion to cognitively effort...
A passive computational marker for individual differences in non-reinforced learning
Although research about preference formation and modification has classically focused on the role of external reinforcements, there is also increasing evidence for a key role of non-externally reinforced cognitive mechanisms such as attention and memory in preference modification. In a novel paradigm for behavioral change called the Cue-Approach training (CAT) task, preferences are modified via the mere association of images of stimuli with a neutral cue and a rapid motor response, without external reinforcements. The procedure’s efficacy has been replicated across...
Influence of motor and cognitive tasks on time estimation
The passing of time can be precisely measured by using clocks, whereas humans’ estimation of temporal durations is influenced by many physical, cognitive and contextual factors, which distort our internal clock. Although it has been shown that temporal estimation accuracy is impaired by non-temporal tasks performed at the same time, no studies have investigated how concurrent cognitive and motor tasks interfere with time estimation. Moreover, most experiments only tested time intervals of a few seconds. In the present study, participants were asked to perform cognitive...
Motivational signals disrupt metacognitive signals in the human ventromedial prefrontal cortex
A growing body of evidence suggests that, during decision-making, BOLD signal in the ventromedial prefrontal cortex (VMPFC) correlates both with motivational variables - such as incentives and expected values - and metacognitive variables - such as confidence judgments - which reflect the subjective probability of being correct. At the behavioral level, we recently demonstrated that the value of monetary stakes biases confidence judgments, with gain (respectively loss) prospects increasing (respectively decreasing) confidence judgments, even for similar levels of...
Noninvasive stimulation of the ventromedial prefrontal cortex modulates rationality of human decision-making
The framing effect is a bias that affects decision-making depending on whether the available options are presented with positive or negative connotations. Even when the outcome of two choices is equivalent, people have a strong tendency to avoid the negatively framed option because losses are perceived as about twice as salient as gains of the same amount (i.e. loss aversion). The ventromedial prefrontal cortex (vmPFC) is crucial for rational decision-making, and dysfunctions in this region have been linked to cognitive biases, impulsive behavior and gambling...
Dopaminergic challenge dissociates learning from primary versus secondary sources of information
Some theories of human cultural evolution posit that humans have social-specific learning mechanisms that are adaptive specialisations moulded by natural selection to cope with the pressures of group living. However, the existence of neurochemical pathways that are specialised for learning from social information and individual experience is widely debated. Cognitive neuroscientific studies present mixed evidence for social-specific learning mechanisms: some studies find dissociable neural correlates for social and individual learning, whereas others find the same brain...
Stress-sensitive inference of task controllability
Estimating the controllability of the environment enables agents to better predict upcoming events and decide when to engage controlled action selection. How does the human brain estimate controllability? Trial-by-trial analysis of choices, decision times and neural activity in an explore-and-predict task demonstrates that humans solve this problem by comparing the predictions of an actor model with those of a reduced spectator model of their environment. Neural blood oxygen level-dependent responses within striatal and medial prefrontal areas tracked the instantaneous...
Time pressure changes how people explore and respond to uncertainty
How does time pressure influence exploration and decision-making? We investigated this question with several four-armed bandit tasks manipulating (within subjects) expected reward, uncertainty, and time pressure (limited vs. unlimited). With limited time, people have less opportunity to perform costly computations, thus shifting the cost-benefit balance of different exploration strategies. Through behavioral, reinforcement learning (RL), reaction time (RT), and evidence accumulation analyses, we show that time pressure changes how people explore and respond to...
Over- and Underweighting of Extreme Values in Decisions from Sequential Samples
People routinely make decisions based on samples of numerical values. A common conclusion from the literature in psychophysics and behavioral economics is that observers subjectively compress magnitudes, such that extreme values have less sway over choice than prescribed by a normative model (underweighting). However, recent studies have reported evidence for anti-compression, that is, the relative overweighting of extreme values. Here, we investigate potential reasons for this discrepancy in findings and examine the possibility that it reflects adaptive responses to...
Humans perseverate on punishment avoidance goals in multigoal reinforcement learning
Managing multiple goals is essential to adaptation, yet we are only beginning to understand computations by which we navigate the resource demands entailed in so doing. Here, we sought to elucidate how humans balance reward seeking and punishment avoidance goals, and relate this to variation in its expression within anxious individuals. To do so, we developed a novel multigoal pursuit task that includes trial-specific instructed goals to either pursue reward (without risk of punishment) or avoid punishment (without the opportunity for reward). We constructed a...
Sufficient Reliability of the Behavioral and Computational Read-Outs of a Probabilistic Reversal Learning Task
Task-based measures that capture neurocognitive processes can help bridge the gap between brain and behavior. To transfer tasks to clinical application, reliability is a crucial benchmark because it imposes an upper bound to potential correlations with other variables (e.g., symptom or brain data). However, the reliability of many task readouts is low. In this study, we scrutinized the retest reliability of a probabilistic reversal learning task (PRLT) that is frequently used to characterize cognitive flexibility in psychiatric populations. We analyzed data from N...
The effects of induced affect on Pavlovian-instrumental interactions in reinforcement learning
Across species, animals have an intrinsic drive to approach appetitive stimuli and to withdraw from aversive stimuli. In affective science, influential theories of emotion link positive affect with strengthened behavioural approach and negative affect with avoidance. Based on these theories, we predicted that individuals' positive and negative affect levels should particularly influence their behaviour when innate Pavlovian approach/avoidance tendencies conflict with learned instrumental behaviours. Here, across two experiments - exploratory Experiment 1 (N =...
Model-based learning retrospectively updates model-free values
Reinforcement learning (RL) is widely regarded as divisible into two distinct computational strategies. Model-free learning is a simple RL process in which a value is associated with actions, whereas model-based learning relies on the formation of internal models of the environment to maximise reward. Recently, theoretical and animal work has suggested that such models might be used to train model-free behaviour, reducing the burden of costly forward planning. Here we devised a way to probe this possibility in human behaviour. We adapted a two-stage decision task and...
Duration reproduction under memory pressure: Modeling the roles of visual memory load in duration encoding and reproduction
Duration estimates are often biased by the sampled statistical context, yielding the classical central-tendency effect, i.e., short durations are overestimated and long durations underestimated. Most studies of the central-tendency bias have focused primarily on the integration of the sensory measure and the prior information, without considering any cognitive limits. Here, we investigated the impact of cognitive (visual working-memory) load on duration estimation in the duration encoding and reproduction stages. In four experiments, observers had to perform a dual,...
Neurocomputational mechanisms of confidence in self and others
Computing confidence in one's own and others' decisions is critical for social success. While there has been substantial progress in our understanding of confidence estimates about oneself, little is known about how people form confidence estimates about others. Here, we address this question by asking participants undergoing fMRI to place bets on perceptual decisions made by themselves or one of three other players of varying ability. We show that participants compute confidence in another player's decisions by combining distinct estimates of player ability and decision...
The Computational and Neural Substrates of Ambiguity Avoidance in Anxiety
Theoretical accounts have linked anxiety to intolerance of ambiguity. However, this relationship has not been well operationalized empirically. Here, we used computational and neuroimaging methods to characterize anxiety-related differences in aversive decision-making under ambiguity and associated patterns of cortical activity. Adult human participants chose between two urns on each trial. The ratio of tokens (Os and Xs) in each urn determined the probability of receiving electrical stimulation. A number above each urn indicated the magnitude of stimulation that would be...
Spontaneous instrumental avoidance learning in social contexts
Adaptation to our social environment requires learning how to avoid potentially harmful situations, such as encounters with aggressive individuals. Threatening facial expressions can evoke automatic stimulus-driven reactions, but whether their aversive motivational value suffices to drive instrumental active avoidance remains unclear. When asked to freely choose between different action alternatives, participants spontaneously, without instruction or monetary reward, developed a preference for choices that maximized the probability of avoiding angry individuals (sitting...
The impact of feedback on perceptual decision-making and metacognition: Reduction in bias but no change in sensitivity
It is widely believed that feedback improves behavior, but the mechanisms behind this improvement remain unclear. Different theories postulate that feedback has either a direct effect on performance through automatic reinforcement mechanisms or only an indirect effect mediated by a deliberate change in strategy. To adjudicate between these competing accounts, we performed two large experiments on human adults (total N = 518); approximately half the participants received trial-by-trial feedback on a perceptual task, whereas the other half did not receive any...
Combination and competition between path integration and landmark navigation in the estimation of heading direction
Successful navigation requires the ability to compute one’s location and heading from incoming multisensory information. Previous work has shown that this multisensory input comes in two forms: body-based idiothetic cues, from one’s own rotations and translations, and visual allothetic cues, from the environment (usually visual landmarks). However, exactly how these two streams of information are integrated is unclear, with some models suggesting the body-based idiothetic and visual allothetic cues are combined, while others suggest they compete. In this paper we...
Reward learning and working memory: Effects of massed versus spaced training and post-learning delay period
Neuroscience research has illuminated the mechanisms supporting learning from reward feedback, demonstrating a critical role for the striatum and midbrain dopamine system. However, in humans, short-term working memory that is dependent on frontal and parietal cortices can also play an important role, particularly in commonly used paradigms in which learning is relatively condensed in time. Given the growing use of reward-based learning tasks in translational studies in computational psychiatry, it is important to understand the extent of the influence of working memory...
Asymmetric reinforcement learning facilitates human inference of transitive relations
Humans and other animals are capable of inferring never-experienced relations (for example, A > C) from other relational observations (for example, A > B and B > C). The processes behind such transitive inference are subject to intense research. Here we demonstrate a new aspect of relational learning, building on previous evidence that transitive inference can be accomplished through simple reinforcement learning mechanisms. We show in simulations that inference of novel relations benefits from an asymmetric learning policy, where observers update only their...
Rewarding cognitive effort increases the intrinsic value of mental labor
Current models of mental effort in psychology, behavioral economics, and cognitive neuroscience typically suggest that exerting cognitive effort is aversive, and people avoid it whenever possible. The aim of this research was to challenge this view and show that people can learn to value and seek effort intrinsically. Our experiments tested the hypothesis that effort-contingent reward in a working-memory task will induce a preference for more demanding math tasks in a transfer phase, even though participants were aware that they would no longer receive any reward for...
The Jack and Jill adaptive working memory task: Construction, calibration and validation
Visuospatial working memory (VSWM) is essential to human cognitive abilities and is associated with important life outcomes such as academic performance. Recently, a number of reliable measures of VSWM have been developed to help understand psychological processes and for practical use in education. We sought to extend this work using Item Response Theory (IRT) and Computerised Adaptive Testing (CAT) frameworks to construct, calibrate and validate a new adaptive, computerised, and open-source VSWM test. We aimed to overcome the limitations of previous instruments and...
No need to choose: independent regulation of cognitive stability and flexibility challenges the stability-flexibility tradeoff
Adaptive behavior requires the ability to focus on a current task and protect it from distraction (cognitive stability), as well as the ability to rapidly switch to another task in light of changing circumstances (cognitive flexibility). Cognitive stability and flexibility have been conceptualized as opposite endpoints on a stability-flexibility trade-off continuum, implying an obligatory reciprocity between the two: Greater flexibility necessitates less stability, and vice versa. Surprisingly, rigorous empirical tests of this critical assumption are lacking. Here, we...
Valence biases in reinforcement learning shift across adolescence and modulate subsequent memory
As individuals learn through trial and error, some are more influenced by good outcomes, while others weight bad outcomes more heavily. Such valence biases may also influence memory for past experiences. Here, we examined whether valence asymmetries in reinforcement learning change across adolescence, and whether individual learning asymmetries bias the content of subsequent memory. Participants ages 8-27 learned the values of point machines, after which their memory for trial-unique images presented with choice outcomes was assessed. Relative to children and adults,...
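A common way to formalize a valence bias of this kind is a delta-rule update with separate learning rates for positive and negative prediction errors. The sketch below is illustrative only and is not taken from the study; the function name `update_value` and the parameters `alpha_pos` and `alpha_neg` are hypothetical.

```python
def update_value(value, outcome, alpha_pos, alpha_neg):
    """Delta-rule update with valence-dependent learning rates (illustrative).

    value     -- current value estimate for the chosen option
    outcome   -- outcome received on this trial
    alpha_pos -- learning rate applied when the outcome is better than expected
    alpha_neg -- learning rate applied when the outcome is worse than expected
    """
    prediction_error = outcome - value
    # Choose the learning rate according to the sign of the prediction error.
    alpha = alpha_pos if prediction_error >= 0 else alpha_neg
    return value + alpha * prediction_error


# A learner with alpha_pos > alpha_neg moves further after good news than bad news.
after_good = update_value(0.5, 1.0, alpha_pos=0.4, alpha_neg=0.1)
after_bad = update_value(0.5, 0.0, alpha_pos=0.4, alpha_neg=0.1)
```

With `alpha_pos` larger than `alpha_neg`, the estimate shifts more after better-than-expected outcomes; reversing the two rates gives a learner that weights bad outcomes more heavily.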
Asymmetric effects of acute stress on cost and benefit learning
Humans are continuously exposed to stressful challenges in everyday life. Such stressful events trigger a complex physiological reaction - the fight-or-flight response - that can hamper flexible decision-making and learning. Inspired by key neural and peripheral characteristics of the fight-or-flight response, here, we ask whether acute stress changes how humans learn about costs and benefits. Healthy adults were randomly exposed to an acute stress (age mean=23.48, 21/40 female) or no-stress control (age mean=23.80, 22/40 female) condition, after...
Competing cognitive pressures on human exploration in the absence of trade-off with exploitation
Exploring novel environments through sequential sampling is essential for efficient decision-making under uncertainty. In the laboratory, human exploration has been studied in situations where exploration is traded against reward maximisation. By design, these ‘explore-exploit’ dilemmas confound the behavioural characteristics of exploration with those of the trade-off itself. Here we designed a sequential sampling task where exploration can be studied and compared in the presence and absence of trade-off with exploitation. Detailed model-based analyses of choice...
Optimism where there is none: Asymmetric belief updating observed with valence-neutral life events
How people update their beliefs when faced with new information is integral to everyday life. A sizeable body of literature suggests that people’s belief updating is optimistically biased, such that their beliefs are updated more in response to good news than bad news. However, recent research demonstrates that findings previously interpreted as evidence of optimistic belief updating may be the result of flaws in experimental design, rather than motivated reasoning. In light of this controversy, we conduct three pre-registered variations of the standard belief updating...
Foraging as sampling without replacement: A Bayesian statistical model for estimating biases in target selection
Foraging entails finding multiple targets sequentially. In humans and other animals, a key observation has been a tendency to forage in ‘runs’ of the same target type. This tendency is context-sensitive, and in humans, it is strongest when the targets are difficult to distinguish from the distractors. Many important questions have yet to be addressed about this and other tendencies in human foraging, and a key limitation is a lack of precise measures of foraging behaviour. The standard measures tend to be run statistics, such as the maximum run length and the number of...
Time to pay attention? Information search explains amplified framing effects under time pressure
Decades of research have established the ubiquity and importance of choice biases, such as the framing effect, yet why these seemingly irrational behaviors occur remains unknown. A prominent dual-system account maintains that alternate framings bias choices because of the unchecked influence of quick, affective processes, and findings that time pressure increases the framing effect have provided compelling support. Here, we present a novel alternative account of magnified framing biases under time pressure that emphasizes shifts in early visual attention and strategic...
Fast Evidence Accumulation in Social Anxiety Disorder Enhances Decision Making in a Probabilistic Reward Task
Choices and response times in two-alternative decision-making tasks can be modeled by assuming that individuals steadily accrue evidence in favor of each alternative until a response boundary for one of them is crossed, at which point that alternative is chosen. Prior studies have reported that evidence accumulation during decision-making tasks takes longer in adults with psychopathology than in healthy controls, indicating that slow evidence accumulation may be transdiagnostic. However, few studies have examined perceptual decision making in anxiety disorders, where...
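The accumulation-to-boundary idea described above can be sketched as a noisy random walk between two bounds. This is a minimal illustration under assumed parameter values, not the model fitted in the study; `simulate_accumulation` and all of its arguments are hypothetical names.

```python
import random


def simulate_accumulation(drift=0.5, boundary=1.0, noise_sd=1.0,
                          dt=0.01, max_steps=100_000, seed=None):
    """Simulate one two-alternative decision as evidence accumulation.

    Evidence starts at 0 and changes by drift*dt plus Gaussian noise on each
    step. The trial ends when evidence reaches +boundary ("upper") or
    -boundary ("lower"). Returns (choice, decision_time); choice is None if
    no boundary is reached within max_steps.
    """
    rng = random.Random(seed)
    evidence = 0.0
    for step in range(1, max_steps + 1):
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        if evidence >= boundary:
            return "upper", step * dt
        if evidence <= -boundary:
            return "lower", step * dt
    return None, max_steps * dt


choice, decision_time = simulate_accumulation(seed=1)
```

In this sketch a larger `drift` corresponds to faster evidence accumulation and, on average, shorter decision times for the favored alternative.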
Dissociable influences of reward and punishment on adaptive cognitive control
To invest effort into any cognitive task, people must be sufficiently motivated. Whereas prior research has focused primarily on how the cognitive control required to complete these tasks is motivated by the potential rewards for success, it is also known that control investment can be equally motivated by the potential negative consequence for failure. Previous theoretical and experimental work has yet to examine how positive and negative incentives differentially influence the manner and intensity with which people allocate control. Here, we develop and test a...
Dynamic decision policy reconfiguration under outcome uncertainty
In uncertain or unstable environments, sometimes the best decision is to change your mind. To shed light on this flexibility, we evaluated how the underlying decision policy adapts when the most rewarding action changes. Human participants performed a dynamic two-armed bandit task that manipulated the certainty in relative reward (conflict) and the reliability of action-outcomes (volatility). Continuous estimates of conflict and volatility contributed to shifts in exploratory states by changing both the rate of evidence accumulation (drift rate) and the amount of...
Sources of confidence in value-based choice
Confidence, the subjective estimate of decision quality, is a cognitive process necessary for learning from mistakes and guiding future actions. The origins of confidence judgments resulting from economic decisions remain unclear. We devise a task and computational framework that allowed us to formally tease apart the impact of various sources of confidence in value-based decisions, such as uncertainty emerging from encoding and decoding operations, as well as the interplay between gaze-shift dynamics and attentional effort. In line with canonical decision theories,...
Computational mechanisms of distributed value representations and mixed learning strategies
Learning appropriate representations of the reward environment is challenging in the real world where there are many options, each with multiple attributes or features. Despite the existence of alternative solutions for this challenge, neural mechanisms underlying emergence and adoption of value representations and learning strategies remain unknown. To address this, we measure learning and choice during a multi-dimensional probabilistic learning task in humans and train recurrent neural networks (RNNs) to capture our experimental observations. We find that human...
Four-armed bandit task with reward and punishment manipulations
178 Prolific workers completed an online experiment in return for monetary compensation. Participants completed a reinforcement learning task with four cards and two reward conditions. On each trial of the task, two of the four cards were offered by the computer, and participants were asked to pick one. Each card led to a reward with its own probability, which drifted independently across trials. The difference between conditions was in whether participants won extra points or avoided the loss of points. All participants completed the OCI-R, and a partial sample also completed...
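The trial structure described above (four cards, two offered per trial, independently drifting reward probabilities) can be sketched as follows. This is an illustrative simulation, not the experiment's code: the Gaussian random walk, the drift size `drift_sd`, the probability bounds, the starting range, and the random choice policy are all assumptions, and `simulate_card_task` is a hypothetical name.

```python
import random


def simulate_card_task(n_trials=100, n_cards=4, drift_sd=0.03,
                       p_min=0.05, p_max=0.95, seed=0):
    """Simulate offers and outcomes in a four-card task (illustrative).

    Each card has its own reward probability that follows an independent
    Gaussian random walk, clipped to [p_min, p_max] (assumed values).
    On every trial two distinct cards are offered and one is picked at
    random, standing in for a participant's choice.
    """
    rng = random.Random(seed)
    probs = [rng.uniform(0.25, 0.75) for _ in range(n_cards)]  # assumed start
    trials = []
    for t in range(n_trials):
        offered = rng.sample(range(n_cards), 2)   # two distinct cards
        choice = rng.choice(offered)              # placeholder choice policy
        rewarded = rng.random() < probs[choice]
        trials.append({"trial": t, "offered": tuple(offered),
                       "choice": choice, "rewarded": rewarded})
        # Drift every card's probability independently for the next trial.
        probs = [min(p_max, max(p_min, p + rng.gauss(0.0, drift_sd)))
                 for p in probs]
    return trials


trials = simulate_card_task()
```

The two conditions described in the text (winning points versus avoiding a loss of points) would differ only in how `rewarded` is translated into points, which this sketch leaves out.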