In reinforcement learning tasks, people learn the values of options relative to other options in the local context. Prior research suggests that relative value learning is enhanced when choice contexts are temporally clustered in a blocked sequence rather than randomly interleaved. The present study further investigated the effects of blocked versus interleaved training using a choice task designed to distinguish among different contextual encoding models. Our results showed that the format in which contexts are experienced can lead to qualitatively distinct forms of relative value learning, a conclusion supported by a combination of model-free and model-based analyses. In the blocked condition, choice behavior was most consistent with a reference point model in which outcomes are encoded relative to a dynamic estimate of the contextual average reward. In contrast, the interleaved condition was best described by a range-frequency encoding model. We propose that blocked training makes it easier to track contextual outcome statistics, such as the average reward, which may then be used to relativize the values of experienced outcomes. When contexts are interleaved, range-frequency encoding may serve as a more efficient means of storing option values in memory for later retrieval.
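
To make the two candidate encoding schemes concrete, the sketch below illustrates one way each might relativize outcomes. The delta-rule update for the contextual average, the learning rate `alpha`, and the range-frequency weight `w` are illustrative assumptions in the spirit of the models named above (the range-frequency form follows Parducci's range-frequency theory), not the exact parameterization used in the study.

```python
def reference_point_values(rewards, alpha=0.1):
    """Reference point sketch: each outcome is encoded relative to a
    running (delta-rule) estimate of the contextual average reward."""
    v_context = 0.0            # dynamic estimate of the contextual average
    relative = []
    for r in rewards:
        relative.append(r - v_context)        # encode outcome vs. reference point
        v_context += alpha * (r - v_context)  # update the contextual average
    return relative

def range_frequency_value(x, context_outcomes, w=0.5):
    """Range-frequency sketch: subjective value is a weighted blend of an
    outcome's position within the contextual range and its relative rank."""
    lo, hi = min(context_outcomes), max(context_outcomes)
    range_term = (x - lo) / (hi - lo) if hi > lo else 0.5
    rank = sum(o <= x for o in context_outcomes)  # rank of x among outcomes
    n = len(context_outcomes)
    freq_term = (rank - 1) / (n - 1) if n > 1 else 0.5
    return w * range_term + (1 - w) * freq_term
```

For example, given contextual outcomes [10, 20, 30, 40], the reference point model encodes each successive reward relative to a gradually rising average, while the range-frequency value of 30 is 0.5 · (2/3) + 0.5 · (2/3) ≈ 0.67, reflecting its position in both the range and the rank order of the context.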