Analogies to stochastic optimization are common in developmental psychology, describing a gradual reduction in randomness (cooling off) over the lifespan. Yet in the absence of concrete empirical comparisons, this analogy remains ambiguous. Using data from n=281 participants aged 5 to 55, we show that cooling off does not apply only to the single dimension of randomness. Rather, development resembles an optimization process along multiple dimensions of learning (i.e., reward generalization, uncertainty-directed exploration, and random temperature). What begins in childhood as large adjustments to the parameters that define learning plateaus and converges to efficient parameter constellations in adulthood. The developmental trajectory of human parameters is strikingly similar to that of several stochastic optimization algorithms, yet we observe intriguing differences in convergence. Notably, none of the optimization algorithms discovered reliably better regions of the strategy space than adult participants did, suggesting a remarkable efficiency of human development.
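To make the three learning dimensions concrete, the sketch below shows one way such a model is often parameterized in the reward-learning literature; it is an illustration, not the authors' implementation. The hypothetical functions and parameter names (lambda for generalization breadth, beta for the uncertainty bonus, tau for random temperature) are assumptions for exposition only, as are the example values.

```python
# Minimal sketch, assuming a UCB-style value rule with softmax choice:
# lambda  -> reward generalization (how far observed rewards spread to similar options)
# beta    -> uncertainty-directed exploration (bonus on uncertain options)
# tau     -> random temperature (how noisy the final choice is)
import numpy as np

def rbf_generalization(distances, lam):
    """Radial-basis generalization: larger lam spreads reward information
    to more distant (less similar) options."""
    return np.exp(-distances**2 / (2 * lam**2))

def choice_probabilities(means, uncertainties, beta, tau):
    """Softmax over upper-confidence-bound values."""
    ucb = means + beta * uncertainties   # uncertainty-directed exploration bonus
    logits = ucb / tau                   # temperature scales choice randomness
    logits -= logits.max()               # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Illustrative values only: a high-temperature ("younger") setting yields
# near-uniform choices; a low-temperature ("cooled-off") setting is greedier.
means = np.array([0.2, 0.5, 0.4])
sds   = np.array([0.3, 0.1, 0.2])
print(choice_probabilities(means, sds, beta=0.5, tau=1.0))   # more random
print(choice_probabilities(means, sds, beta=0.5, tau=0.1))   # more exploitative
```

In this framing, "cooling off" along a single dimension would correspond to lowering tau alone over the lifespan, whereas the multidimensional account described above corresponds to joint changes in lambda, beta, and tau.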