Most prosocial and antisocial behaviors affect ourselves and others simultaneously. To know whether to repeat choices that help or harm, we must learn from their outcomes. However, the neurocomputational processes supporting such simultaneous learning remain poorly understood. In this pre-registered study, participants from two independent samples (N=89) learned to make choices that simultaneously affected themselves and another person. Detailed model comparison showed that people integrate self- and other-relevant information into a single cached value per choice, but update this value asymmetrically based on different types of prediction errors, distinguished by target (e.g., self, other) and valence (e.g., positive, negative). People who acquire more prosocial patterns are more sensitive to information about how their choices affect others. In contrast, those with higher levels of subclinical psychopathic traits are relatively insensitive to unexpected outcomes for others and more sensitive to unexpected outcomes for themselves. Model-based neuroimaging revealed distinct brain regions tracking prediction errors guided by the asymmetric value update. These results demonstrate that the way people distinctly encode self- and other-relevant outcomes resulting from a particular behavior guides how desirable the same behavior will be in the future, regardless of whether it is mutually beneficial or costly, instrumentally harmful, or altruistic.
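
As a rough illustration of the asymmetric update described above, the sketch below assumes a Rescorla-Wagner-style rule in which a single cached value per choice is nudged by two prediction errors (one for the self outcome, one for the other person's outcome), each scaled by a learning rate that depends on target and valence. The functional form, the names `LearningRates` and `update_value`, and all parameter values are hypothetical and chosen for illustration; they are not taken from the model reported in the study.

```python
from dataclasses import dataclass


@dataclass
class LearningRates:
    """Hypothetical learning rates, one per target-by-valence combination."""
    self_pos: float
    self_neg: float
    other_pos: float
    other_neg: float


def update_value(value: float, outcome_self: float, outcome_other: float,
                 rates: LearningRates) -> float:
    """Return the updated cached value for one choice after one trial.

    Assumption: each outcome is compared against the same cached value,
    and the two scaled prediction errors are summed.
    """
    pe_self = outcome_self - value    # prediction error for the self outcome
    pe_other = outcome_other - value  # prediction error for the other's outcome

    # Pick the learning rate by valence (sign) of each prediction error.
    lr_self = rates.self_pos if pe_self >= 0 else rates.self_neg
    lr_other = rates.other_pos if pe_other >= 0 else rates.other_neg

    return value + lr_self * pe_self + lr_other * pe_other


if __name__ == "__main__":
    # Illustrative learner that weights gains for self more than gains for other.
    rates = LearningRates(self_pos=0.4, self_neg=0.2, other_pos=0.1, other_neg=0.3)
    # A choice that benefits the self (+1) and harms the other (-1).
    print(update_value(0.0, outcome_self=1.0, outcome_other=-1.0, rates=rates))
```

Under these assumptions, a learner with low `other_pos` and `other_neg` would barely revise a choice's value after unexpected outcomes for the other person, which is one way to picture the reduced sensitivity to others' outcomes described in the abstract.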