Metacontrol of reinforcement learning
Modern theories of reinforcement learning posit two systems competing for control of behavior: a "model-free" or "habitual" system that learns cached state-action values, and a "model-based" or "goal-directed" system that learns a world model, which it then uses to plan actions. I will argue that humans can adaptively invoke model-based computation when its benefits outweigh its costs, and that a simple metacontrol learning rule can capture the dynamics of this cost-benefit analysis. Neuroimaging evidence points to the role of cognitive control regions in this computation.
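To make the arbitration concrete, here is a minimal sketch in a toy tabular MDP: a model-free learner caches Q-values via temporal-difference updates, a model-based learner fits a world model and plans by value iteration, and a metacontroller treats the choice of controller as its own reinforcement learning problem, charging an effort cost whenever planning is invoked. The specific update rule, the `plan_cost` parameter, and the environment are illustrative assumptions for this sketch, not the actual metacontrol model the abstract refers to.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy tabular MDP (an assumed stand-in for a task environment) ---
n_states, n_actions, gamma = 5, 2, 0.95
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # true dynamics
R = rng.normal(size=(n_states, n_actions))                        # true mean rewards

# --- Model-free system: cached Q-values, updated by TD learning ---
Q_mf = np.zeros((n_states, n_actions))

# --- Model-based system: learned world model, used to plan ---
counts = np.ones((n_states, n_actions, n_states))  # transition pseudo-counts
R_hat = np.zeros((n_states, n_actions))            # learned reward estimates

def mb_plan(n_iters=50):
    """Plan with the learned model via value iteration: the costly computation."""
    P_hat = counts / counts.sum(-1, keepdims=True)
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        Q = R_hat + gamma * P_hat @ Q.max(-1)
    return Q

# --- Metacontroller: learns when planning's benefit outweighs its cost ---
Q_ctrl = np.zeros((n_states, 2))  # value of deploying controller 0 (MF) or 1 (MB)
plan_cost = 0.05                  # assumed fixed effort cost per planning episode

s = 0
for t in range(5000):
    # Pick a controller (epsilon-greedy at the meta level), then act on its Q-values.
    c = int(Q_ctrl[s].argmax()) if rng.random() > 0.1 else int(rng.integers(2))
    Q = mb_plan() if c == 1 else Q_mf
    a = int(Q[s].argmax()) if rng.random() > 0.1 else int(rng.integers(n_actions))

    s2 = int(rng.choice(n_states, p=P[s, a]))
    r = R[s, a] + rng.normal(scale=0.1)

    # Both systems learn from the same experience.
    Q_mf[s, a] += 0.1 * (r + gamma * Q_mf[s2].max() - Q_mf[s, a])  # TD update
    counts[s, a, s2] += 1
    R_hat[s, a] += 0.1 * (r - R_hat[s, a])

    # Metacontrol learning rule (one simple assumed form): controller choice
    # is itself rewarded, with planning's effort cost subtracted from its
    # payoff, so the agent learns where model-based computation pays off.
    r_meta = r - plan_cost * (c == 1)
    Q_ctrl[s, c] += 0.05 * (r_meta + gamma * Q_ctrl[s2].max() - Q_ctrl[s, c])

    s = s2
```

The design choice worth noting is that the cost-benefit analysis is never computed explicitly: because planning's cost is folded into the metacontroller's reward, ordinary value learning over controllers converges toward invoking model-based computation only in states where its payoff exceeds `plan_cost`, which is one way a simple learning rule can capture the arbitration dynamics.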

