Recent research has placed episodic reinforcement learning (RL) alongside model-free and model-based RL on the list of processes centrally involved in human reward-based learning. In the present work, we extend the unified account of model-free and model-based RL developed by Wang et al. (2017) to further integrate episodic learning. In this account, a generic model-free "meta-learner" learns to deploy and coordinate all of these RL algorithms. The meta-learner is trained on a broad set of novel tasks with limited exposure to each task, such that it learns to learn about new tasks. We show that when equipped with an episodic memory system inspired by theories of reinstatement and gating, the meta-learner learns to use the same pattern of episodic, model-free, and model-based RL observed in humans in a task designed to dissociate the influences of these learning algorithms. We discuss implications and predictions of the model.
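To make the reinstatement-and-gating idea concrete, the episodic component can be pictured as a key-value store whose retrieved states are blended back into the agent's recurrent state through a learned gate. The PyTorch sketch below is a minimal illustration under assumed names (GatedEpisodicMemory, write, read) and an assumed single-best-match retrieval rule; it is not the paper's implementation.

```python
import torch
import torch.nn as nn

class GatedEpisodicMemory(nn.Module):
    """Illustrative sketch: episodic memory with gated reinstatement.

    Stores (context key, recurrent state) pairs; on retrieval, the
    best-matching stored state is reinstated into the ongoing state
    through a learned gate. All names and the retrieval rule are
    assumptions for exposition, not the authors' code.
    """

    def __init__(self, key_dim: int, state_dim: int):
        super().__init__()
        self.keys: list[torch.Tensor] = []    # context embeddings at storage time
        self.values: list[torch.Tensor] = []  # recurrent states to reinstate
        # Learned reinstatement gate: decides how much of the retrieved
        # memory to mix into the current recurrent state.
        self.gate = nn.Linear(key_dim + state_dim, state_dim)

    def write(self, key: torch.Tensor, state: torch.Tensor) -> None:
        # Store a snapshot of the current context and recurrent state.
        self.keys.append(key.detach())
        self.values.append(state.detach())

    def read(self, query: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        if not self.keys:
            return state
        # Retrieve the stored state whose key best matches the current context.
        sims = torch.stack(
            [torch.cosine_similarity(query, k, dim=0) for k in self.keys]
        )
        retrieved = self.values[int(sims.argmax())]
        # Gated reinstatement: blend the retrieved state with the ongoing one.
        g = torch.sigmoid(self.gate(torch.cat([query, state], dim=0)))
        return g * retrieved + (1 - g) * state


# Minimal usage: write one episode, then reinstate given a new context.
mem = GatedEpisodicMemory(key_dim=32, state_dim=64)
mem.write(torch.randn(32), torch.randn(64))
new_state = mem.read(torch.randn(32), torch.randn(64))
```

Because the gate is learned, the meta-learner itself can discover when a reinstated episode should override ongoing working memory and when it should be ignored, which is the coordinating role the account assigns to it.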