People often navigate new environments and must learn about how actions map to outcomes to achieve their goals. In this paper, we are concerned with how people direct their search and trade off between selecting informative actions and actions that will be most immediately rewarding when they are faced with new tasks. We find that some people selected globally informative actions and were able to generalize from few observations in order to learn new reward structures efficiently. These participants also displayed the ability to transfer knowledge across similar tasks. However, a consistent proportion of participants behaved sub-optimally, caring more about observing novel information than about maximizing reward. Across four experiments, we present evidence that participants' motivation to explore was influenced by 1) how much they already knew about the underlying task structure and 2) whether their observations remained available. We discuss possible explanations behind people's exploratory drive.