Having low confidence in a decision can justify the costly search for extra information. Rich literatures have separately modelled the metacognitive monitoring processes involved in confidence formation and the control processes guiding search, but these two processes have yet to be treated together. Here, we model the two as inference and action in a unified partially observable Markov decision problem where decision confidence is generated by more sophisticated postdecisional or second-order models. Our work highlights how different metacognitive monitoring architectures generate diverse relationships between object- and meta-level accuracy, as well as different normative patterns of information collection in the face of costs. In particular, we demonstrate that decreased metacognitive efficiency can prescribe either increased or decreased search, depending on the underlying model of metacognitive confidence. More broadly, our work shows that it is crucial to model interactions between metacognitive monitoring and control, whether in information search or beyond.