How do people build trust with artificial agents? Here, we study a key component of interpersonal trust: people's ability to evaluate the competence of another agent across repeated interactions. Prior work has largely focused on the appraisal of simple, static skills; in contrast, we probe competence evaluations in a rich setting with agents that learn over time. Participants played a video game involving physical reasoning, paired with one of four artificial agents that suggested moves each round. We measured participants' decisions to accept or revise their partner's suggestions to understand how they evaluated their partner's ability. Overall, participants collaborated successfully with their agent partners. When revising their partner's suggestions, they made sophisticated inferences about their partner's competence from its prior behavior. Our results provide a quantitative measure of how people integrate a partner's competence into their own decisions and may help facilitate better coordination between humans and artificial agents.