Our decisions are accompanied by a sense of confidence, a metacognitive assessment of how likely those decisions are to be correct, but the mechanisms underlying this capacity remain poorly understood. Recent behavioral and neural data suggest that decisions are made in accord with an optimal `balance-of-evidence' rule, whereas confidence is estimated using a heuristic `response-congruent-evidence' rule. We developed a deep neural network model optimized to classify images and to predict its own likelihood of being correct, and found that this model naturally accounts for several key behavioral dissociations between decisions and confidence ratings. Further investigation revealed that neither the `balance-of-evidence' rule nor the `response-congruent-evidence' rule fully characterized the strategy the model learned. We argue instead that the model learns to flexibly approximate the distribution of its training data and, analogously, that apparently suboptimal features of human confidence ratings may arise from optimization for the statistics of naturalistic settings.
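To make the contrast concrete, the two rules can be formalized as follows (a minimal sketch in notation introduced here, assuming a two-alternative task with evidence $e_A$ and $e_B$ for the two options; the definitions in the main text may differ in detail):
\[
\text{choice} =
\begin{cases}
A, & e_A > e_B,\\
B, & \text{otherwise,}
\end{cases}
\qquad
c_{\mathrm{BE}} \propto |e_A - e_B|,
\qquad
c_{\mathrm{RCE}} \propto \max(e_A, e_B).
\]
Under the balance-of-evidence rule, evidence against the chosen option lowers confidence; under the response-congruent-evidence rule, that evidence is simply ignored, which is what produces dissociations between decisions and confidence ratings.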
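As a rough illustration of the model setup described above, the following is a minimal sketch of a network trained jointly to classify images and to predict its own probability of being correct (illustrative only; the class name \texttt{ConfidenceNet}, the trunk architecture, and the unweighted joint loss are placeholders, not the implementation used here):
\begin{verbatim}
# Minimal sketch of a jointly trained decision + confidence network
# (illustrative placeholders, not the implementation used in this work).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidenceNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        # Shared trunk; a full model would use a deeper convolutional stack.
        self.trunk = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.classifier = nn.Linear(32 * 16, n_classes)  # decision head
        self.confidence = nn.Linear(32 * 16, 1)          # metacognitive head

    def forward(self, x):
        h = self.trunk(x)
        return self.classifier(h), torch.sigmoid(self.confidence(h))

def joint_loss(logits, conf, labels):
    # Train the decision head with cross-entropy and the confidence head
    # to predict whether the decision is correct on each trial.
    correct = (logits.argmax(dim=1) == labels).float().unsqueeze(1)
    return (F.cross_entropy(logits, labels)
            + F.binary_cross_entropy(conf, correct))
\end{verbatim}
Because the confidence head is optimized only to predict correctness on the training distribution, nothing in this objective forces it to implement either of the two rules above, which is the point at issue.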