Listeners track distributions of speech sounds along perceptual dimensions. We introduce a method for evaluating hypotheses about what those dimensions are, using a cognitive model whose prior distribution is estimated directly from speech recordings. We use this method to evaluate two speaker normalization algorithms against human data. Simulations show that representations that are normalized across speakers predict human discrimination data better than unnormalized representations, consistent with previous research. Results further reveal differences across normalization methods in how well each predicts human data. This work provides a framework for evaluating hypothesized representations of speech and lays the groundwork for testing models of speech perception on natural speech recordings from ecologically valid settings.