This paper explores the minimal knowledge a listener needs to compensate for phonological assimilation, one kind of phonological process responsible for variation in speech. We used standard automatic speech recognition models to represent English and French listeners. We found that, first, some types of models show language-specific assimilation patterns comparable to those shown by human listeners. Like English listeners, when trained on English, the models compensate more for place assimilation than for voicing assimilation, and like French listeners, the models show the opposite pattern when trained on French. Second, the models which best predict the human pattern use contextually-sensitive acoustic models and language models, which capture allophony and phonotactics, but do not make use of higher-level knowledge of a lexicon or word boundaries. Finally, some models overcompensate for assimilation, showing a (super-human) ability to recover the underlying form even in the absence of the triggering phonological context, pointing to an incomplete neutralization not exploited by human listeners.