
How Well Do Deep Learning Models Capture Human Concepts? The Case of the Typicality Effect

Abstract

This study examines the alignment of deep learning model representations with those of humans, focusing on the typicality effect, where certain instances of a category are considered more representative than others. Previous research, limited to single modalities and few concepts, showed modest correlations with human typicality judgments. This study expands the scope by evaluating a wider array of language (N=8) and vision (N=10) models. It also considers the combined predictions of language+vision model pairs, alongside a multimodal CLIP-based model. The investigation covers a larger range of concepts (N=27) than prior work. Our findings reveal that language models align more closely with human typicality judgments than vision models. Additionally, combined language+vision models, as well as the multimodal CLIP model, predict human typicality data better than unimodal models. This study advances the understanding of machine learning models' conceptual alignment with human cognition and contributes a new image set for evaluating concepts in vision models.
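The alignment measure the abstract describes can be illustrated concretely: score each exemplar's typicality with a model, then rank-correlate those scores with human ratings. The sketch below is a minimal, hypothetical version using a CLIP text encoder via Hugging Face transformers; the checkpoint name, the "bird" exemplars, and the rating values are illustrative assumptions, not the paper's actual stimuli or procedure.

```python
import torch
from scipy.stats import spearmanr
from transformers import CLIPModel, CLIPProcessor

# Hypothetical data: exemplars of one concept ("bird") and illustrative
# human typicality ratings (higher = more typical); not from the paper.
exemplars = ["robin", "sparrow", "penguin", "ostrich"]
human_ratings = [6.9, 6.5, 2.8, 2.1]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Embed the category label and each exemplar name with CLIP's text encoder.
texts = ["bird"] + exemplars
inputs = processor(text=texts, return_tensors="pt", padding=True)
with torch.no_grad():
    emb = model.get_text_features(**inputs)
emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-normalize for cosine similarity

# Model "typicality" score = cosine similarity of each exemplar to the category.
category, items = emb[0], emb[1:]
model_scores = (items @ category).tolist()

# Alignment = Spearman rank correlation between model scores and human ratings.
rho, p = spearmanr(model_scores, human_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```

The same correlation-based comparison extends to the other settings the abstract mentions: vision models would score images of each exemplar against the category, and language+vision pairs would combine the two sets of scores before correlating with human data.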
