eScholarship
Open Access Publications from the University of California

Demystify Deep-learning AI for Object Detection using Human Attention Data

Creative Commons BY 4.0 license
Abstract

Here we present a new Explainable AI (XAI) method that probes the functional partition of AI models by comparing the features attended to at different layers with human attention driven by diverse task demands. We applied this method to explain the object detector Yolo-v5s in multi-category and single-category object detection tasks. The model's neck showed higher similarity to human attention during object detection, indicating a reliance on diagnostic features, whereas its backbone showed higher similarity to human attention during passive viewing, indicating that it encodes salient local features. Building on this understanding of its functional partition, we then used Yolo-v5s as a model of human cognition: a comparative analysis against human attention during explanation revealed that humans attended to a combination of diagnostic and salient features when explaining multi-category general object detection, but mainly to diagnostic features when explaining single-category human/vehicle detection in driving scenarios.
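The core comparison above can be sketched as a similarity score between a model layer's spatial attention map and a human attention heatmap. This is a minimal sketch, not the authors' implementation: it assumes (hypothetically) that both maps have already been extracted as 2D arrays of the same shape, and it uses Pearson correlation as the similarity metric.

```python
import numpy as np

def attention_similarity(model_map, human_map):
    """Pearson correlation between two spatial attention maps.

    Hypothetical setup: `model_map` is a layer-wise saliency map from a
    detector (e.g. a Yolo-v5s backbone or neck layer) and `human_map` is
    a human fixation heatmap, both 2D arrays of the same shape. Higher
    values mean the layer attends to similar image regions as humans.
    """
    a = np.asarray(model_map, dtype=float).ravel()
    b = np.asarray(human_map, dtype=float).ravel()
    # Z-score each map; the small epsilon guards against constant maps.
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float(np.mean(a * b))
```

Under this framing, the paper's finding would correspond to the neck layers scoring higher against detection-task attention and the backbone scoring higher against passive-viewing attention.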
