Although object detection AI plays an important role in many critical systems, corresponding Explainable AI (XAI) methods remain very limited. Here we first developed FullGrad-CAM and FullGrad-CAM++ by extending traditional gradient-based methods to generate object-specific explanations with higher plausibility. Since human attention may reflect features that are more interpretable to humans, we explored the possibility of using it as guidance to learn how to combine the explanatory information in the detector model into an XAI saliency map that is interpretable (plausible) to humans. Interestingly, we found that human attention maps had higher faithfulness in explaining the detector model than existing saliency-based XAI methods. By using trainable activation functions and smoothing kernels to maximize the similarity of the XAI saliency map to human attention maps, the generated maps had higher faithfulness and plausibility than both existing XAI methods and human attention maps. The learned functions were model-specific and generalized well to other databases.