Making a computer system understand complex image scenes is challenging. Complex image scenes often contain multiple objects, which are not isolated but related to each other in various aspects. Identifying individual object categories may not be enough to understand such scenes: categories exist at multiple granularities, and we need this knowledge to capture semantic correlations thoroughly. In addition, objects engage in numerous interactions and relationships. We need to localize these objects, recognize scene environments, and figure out their interactions and relationships. In computer vision, recognizing what the categories are, where the objects are, and how objects interact with each other is often formulated as the classification, segmentation, and relationship recognition problems.
Existing approaches often tackle all these formulations in supervised settings. Despite their tremendous progress, we identify three major limitations. 1) Human annotation is too time- and labor-consuming to scale up to real-world scenarios. 2) The sets of human labels are pre-selected arbitrarily, providing limited and biased perspectives for understanding images. 3) Such supervised methods conduct inference in terms of discrete labeling: they isolate labels from each other, ignoring the similarities and dissimilarities among labels. They can also only assign images to the known labels seen during training, and fail to recognize novel images sampled from unknown labels at test time.
In this dissertation, we address the issues of current supervised approaches by replacing discrete labeling with grouping and by using minimal human labels. Specifically, we tackle the recognition problem from four perspectives. 1) We address weakly-supervised semantic segmentation, where partial semantic pixel labels are used. 2) We address unsupervised semantic segmentation, where only low-level edge detections are used. 3) We address unsupervised concurrent image classification and segmentation in a single framework, where
our model does not use any human labels. 4) We address unsupervised human-object interaction recognition, where semantic and instance pixel labels, but no relationship labels, are used. This dissertation explores more general and robust approaches to understanding highly complex and fast-changing real-world scenes.