Problems and Solutions: Machine Learning Approaches for a Dynamic Ocean
eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Abstract

Advancements in observational methods and data collection techniques have empowered oceanographers to gather extensive data on a wide range of oceanic phenomena. Optical imaging systems have provided unprecedented insight into the microscopic world of marine plankton as well as the structure, health, biodiversity, and ecological dynamics of coral reefs. Advances in low-power autonomous acoustic recording devices have enabled continuous long-term monitoring of marine mammals and ocean noise. These data-driven methods involve the collection, analysis, and interpretation of large datasets to gain insights. Although machine learning offers the potential to automate the analysis of large oceanographic datasets, its use in this context is complicated by the high spatiotemporal variability and noise inherent in these datasets. This thesis explores state-of-the-art machine learning techniques tailored to extracting valuable information from dynamic oceanographic datasets. To obtain a comprehensive understanding of the problem, instances of dataset shift and noise are examined in three distinct case studies spanning the vision and acoustic domains. The first case study focuses on the problem of novelty detection and class imbalance in the context of plankton image recognition using images from the WHOI-Plankton dataset. The second case study explores the problem of object detection when samples are collected from different environments or under varying conditions. Lastly, the third case study aims to develop multi-observational techniques to reduce dataset noise using a dataset of acoustic recordings collected in the Santa Barbara Channel. In each case, the core technical goal is the same: to train a convolutional neural network-based system to learn a robust feature representation that generalizes to unforeseen environmental conditions.
To achieve this goal, techniques from the fields of hard negative mining, unsupervised domain adaptation, and multi-view learning are integrated into the workflows. Ultimately, my overarching objective is to drive advancements in the development of robust oceanographic data automation tools.
