Scholarly Works (6 results)

Thesis
Peer Reviewed

Deep Anomaly Detection and Distribution Shifts

Li, Aodong
Advisor(s): Mandt, Stephan SM

UC Irvine Electronic Theses and Dissertations (2024)

Anomaly detection is important in various applications, from cyber-security, transportation, industry, and finance to healthcare. The anomaly detection problem is to identify anomalies that originate from a data-generating process different from that of normal data. The rare occurrence of anomalies and their unknown causes make it hard to collect and model them. Thus, anomaly detection methods utilize normal data to build anomaly detectors. In this dissertation, we apply deep anomaly detection methods (methods that apply deep learning techniques) to solve anomaly detection problems. We contribute multiple generic frameworks for various anomaly detection setups.

First, we challenge the common clean training data assumption (free of anomalies) and stress that practical training data is often contaminated with unnoticed anomalies. We propose a novel unsupervised training strategy for training an anomaly detector in the presence of unlabeled anomalies that is compatible with a broad class of models.

Second, selecting informative data points for expert feedback can significantly improve anomaly detection performance. The critical challenges are selecting the most informative samples for expert review and effectively incorporating their feedback to bolster anomaly detection capabilities. To address these challenges, we propose a new data labeling strategy and a new learning framework for active and semi-supervised anomaly detection.

Third, real-world applications may face distribution shifts. We consider the online learning problem where the shifts occur at unknown positions and with unknown intensities. We derive a new Bayesian online inference approach to automatically infer these distribution shifts and adapt the model to the detected changes. This approach applies to both supervised and unsupervised learning settings. We also consider the problem of adapting an anomaly detector to drift in the normal data distribution, especially when no training data is available for the “new normal.” This setting is called zero-shot anomaly detection. We propose a simple yet effective method that combines batch normalization and meta-training for zero-shot anomaly detection.

The learning frameworks introduced in this dissertation are model-agnostic and apply to various data types. Extensive experiments demonstrate the efficacy of our proposed approaches.

Creative Commons 'BY' version 4.0 license

Thesis
Peer Reviewed

On the Efficient Marginalization of Probabilistic Sequence Models

UC Irvine Electronic Theses and Dissertations (2024)

Real-world data often exhibits sequential dependence across diverse domains such as human behavior, medicine, finance, and climate modeling. Probabilistic methods capture the inherent uncertainty associated with prediction in these contexts, with autoregressive models being especially prominent. This dissertation focuses on using autoregressive models to answer complex probabilistic queries that go beyond single-step prediction, such as the timing of future events or the likelihood of a specific event occurring before another. In particular, we develop a broad class of novel and efficient approximation techniques for marginalization in sequential models that are model-agnostic. These techniques rely solely on access to and sampling from next-step conditional distributions of a pre-trained autoregressive model, including both traditional parametric models and more recent neural autoregressive models. Specific approaches are presented for discrete sequential models, for marked temporal point processes, and for stochastic jump processes, each tailored to a well-defined class of informative, long-range probabilistic queries.
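To make the kind of query in this abstract concrete, the following is a minimal sketch of the naive Monte Carlo baseline: it estimates the probability that one event occurs before another using only samples from a model's next-step conditional distribution. It is not the dissertation's method, which the abstract describes as more efficient approximation techniques. The function names, the toy three-symbol model, and its probabilities are all assumptions made for illustration.

```python
import random


def estimate_hit_before(next_step_sampler, start, target_a, target_b,
                        horizon, num_samples=10000, seed=0):
    """Monte Carlo estimate of P(target_a occurs before target_b within `horizon` steps).

    `next_step_sampler(history, rng)` is a hypothetical interface: it draws the
    next symbol given the sequence so far, which is the only model access assumed.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        history = [start]
        for _ in range(horizon):
            symbol = next_step_sampler(history, rng)
            history.append(symbol)
            if symbol == target_a:
                hits += 1
                break
            if symbol == target_b:
                break
    return hits / num_samples


# Toy stand-in for a pre-trained autoregressive model (assumed values):
# from state "s" the next symbol is "s", "a", or "b" with fixed probabilities.
TRANSITIONS = {"s": (("s", "a", "b"), (0.5, 0.3, 0.2))}


def toy_sampler(history, rng):
    symbols, weights = TRANSITIONS[history[-1]]
    return rng.choices(symbols, weights=weights, k=1)[0]


estimate = estimate_hit_before(toy_sampler, "s", "a", "b", horizon=50)
print(f"estimated P(a before b) = {estimate:.3f}")
```

For this toy model the exact answer is 0.3 / (0.3 + 0.2) = 0.6 up to a negligible truncation at the horizon, so the estimate can be checked directly. The cost of this baseline grows with the number of sampled trajectories, which is the inefficiency that motivates better approximations.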

Article
Peer Reviewed

Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes.

UC Irvine Previously Published Works (2023)

We present a fully Bayesian autoencoder model that treats both local latent variables and global decoder parameters in a Bayesian fashion. This approach allows for flexible priors and posterior approximations while keeping the inference costs low. To achieve this, we introduce an amortized MCMC approach by utilizing an implicit stochastic network to learn sampling from the posterior over local latent variables. Furthermore, we extend the model by incorporating a Sparse Gaussian Process prior over the latent space, allowing for a fully Bayesian treatment of inducing points and kernel hyperparameters and leading to improved scalability. Additionally, we enable Deep Gaussian Process priors on the latent space and the handling of missing data. We evaluate our model on a range of experiments focusing on dynamic representation learning and generative modeling, demonstrating the strong performance of our approach in comparison to existing methods that combine Gaussian Processes and autoencoders.

Article
Peer Reviewed

Learning to simulate high energy particle collisions from unlabeled data

UC Irvine Previously Published Works (2022)

In many scientific fields that rely on statistical inference, simulations are often used to map from theoretical models to experimental data, allowing scientists to test model predictions against experimental results. Experimental data is often reconstructed from indirect measurements, causing the aggregate transformation from theoretical models to experimental data to be poorly described analytically. Instead, numerical simulations are used at great computational cost. We introduce Optimal-Transport-based Unfolding and Simulation (OTUS), a fast simulator based on unsupervised machine learning that is capable of predicting experimental data from theoretical models. Without the aid of current simulation information, OTUS trains a probabilistic autoencoder to transform directly between theoretical models and experimental data. Identifying the probabilistic autoencoder's latent space with the space of theoretical models causes the decoder network to become a fast, predictive simulator with the potential to replace current, computationally costly simulators. Here, we provide proof-of-principle results on two particle physics examples, Z-boson and top-quark decays, but stress that OTUS can be widely applied to other fields.

Article
Peer Reviewed

Generative Modeling of Atmospheric Convection

UC Irvine Previously Published Works (2020)

While cloud-resolving models can explicitly simulate the details of small-scale storm formation and morphology, these details are often ignored by climate models for lack of computational resources. Here, we explore the potential of generative modeling to cheaply recreate small-scale storms by designing and implementing a Variational Autoencoder (VAE) that performs structural replication, dimensionality reduction, and clustering of high-resolution vertical velocity fields. Trained on ∼6 · 10⁶ samples spanning the globe, the VAE successfully reconstructs the spatial structure of convection, performs unsupervised clustering of convective organization regimes, and identifies anomalous storm activity, confirming the potential of generative modeling to power stochastic parameterizations of convection in climate models.

Article
Peer Reviewed

Comparing storm resolving models and climates via unsupervised machine learning

UC Irvine Previously Published Works (2023)

Global storm-resolving models (GSRMs) have gained widespread interest because of the unprecedented detail with which they resolve the global climate. However, it remains difficult to quantify objective differences in how GSRMs resolve complex atmospheric formations. This lack of comprehensive tools for comparing model similarities is a problem in many disparate fields that involve simulation tools for complex data. To address this challenge, we develop methods to estimate distributional distances based on both nonlinear dimensionality reduction and vector quantization. Our approach automatically learns physically meaningful notions of similarity from low-dimensional latent data representations that the different models produce. This enables an intercomparison of nine GSRMs based on their high-dimensional simulation data (2D vertical velocity snapshots) and reveals that only six are similar in their representation of atmospheric dynamics. Furthermore, we uncover signatures of the convective response to global warming in a fully unsupervised way. Our study provides a path toward evaluating future high-resolution simulation data more objectively.
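As a rough illustration of a vector-quantization-based distributional distance, the sketch below fits a shared codebook to pooled samples, turns each data set into a histogram over codebook entries, and compares the histograms. The choices here are assumptions rather than the paper's pipeline: plain k-means stands in for the quantizer, Jensen-Shannon divergence for the distance, and synthetic 2D points for latent representations. All function names are hypothetical.

```python
import math
import random


def nearest(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], x)))


def fit_codebook(points, num_codes, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) used as a simple vector quantizer."""
    rng = random.Random(seed)
    codebook = [list(p) for p in rng.sample(points, num_codes)]
    for _ in range(iterations):
        groups = [[] for _ in range(num_codes)]
        for p in points:
            groups[nearest(codebook, p)].append(p)
        for k, group in enumerate(groups):
            if group:  # keep the old centre if a cluster ends up empty
                codebook[k] = [sum(c) / len(group) for c in zip(*group)]
    return codebook


def code_histogram(points, codebook):
    """Fraction of points assigned to each codebook entry."""
    counts = [0] * len(codebook)
    for p in points:
        counts[nearest(codebook, p)] += 1
    return [c / len(points) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def gaussian_cloud(rng, centre, n):
    """Synthetic 2D stand-in for one model's latent representations."""
    return [(rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)) for _ in range(n)]


rng = random.Random(1)
model_a = gaussian_cloud(rng, 0.0, 300)
model_b = gaussian_cloud(rng, 0.0, 300)  # same distribution as model_a
model_c = gaussian_cloud(rng, 3.0, 300)  # shifted distribution

codebook = fit_codebook(model_a + model_b + model_c, num_codes=8)
hist_a, hist_b, hist_c = (code_histogram(m, codebook)
                          for m in (model_a, model_b, model_c))
dist_ab = js_divergence(hist_a, hist_b)
dist_ac = js_divergence(hist_a, hist_c)
print(f"JS(a, b) = {dist_ab:.3f}, JS(a, c) = {dist_ac:.3f}")
```

Sharing one codebook across all data sets keeps the histograms directly comparable. The two clouds drawn from the same distribution should come out much closer to each other than to the shifted one.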