eScholarship
Open Access Publications from the University of California

UC Davis

UC Davis Electronic Theses and Dissertations

A Visual Analytics Exploratory and Predictive Framework for Anomaly Detection in Multi-fidelity Machine Log Data

Abstract

Maintaining robust and reliable computing systems, especially those that enable breakthrough work in computational science and engineering research, is a critical and challenging task. The goal of this dissertation is to build, within a visual analytics framework, both an exploratory mechanism for identifying patterns in historical data and a predictive tool for identifying the state of large-scale systems. Toward this goal, the work processes various system logs, such as error logs, job logs, syslogs, console logs, and environment logs. To process environment log data that captures time-dependent phenomena, the work applies functional data analysis (FDA), which makes it possible to study sensitivity to change while preserving the ordering of the data. The visual analytics approach developed in this dissertation helps users simultaneously monitor and review changing time-series data by using new incremental and progressive FDA algorithms that promptly generate results for streaming time-series data, thus addressing the computational cost problems prevalent in FDA. A scalable visual analytics tool, MELA, identifies patterns and gleans insights from these diverse logs to effectively characterize system behavior and faults over time. A visual analytics machine learning pipeline promptly predicts a user application’s exit status and potential errors. The dissertation also introduces a visual analytics solution for data exploration at varying temporal and spatial resolutions by extracting ranges of frequencies from environment/hardware logs using multiresolution dynamic mode decomposition (mrDMD). These frequencies are extracted at multiple resolutions in time and analyzed at each resolution, allowing for coarse-grained (over years) to fine-grained (over hours or minutes) analysis of the time-series data.
This dissertation thus introduces faster, scalable, and interactive visual analytics solutions utilizing multiscale, exploratory, predictive, and multiresolution analyses of diverse large-scale system logs, bridging the gap between visual analytics and machine log analysis.
