Ultra-Fast, Accurate, and Practical Anomaly Detection Algorithms for Time Series Data
eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

No data is associated with this publication.
Creative Commons 'BY-NC-ND' version 4.0 license
Abstract

Time Series Anomaly Detection (TSAD) has emerged as an intensely active field in data mining because of its implications for numerous real-world scenarios. Although new solutions are proposed each year, empirical evidence indicates that the time series discord, a simple distance-based technique, remains among the state-of-the-art methods. However, conventional algorithms for computing time series discords have two limitations: they are restricted to the batch case, and they struggle to scale beyond tens of thousands of datapoints. To address these challenges, we introduce DAMP, a novel algorithm that computes exact left-discords on fast-arriving streams at up to 300,000 Hz on a commodity desktop, making it possible for the first time to find time series discords in datasets with trillions of datapoints.

Time series discords have one other notable issue: the anomalies discovered depend on the algorithm’s only input parameter, the subsequence length. To circumvent this limitation, we introduce MADRID, a Hyper-Anytime Algorithm engineered to efficiently solve the all-discords problem. By using a novel computation ordering strategy, MADRID can reduce the absolute time needed to compute all-discords and allows users to interact with their data in real time. We demonstrate the utility of MADRID in various domains and show that it allows us to find anomalies that would otherwise escape our attention.

In our ongoing endeavor to make TSAD practical and user-centric, we recognize that we cannot overlook the user’s perspective in defining an anomaly. We posit that, unless they accurately capture the user’s knowledge and requirements, TSAD algorithms are likely to suffer from an influx of false positives that hinders their adoption. Hence, we present FIRE, a versatile framework that encapsulates the user’s requirements and communicates them to the algorithms. FIRE’s flexibility allows it to be implemented across different domains and algorithms. As we will show, it can make anomaly detection faster, more accurate, and more useful. In summary, our proposed algorithms, DAMP, MADRID, and FIRE, perform well across diverse domains and provide a robust, versatile, and efficient solution to time series anomaly detection.
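
For readers unfamiliar with the term, the left-discord mentioned in the abstract can be illustrated with a brute-force sketch. This is not DAMP, which the abstract describes as far faster; it is only a quadratic-time baseline that shows what is being computed. It assumes the usual formulation in which a subsequence is scored by the z-normalized Euclidean distance to its nearest neighbor among earlier subsequences. The function names, the requirement that neighbors do not overlap the query, and the synthetic sine-wave data are all illustrative assumptions, not taken from the dissertation.

```python
import numpy as np

def znorm(x):
    # Z-normalize a subsequence; a constant subsequence maps to all zeros.
    sd = x.std()
    if sd < 1e-12:
        return np.zeros_like(x, dtype=float)
    return (x - x.mean()) / sd

def left_discord_brute_force(ts, m):
    """Return (start index, score) of the length-m subsequence whose nearest
    earlier, non-overlapping neighbor is farthest away (assumed definition)."""
    ts = np.asarray(ts, dtype=float)
    n_sub = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n_sub)])
    best_idx, best_score = -1, -np.inf
    # Start at m so every query has at least one non-overlapping left neighbor.
    for i in range(m, n_sub):
        # Only compare against subsequences that end at or before position i.
        dists = np.linalg.norm(subs[: i - m + 1] - subs[i], axis=1)
        score = dists.min()
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score

# Hypothetical data: a clean sine wave with a flattened segment at 600..619.
t = np.arange(1000)
ts = np.sin(2 * np.pi * t / 50)
ts[600:620] = 0.0
idx, score = left_discord_brute_force(ts, m=50)
print(idx, score)
```

Because each query looks only at data that arrived before it, the same scoring rule applies to a stream. Rerunning the sketch with different values of `m` shows how the reported anomaly depends on the subsequence length, which is the sensitivity the abstract attributes to discords.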

Main Content

This item is under embargo until July 24, 2024.