Novel Primitive for Analyzing Massive Time Series Data Set
- Author(s): Imani, Shima
- Advisor(s): Keogh, Eamonn, et al.
There has been huge progress in the time series domain. Every day, a large volume of time series data is gathered by monitoring industrial processes, tracking human behavior, recording human health data, and so on. To analyze and extract useful information from time series data, we can use techniques such as motif discovery, summarization, searching, and clustering. Using these techniques, we can quickly extract high-quality information from volumes of data that a human could, even in principle, examine only over many years.
In this thesis, we demonstrate all these techniques and discuss the related work and our contributions. For example, query-based similarity search is a useful exploratory method that has been used in many areas, such as music, economics, and biology, to find common patterns and behaviors. Existing query-based search methods allow users to search large time series collections. However, these methods are not very robust, and they often fail to find similar patterns. We present a natural language search framework for finding similar patterns in time series, which lets users search a time series more directly and intuitively. Another technique that we demonstrate is summarization, or finding representative patterns. Summarization shortens the data to a summary that retains only its main points. This technique is useful for edge computing, where we can analyze and store summaries instead of the whole time series. We demonstrate the utility of our ideas in domains as diverse as animal behavior, entomology, human activity monitoring, electrical power-demand monitoring, and medicine.