eScholarship
Open Access Publications from the University of California

Similarity-based compression with multidimensional pattern matching

Published Web Location

https://sdm.lbl.gov/oapapers/snta19-delguercio-final.pdf
Abstract

Sensors typically record measurements with more precision than the accuracy of the sensing technique supports. As a result, experimental and observational data often contain noise that appears random and cannot be easily compressed. This noise increases both storage requirements and the computation time of analyses. In this work, we describe a line of research on data reduction techniques that preserve the key features of the data while reducing storage requirements. Our core observation is that the noise in such cases can be characterized by a small number of patterns based on statistical similarity. In earlier tests, this approach reduced the storage requirement by over 100-fold for one-dimensional sequences. Here, we explore a set of different similarity measures for multidimensional sequences. In tests using standard quality measures such as peak signal-to-noise ratio (PSNR), the new compression methods reduce storage requirements by over 100-fold while keeping reconstruction errors, as measured by PSNR, relatively low. We therefore believe that this is an effective strategy for constructing data reduction techniques.
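The abstract names PSNR as its quality measure but does not say how it was computed. As a minimal sketch, the standard definition is PSNR = 10 log10(peak^2 / MSE), where MSE is the mean squared error between the original and reconstructed values. The function name `psnr`, the flattened-list input, and the default choice of `peak` (the value range of the original data) are assumptions made for illustration, not details taken from the paper.

```python
import math


def psnr(original, reconstructed, peak=None):
    """Return the PSNR in dB between two equal-length sequences of numbers.

    Multidimensional data is assumed to be flattened by the caller.
    """
    if len(original) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    if not original:
        raise ValueError("sequences must be non-empty")

    # Assumption: with no explicit peak, use the value range of the original.
    if peak is None:
        peak = max(original) - min(original)
    if peak <= 0:
        raise ValueError("peak must be positive")

    # Mean squared error between the original and its reconstruction.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical sequences: no reconstruction error

    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher PSNR means the reconstruction is closer to the original, so a lossy compressor is typically judged by how much PSNR it retains at a given compression ratio.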
