Skip to main content
eScholarship
Open Access Publications from the University of California

UC Davis

UC Davis Previously Published Works bannerUC Davis

Exploring predictive states via Cantor embeddings and Wasserstein distance

Published Web Location

https://doi.org/10.1063/5.0102603Creative Commons 'BY-NC-ND' version 4.0 license
Abstract

Predictive states for stochastic processes are a nonparametric and interpretable construct with relevance across a multitude of modeling paradigms. Recent progress on the self-supervised reconstruction of predictive states from time-series data focused on the use of reproducing kernel Hilbert spaces. Here, we examine how Wasserstein distances may be used to detect predictive equivalences in symbolic data. We compute Wasserstein distances between distributions over sequences ("predictions") using a finite-dimensional embedding of sequences based on the Cantor set for the underlying geometry. We show that exploratory data analysis using the resulting geometry via hierarchical clustering and dimension reduction provides insight into the temporal structure of processes ranging from the relatively simple (e.g., generated by finite-state hidden Markov models) to the very complex (e.g., generated by infinite-state indexed grammars).

Many UC-authored scholarly publications are freely available on this site because of the UC's open access policies. Let us know how this access is important for you.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View