 Main
Structured Tensors and the Geometry of Data
 Author(s): Seigal, Anna Leah
 Advisor(s): Sturmfels, Bernd
 et al.
Abstract
We analyze data to build a quantitative understanding of the world. Linear algebra is the foundation to algorithms, dating back one hundred years, for extracting structure from data. Modern technologies provide an abundance of multidimensional data, in which multiple variables or factors can be compared simultaneously. To organize and analyze such datasets we can use a tensor, the higher order analogue of a matrix. However, many theoretical and practical challenges arise in extending linear algebra to the setting of tensors.
In the first part of this thesis, I study and develop the algebraic theory of tensors. Tensors of low real rank, as well as singular vectors and singular values of tensors, parametrize semialgebraic sets, defined by polynomial equations and inequalities. I give exact algebraic characterizations of sets of tensors of interest, using real algebraic geometry, polyhedral geometry and computer algebra. I obtain a membership test for the set of real rank two tensors, I describe the variety of singular vectors of orthogonally decomposable tensors, and I obtain inequalities relating the singular values of the flattenings of a tensor. I show that rank and symmetric rank coincide for tensors of low symmetric rank, bounded by seven. I conclude by describing tensor hypernetworks, a flexible framework for decomposing and approximating tensors in applicationspecific contexts. Throughout these results, an indepth study of small tensors is used to set up the general theory. The theoretical results explain pitfalls of existing tensor algorithms, and also suggest new approaches for finding structure in a tensor.
In the second part of this thesis, I present three algorithms for tensor data. The algorithms use algebraic and geometric structure to give guarantees of optimality. Tensors have a close connection to multivariate distributions in statistics. I obtain the first nontrivial instance of an exact maximum likelihood estimate for a model with hidden variables. I give a numerical algorithm to recover paths from their third order signature tensors, an inverse problem from stochastic analysis. I also give an algorithm to cluster multidimensional data, with structured clusters encoded via algebraic constraints in a tensor. The structure facilitates the interpretation of clusters in a dataset of cancer cell lines.
Main Content
Enter the password to open this PDF file:













