Many problems in mathematical physics and engineering involve solving linear systems Ax = b which are highly structured. These structured matrices, which typically arise from discretizations of partial differential or integral equations, can be represented compactly through specific algebraic representations. Two such algebraic representations which fall into this category are the fast multipole method (FMM), originally introduced by Greengard and Rokhlin, and Hierarchically Semi-Separable (HSS) . These representations, which exploit the low-rank structure of off-diagonal blocks, have enabled fast solvers (linear time in certain scenarios) and are commonly used practice today.
In this thesis, we provide new advancements which further enhance the performance of these algorithms. The contributions of this thesis are twofold: (i) we present a new fast memory efficient construction algorithm for Hierarchically Semi-Separable (HSS) representations, and (ii) we present a fast matrix-matrix multiply for the fast multipole method (FMM).
The HSS representation takes advantage of the fact that off-diagonal blocks are known to have low rank in order to yield fast solvers. The memory consumption of the HSS representation itself is O(n) if the rank of the off-diagonal blocks is small. If the user is not required to store the matrix A, but instead only provides a functional interface in order to access the elements of the matrix, it is worthwhile to ask for the algorithm which computes the HSS representation to be memory efficient as well. Previous algorithms, have shown the HSS representation can be computed in O(n^2) flops. Randomized algorithms also exist. However, the memory requirements of these algorithms can be excessive, requiring as much as O(n^2) peak workspace memory. We deal with this issue and present an algorithm that requires O(n^{1.5}) peak workspace memory in the worst case, while still requiring only O(n^2) flops.
The HSS Representation assumes off-diagonal blocks which have low rank, but in practice there are many cases for which this criteria is not satisfied, and in fact can be as much as O(√n). For this reason, there is much interest in FMM, as it relaxes this requirement, only demanding that off-diagonal blocks corresponding to well-separated clusters have low rank. However, the structure of the FMM inverse is not known. To better understand this problem, we consider the problem of computing a 1D FMM representation of the matrix-matrix product of two 1D FMM matrices. We show that the product of two standard (3 pt) 1D FMM matrices possesses a slightly modified 5 pt 1D FMM structure, and we provide a linear time (O(n) flop) algorithm for computing this product. Further, this work suggests that the inverse of an FMM matrix is not itself FMM.