Ubiquitous availability of growing troves of interesting datasets warrants a rewrite of existing programs, for clusters or for out-of-core versions, to handle larger datasets. While DSM clusters deliver programmability and performance via shared-memory programming and tolerating latencies by prefetching and caching, copious disk space is far more readily available than managing clusters. Irregular applications, however, are challenging to parallelize because the input related data dependences that manifest at runtime require use of speculation for correct parallel execution. By speculating that there are no input related cross iteration dependences, it can be doall parallelized; the absence of dependences is validated before committing the computed results. Latency tolerating mechanisms like prefetching and caching which have traditionally benefited data-parallel applications, can hurt performance of speculation from reading stale values. This thesis seeks to address the task of size oblivious programming of irregular applications for large inputs on DSM clusters.
We first simplify the task of programming very large and irregular data, which may sometimes not fit in the DSM. To this end, we introduce a language, the associated runtime and a compiler for efficient distributed software speculation to parallelize irregular applications. The programming model consists of an easy to use, templated C++ library for writing new vertex-centric programs or the simple large and speculate constructs as extensions to the C/C++ language for existing programs.
In addition we introduce the InfiniMem random-access efficient object data format on disk to enable I/O of arbitrary data objects from disk in at most two logical disk seeks. We also present a simple API for I/O into this data format and the accompanying object-centric library framework to program size oblivious programs to support scale-up on individual machines as a first step. We then leverage InfiniMem on individual machines in the cluster to support distributed size oblivious programs. As a final stage to ease programming, we built an LLVM/Clang-based source-to-source compiler, SpeClang, which instruments the annotated source code with invocations into our libraries and runtime. The runtime comprises of an object-based caching DSM with explicit support for software speculation and transparently performs out-of-core computations using InfiniMem to handle large inputs.
Next, we address the inefficiencies that result from employing traditional latency tolerance mechanisms like prefetching and caching on DSM systems to support speculation: (a) we demonstrate the need for balancing computation and communication for efficient distributed software speculation and provide an adaptive, dynamic solution that leverages prefetch and commit queue lengths to drive this balance. (b) we demonstrate that aggressive caching can hurt speculation performance and present three techniques to decrease communication and cost of misspeculation check and speed up misspeculated recomputations by leveraging the DSM caching and speculation protocols to keep misspeculation rates low.