Skip to main content
eScholarship
Open Access Publications from the University of California

UC Irvine

UC Irvine Electronic Theses and Dissertations bannerUC Irvine

Low-Latency MapReduce

Creative Commons 'BY' version 4.0 license
Abstract

We investigate the problem of MapReduce and coded MapReduce. MapReduce is a programming model for the website search engine. It has three phases: Map, Shuffle, and Reduce. Even though it is an extensively used tool for distributed computing, the communication time involved in the shuffle phase can be a bottleneck for the overall system delay. Coded MapReduce was recently proposed that trades replicated computation for shorter communication. In this thesis, we formally define the model and algorithm of MapReduce and coded MapReduce. Moreover, we implement them with identical Map and Reduce functions. The communication time and the overall system time is analyzed theoretically and measured experimentally. Finally, as an example, reverse indexing is implemented under MapReduce and coded MapReduce framework. We analyze and measure the delay performance for this case as well. We find that when each Map computation task is replicated twice, the coded MapReduce scheme is almost twice as fast as the naive MapReduce scheme.

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View