eScholarship
Open Access Publications from the University of California

Rate Allocation in Distributed Stream Processing Systems

  • Author(s): Drougas, Ioannis
  • Advisor(s): Kalogeraki, Vana
  • et al.
Abstract

Stream processing systems have become increasingly important as applications such as media broadcasting, sensor network monitoring and on-line data analysis rely on real-time stream processing. At the same time, service overlays that support distributed stream processing applications are increasingly being deployed in wide-area environments. The inherently heterogeneous, dynamic and large-scale nature of these systems makes it difficult to meet the Quality of Service (QoS) requirements of distributed stream processing applications, which has motivated the investigation of mechanisms that improve their scalability, efficiency and performance. In the first part of this work, we consider the problem of composing stream processing applications in a distributed stream processing system. First, we propose a distributed stream processing system that composes stream processing applications dynamically while meeting their rate demands. Second, we address the load balancing problem for distributed stream processing applications and present a decentralized and adaptive algorithm that composes such applications on the fly across a large-scale system while satisfying their QoS demands. The algorithm distributes the load fairly across the resources and adapts dynamically to changes in resource utilization or in the QoS requirements of the applications.

In the second part, we present BARRE (Burst Accommodation through Rate REconfiguration), a method to address the problem of burst accommodation in a distributed stream processing system. Upon the emergence of a burst, BARRE dynamically reserves the resources dispersed across the nodes of a distributed stream processing system, based on the requirements of each application as well as the resources available on each node. Our experimental results over a real distributed stream processing testbed demonstrate the efficiency of our approach.
