Locality-aware distributed state partitioning for stream processing systems

Date

2016-10

Editor(s)

Advisor

Gedik, Buğra

Supervisor

Co-Advisor

Co-Supervisor

Instructor

Source Title

Print ISSN

Electronic ISSN

Publisher

Bilkent University

Volume

Issue

Pages

Language

English

Journal Title

Journal ISSN

Volume Title

Series

Abstract

Today, there are many applications that deal with high-volume data streams. These distributed stream processing applications process data on-the-fly and provide real-time distributed computing for big data. Due to the volume of data they process, some of these applications make use of data parallel nodes. The state management for distributed nodes in these applications is an important task to handle, because of different use cases such as: dealing with node failures, checkpointing, data enrichment, and re-partitioning. Therefore, distributed stream processing applications need a state management mechanism. In this thesis, we present a locality-aware state management mechanism for distributed stream processing applications. The proposed mechanism provides a transparent locality-aware data partitioning and state management system for distributed stream processing applications. The mechanism partitions data while preserving locality and handles state transfer among nodes transparently, in order to adapt to potential changes in the partitioning. In addition to this, it provides operators with a high-performance state management facility that can tackle check-pointing scenarios. The idea is implemented as a pluggable library for the open-source, distributed stream-processing engine, Apache Storm.

Course

Other identifiers

Book Title

Citation

item.page.isversionof