Locality-aware distributed state partitioning for stream processing systems

Date

2016-10

Advisor

Gedik, Buğra

Abstract

Many applications today deal with high-volume data streams. These distributed stream processing applications process data on the fly and provide real-time distributed computing for big data. Because of the volume of data they process, some of these applications rely on data-parallel nodes. Managing the state of these distributed nodes is an important task, since it underpins use cases such as handling node failures, checkpointing, data enrichment, and re-partitioning. Distributed stream processing applications therefore need a state management mechanism. In this thesis, we present a transparent, locality-aware data partitioning and state management mechanism for distributed stream processing applications. The mechanism partitions data while preserving locality and transparently handles state transfer among nodes in order to adapt to changes in the partitioning. In addition, it provides operators with a high-performance state management facility that can handle checkpointing scenarios. The idea is implemented as a pluggable library for Apache Storm, an open-source distributed stream processing engine.

Degree Discipline

Computer Engineering

Degree Level

Master's

Degree Name

MS (Master of Science)

Language

English

Type