CAPSULE: Language and system support for efficient state sharing in distributed stream processing systems
Author
Losa, G.
Kumar, V.
Andrade, H.
Gedik, Buğra
Hirzel, M.
Soulé, R.
Wu, K. -L.
Date
2012
Source Title
DEBS '12 Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems
Publisher
ACM
Pages
268 - 277
Language
English
Type
Conference Paper
Abstract
Data stream processing applications are often expressed as data flow graphs, composed of operators connected via streams. This structured representation provides a simple yet powerful paradigm for building large-scale, distributed, high-performance applications. However, many tasks require sharing data across operators, and between operators and the runtime, using a less structured mechanism than point-to-point data flows. Examples include updating control variables, sending notifications, collecting metrics, and building collective models. In this paper we describe CAPSULE, which fills this gap. CAPSULE is a code generation and runtime framework that lets developers realize shared variables (CAPSULE's term for shared state) easily and flexibly by specifying a data structure (at the programming-language level) and a few associated configuration parameters that qualify the expected usage scenario. Besides ease of use and flexibility, CAPSULE offers the following important benefits: (1) Custom Code Generation: CAPSULE uses user-specified configuration parameters and information from the runtime to generate shared variable servers that are tailored to the specific usage scenario; (2) Composability: CAPSULE supports deployment-time composition of the shared variable servers to achieve desired levels of scalability, performance, and fault tolerance; and (3) Extensibility: CAPSULE provides simple interfaces for extending the framework with more protocols, transports, caching mechanisms, etc. We describe the motivation for CAPSULE and its design, report on its implementation status, and then present experimental results. Copyright © 2012 ACM.
Keywords
Consistency models
Distributed shared state
Stream processing
Caching mechanism
Code Generation
Composability
Configuration parameters
Consistency model
Control variable
Data flow
Data stream processing
Deployment time
Flexible framework
High performance applications
Runtimes
Shared variables
Stream processing systems
System supports
Usage scenarios
Data flow analysis
Data flow graphs
Data structures
Fault tolerance
Network components
Software architecture
Distributed parameter control systems
Permalink
http://hdl.handle.net/11693/28172
Published Version (Please cite this version)
http://dx.doi.org/10.1145/2335484.2335514