Auto-parallelizing stateful distributed streaming applications
dc.citation.epage | 63 | en_US |
dc.citation.spage | 53 | en_US |
dc.contributor.author | Schneider, S. | en_US |
dc.contributor.author | Hirzel, M. | en_US |
dc.contributor.author | Gedik, Buğra | en_US |
dc.contributor.author | Wu, K. -L. | en_US |
dc.coverage.spatial | Minneapolis, Minnesota, USA | en_US |
dc.date.accessioned | 2016-02-08T12:12:41Z | |
dc.date.available | 2016-02-08T12:12:41Z | |
dc.date.issued | 2012 | en_US |
dc.department | Department of Computer Engineering | en_US |
dc.description | Date of Conference: 19 - 23 September 2012 | en_US |
dc.description.abstract | Streaming applications transform possibly infinite streams of data and often have both high throughput and low latency requirements. They are comprised of operator graphs that produce and consume data tuples. The streaming programming model naturally exposes task and pipeline parallelism, enabling it to exploit parallel systems of all kinds, including large clusters. However, it does not naturally expose data parallelism, which must instead be extracted from streaming applications. This paper presents a compiler and runtime system that automatically extract data parallelism for distributed stream processing. Our approach guarantees safety, even in the presence of stateful, selective, and user-defined operators. When constructing parallel regions, the compiler ensures safety by considering an operator's selectivity, state, partitioning, and dependencies on other operators in the graph. The distributed runtime system ensures that tuples always exit parallel regions in the same order they would without data parallelism, using the most efficient strategy as identified by the compiler. Our experiments using 100 cores across 14 machines show linear scalability for standard parallel regions, and near linear scalability when tuples are shuffled across parallel regions. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM). | en_US |
dc.description.provenance | Made available in DSpace on 2016-02-08T12:12:41Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2012 | en |
dc.identifier.doi | 10.1145/2370816.2370826 | en_US |
dc.identifier.isbn | 978-1-4503-1182-3 | en_US |
dc.identifier.issn | 1089-795X | en_US |
dc.identifier.uri | http://hdl.handle.net/11693/28156 | en_US |
dc.language.iso | English | en_US |
dc.relation.isversionof | http://dx.doi.org/10.1145/2370816.2370826 | en_US |
dc.source.title | PACT '12 Proceedings of the 21st international conference on Parallel architectures and compilation techniques | en_US |
dc.subject | Automatic parallelization | en_US |
dc.subject | Distributed stream processing | en_US |
dc.subject | Auto-parallelizing | en_US |
dc.subject | Data parallelism | en_US |
dc.subject | Data tuples | en_US |
dc.subject | Distributed streaming | en_US |
dc.subject | Efficient strategy | en_US |
dc.subject | High throughput | en_US |
dc.subject | Large clusters | en_US |
dc.subject | Low latency | en_US |
dc.subject | Parallel system | en_US |
dc.subject | Programming models | en_US |
dc.subject | Runtime systems | en_US |
dc.subject | Stream processing | en_US |
dc.subject | Streaming applications | en_US |
dc.subject | Distributed parameter control systems | en_US |
dc.subject | Parallel architectures | en_US |
dc.subject | Program compilers | en_US |
dc.title | Auto-parallelizing stateful distributed streaming applications | en_US |
dc.type | Conference Paper | en_US |
Files
Original bundle
- Name: Auto-parallelizing stateful distributed streaming applications.pdf
- Size: 1.38 MB
- Format: Adobe Portable Document Format
- Description: Full printable version