Show simple item record

dc.contributor.author | Gedik, B. | en_US
dc.contributor.author | Özsema, H. G. | en_US
dc.contributor.author | Öztürk, Ö. | en_US
dc.date.accessioned | 2018-04-12T10:54:29Z |
dc.date.available | 2018-04-12T10:54:29Z |
dc.date.issued | 2016 | en_US
dc.identifier.issn | 0743-7315 |
dc.identifier.uri | http://hdl.handle.net/11693/36816 |
dc.description.abstract | There is an ever-increasing rate of digital information available in the form of online data streams. In many application domains, high-throughput processing of such data is a critical requirement for keeping up with the soaring input rates. Data stream processing is a computational paradigm that aims at addressing this challenge by processing data streams in an on-the-fly manner, in contrast to the more traditional and less efficient store-and-then-process approach. In this paper, we study the problem of automatically parallelizing data stream processing applications in order to improve throughput. The parallelization is automatic in the sense that stream programs are written sequentially by the application developers and are parallelized by the system. We adopt the asynchronous data flow model for our work, which is typical in Data Stream Processing Systems (DSPS), where operators often have dynamic selectivity and are stateful. We solve the problem of pipelined fission, in which the original sequential program is parallelized by taking advantage of both pipeline parallelism and data parallelism at the same time. Our pipelined fission solution supports partitioned stateful data parallelism with dynamic selectivity and is designed for shared-memory multi-core machines. We first develop a cost-based formulation that enables us to express pipelined fission as an optimization problem. The brute-force solution of this problem takes a long time for moderately sized stream programs. Accordingly, we develop a heuristic algorithm that can quickly, but approximately, solve the pipelined fission problem. We provide an extensive evaluation studying the performance of our pipelined fission solution, including simulations as well as experiments with an industrial-strength DSPS. Our results show good scalability for applications that contain sufficient parallelism, as well as close to optimal performance for the heuristic pipelined fission algorithm. | en_US
dc.language.iso | English | en_US
dc.source.title | Journal of Parallel and Distributed Computing | en_US
dc.relation.isversionof | http://dx.doi.org/10.1016/j.jpdc.2016.05.003 | en_US
dc.subject | Auto-parallelization | en_US
dc.subject | Data stream processing | en_US
dc.subject | Fission | en_US
dc.subject | Pipelining | en_US
dc.subject | Application programs | en_US
dc.subject | Data communication systems | en_US
dc.subject | Data flow analysis | en_US
dc.subject | Heuristic algorithms | en_US
dc.subject | Optimization | en_US
dc.subject | Problem solving | en_US
dc.subject | Application developers | en_US
dc.subject | Computational paradigm | en_US
dc.subject | Optimization problems | en_US
dc.subject | Pipeline parallelisms | en_US
dc.subject | Sequential programs | en_US
dc.subject | Data handling | en_US
dc.title | Pipelined fission for stream programs with dynamic selectivity and partitioned state | en_US
dc.type | Article | en_US
dc.department | Department of Computer Engineering | en_US
dc.citation.spage | 106 | en_US
dc.citation.epage | 120 | en_US
dc.citation.volumeNumber | 96 | en_US
dc.identifier.doi | 10.1016/j.jpdc.2016.05.003 | en_US
dc.publisher | Academic Press | en_US

