Autopipelining for data stream processing

dc.citation.epage: 2354 (en_US)
dc.citation.issueNumber: 12 (en_US)
dc.citation.spage: 2344 (en_US)
dc.citation.volumeNumber: 24 (en_US)
dc.contributor.author: Tang, Y. (en_US)
dc.contributor.author: Gedik, B. (en_US)
dc.date.accessioned: 2016-02-08T09:33:43Z
dc.date.available: 2016-02-08T09:33:43Z
dc.date.issued: 2013 (en_US)
dc.department: Department of Computer Engineering (en_US)
dc.description.abstract: Stream processing applications use online analytics to ingest high-rate data sources, process them on-the-fly, and generate live results in a timely manner. The data flow graph representation of these applications facilitates the specification of stream computing tasks with ease, and also lends itself to possible runtime exploitation of parallelization on multicore processors. While the data flow graphs naturally contain a rich set of parallelization opportunities, exploiting them is challenging due to the combinatorial number of possible configurations. Furthermore, the best configuration is dynamic in nature; it can differ across multiple runs of the application, and even during different phases of the same run. In this paper, we propose an autopipelining solution that can take advantage of multicore processors to improve throughput of streaming applications, in an effective and transparent way. The solution is effective in the sense that it provides good utilization of resources by dynamically finding and exploiting sources of pipeline parallelism in streaming applications. It is transparent in the sense that it does not require any hints from the application developers. As a part of our solution, we describe a light-weight runtime profiling scheme to learn resource usage of operators comprising the application, an optimization algorithm to locate best places in the data flow graph to explore additional parallelism, and an adaptive control scheme to find the right level of parallelism. We have implemented our solution in an industrial-strength stream processing system. Our experimental evaluation based on microbenchmarks, synthetic workloads, as well as real-world applications confirms that our design is effective in optimizing the throughput of stream processing applications without requiring any changes to the application code. © 1990-2012 IEEE. (en_US)
dc.description.provenance: Made available in DSpace on 2016-02-08T09:33:43Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2013 (en)
dc.identifier.doi: 10.1109/TPDS.2012.333 (en_US)
dc.identifier.issn: 1045-9219 (en_US)
dc.identifier.uri: http://hdl.handle.net/11693/20712 (en_US)
dc.language.iso: English (en_US)
dc.publisher: Institute of Electrical and Electronics Engineers (en_US)
dc.relation.isversionof: http://dx.doi.org/10.1109/TPDS.2012.333 (en_US)
dc.source.title: IEEE Transactions on Parallel and Distributed Systems (en_US)
dc.subject: Autopipelining (en_US)
dc.subject: Parallelization (en_US)
dc.subject: Stream processing (en_US)
dc.subject: Adaptive control schemes (en_US)
dc.subject: Experimental evaluation (en_US)
dc.subject: Optimization algorithms (en_US)
dc.title: Autopipelining for data stream processing (en_US)
dc.type: Article (en_US)

Files

Original bundle

Name: Autopipelining for data stream processing.pdf
Size: 878.29 KB
Format: Adobe Portable Document Format
Description: Full printable version