      Auto-parallelizing stateful distributed streaming applications

      Author
      Schneider, S.
      Hirzel, M.
      Gedik, Buğra
      Wu, K. -L.
      Date
      2012
      Source Title
      PACT '12 Proceedings of the 21st international conference on Parallel architectures and compilation techniques
      Print ISSN
      1089-795X
      Pages
      53 - 63
      Language
      English
      Type
      Conference Paper
      Abstract
      Streaming applications transform possibly infinite streams of data and often have both high throughput and low latency requirements. They are comprised of operator graphs that produce and consume data tuples. The streaming programming model naturally exposes task and pipeline parallelism, enabling it to exploit parallel systems of all kinds, including large clusters. However, it does not naturally expose data parallelism, which must instead be extracted from streaming applications. This paper presents a compiler and runtime system that automatically extract data parallelism for distributed stream processing. Our approach guarantees safety, even in the presence of stateful, selective, and user-defined operators. When constructing parallel regions, the compiler ensures safety by considering an operator's selectivity, state, partitioning, and dependencies on other operators in the graph. The distributed runtime system ensures that tuples always exit parallel regions in the same order they would without data parallelism, using the most efficient strategy as identified by the compiler. Our experiments using 100 cores across 14 machines show linear scalability for standard parallel regions, and near linear scalability when tuples are shuffled across parallel regions. Copyright © 2012 by the Association for Computing Machinery, Inc. (ACM).
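
The ordering guarantee described in the abstract can be illustrated with a small sketch. This is not the paper's compiler or runtime; it is a single-process toy that assumes a partitioned-stateful operator with selectivity 1 (one output per input). The names `split`, `running_count`, `merge`, and `run_parallel_region` are hypothetical and chosen for illustration only. Tuples are tagged with a sequence number, routed to a channel by a hash of their partitioning key so that all state for a key lives in one channel, and then merged back by sequence number.

```python
import heapq
import zlib
from collections import defaultdict


def split(tuples, num_channels):
    """Tag each (key, value) tuple with a sequence number and route it by key."""
    channels = [[] for _ in range(num_channels)]
    for seq, (key, value) in enumerate(tuples):
        # Stable hash so the same key always lands on the same channel.
        idx = zlib.crc32(repr(key).encode()) % num_channels
        channels[idx].append((seq, (key, value)))
    return channels


def running_count(channel):
    """Toy stateful operator: emits how many times each key has been seen."""
    counts = defaultdict(int)  # per-channel state, partitioned by key
    out = []
    for seq, (key, value) in channel:
        counts[key] += 1
        out.append((seq, (key, value, counts[key])))
    return out


def merge(channel_outputs):
    """Restore the original order; each channel is already sorted by seq."""
    merged = heapq.merge(*channel_outputs, key=lambda pair: pair[0])
    return [tup for _, tup in merged]


def run_parallel_region(tuples, num_channels):
    """Split, apply the operator on every channel, then merge in order."""
    outputs = [running_count(ch) for ch in split(tuples, num_channels)]
    return merge(outputs)


data = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
sequential = run_parallel_region(data, 1)
parallel = run_parallel_region(data, 3)
```

Under these assumptions, `parallel` equals `sequential` for any channel count, because routing by key keeps each key's state in one place and the sequence numbers fix the exit order. Selective operators, which may drop or multiply tuples, need additional machinery that this sketch does not attempt to model.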
      Keywords
      Automatic parallelization
      Distributed stream processing
      Auto-parallelizing
      Data parallelism
      Data tuples
      Distributed streaming
      Efficient strategy
      High throughput
      Large clusters
      Low latency
      Parallel system
      Programming models
      Runtime systems
      Stream processing
      Streaming applications
      Distributed parameter control systems
      Parallel architectures
      Program compilers
      Permalink
      http://hdl.handle.net/11693/28156
      Published Version (Please cite this version)
      http://dx.doi.org/10.1145/2370816.2370826
      Collections
      • Department of Computer Engineering 1398
      Related items

      Showing items related by title, author, creator and subject.

      • Safe data parallelism for general streaming

        Schneider, S.; Hirzel, M.; Gedik, B.; Wu, Kun-Lung (Institute of Electrical and Electronics Engineers, 2015)
        Streaming applications process possibly infinite streams of data and often have both high throughput and low latency requirements. They are comprised of operator graphs that produce and consume data tuples. General streaming ...
      • Tutorial: Stream processing optimizations

        Schneider, S.; Hirzel, M.; Gedik, Buğra (ACM, 2013)
        This tutorial starts with a survey of optimizations for streaming applications. The survey is organized as a catalog that introduces uniform terminology and a common categorization of optimizations across disciplines, such ...
      • Elastic scaling for data stream processing

        Gedik, B.; Schneider, S.; Hirzel, M.; Wu, Kun-Lung (IEEE Computer Society, 2014)
        This article addresses the profitability problem associated with auto-parallelization of general-purpose distributed data stream processing applications. Auto-parallelization involves locating regions in the application's ...
