IBM Streams Processing Language: Analyzing Big Data in motion

Date
2013-05-17
Authors
Hirzel, M.
Andrade, H.
Gedik, B.
Jacques-Silva, R.
Khandekar, R.
Kumar, V.
Mendell, M.
Nasgaard, H.
Schneider, S.
Soulé, R.
Source Title
IBM Journal of Research and Development
Print ISSN
0018-8646
Electronic ISSN
0018-8646
Publisher
IBM Corp.
Volume
57
Issue
3-4
Pages
7:1 - 7:11
Language
English
Abstract

The IBM Streams Processing Language (SPL) is the programming language for IBM InfoSphere® Streams, a platform for analyzing Big Data in motion. By “Big Data in motion,” we mean continuous data streams at high data-transfer rates. InfoSphere Streams processes such data with both high throughput and short response times. To meet these performance demands, it deploys each application on a cluster of commodity servers. SPL abstracts away the complexity of the distributed system, instead exposing a simple graph-of-operators view to the user. SPL has several innovations relative to prior streaming languages. For performance and code reuse, SPL provides a code-generation interface to C++ and Java®. To facilitate writing well-structured and concise applications, SPL provides higher-order composite operators that modularize stream sub-graphs. Finally, to enable static checking while exposing optimization opportunities, SPL provides a strong type system and user-defined operator models. This paper provides a language overview, describes the implementation including optimizations such as fusion, and explains the rationale behind the language design.
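
To make the graph-of-operators view, composite operators, and fusion more concrete, here is a minimal Python sketch. It is not SPL and not InfoSphere Streams code: it only models operators as functions over streams of tuples, a higher-order composite as a sub-graph parameterized by another operator, and fusion as chaining operators by direct calls inside one process. All names (`source`, `filter_op`, `map_op`, `fuse`, `threshold_alert`, `label`) and the sample data are hypothetical.

```python
from typing import Callable, Iterable, Iterator

# A stream is modeled as an iterator of tuples (plain dicts here).
Stream = Iterator[dict]
# An operator consumes one stream and produces another.
Operator = Callable[[Stream], Stream]


def source(readings: Iterable[float]) -> Stream:
    """Hypothetical source operator: emits one tuple per reading."""
    for seq, value in enumerate(readings):
        yield {"seq": seq, "value": value}


def filter_op(predicate: Callable[[dict], bool]) -> Operator:
    """Build an operator that forwards only tuples satisfying the predicate."""
    def run(stream: Stream) -> Stream:
        return (t for t in stream if predicate(t))
    return run


def map_op(fn: Callable[[dict], dict]) -> Operator:
    """Build an operator that applies fn to every tuple."""
    def run(stream: Stream) -> Stream:
        return (fn(t) for t in stream)
    return run


def fuse(*operators: Operator) -> Operator:
    """Chain operators so tuples pass between them by direct calls.

    This stands in for the idea of fusion: adjacent operators share one
    process instead of exchanging tuples over a network connection.
    """
    def run(stream: Stream) -> Stream:
        for op in operators:
            stream = op(stream)
        return stream
    return run


def threshold_alert(limit: float, transform: Operator) -> Operator:
    """Higher-order composite: a reusable sub-graph that takes another
    operator (transform) as a parameter."""
    return fuse(filter_op(lambda t: t["value"] > limit), transform)


def label(t: dict) -> dict:
    """Mark a tuple as an alert without mutating the input tuple."""
    return {**t, "alert": True}


pipeline = threshold_alert(10.0, map_op(label))
result = list(pipeline(source([3.0, 12.5, 9.9, 20.0])))
print(result)
```

In this sketch the caller only describes which operators are connected; whether they run fused in one process or spread across machines would be a deployment decision, which is the separation of concerns the abstract attributes to SPL.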
