Big-data streaming applications scheduling based on staged multi-armed bandits
Date
2016
Source Title
IEEE Transactions on Computers
Print ISSN
0018-9340
Publisher
Institute of Electrical and Electronics Engineers
Volume
65
Issue
12
Pages
3591 - 3605
Language
English
Type
Article
Abstract
Several techniques have recently been proposed to adapt Big-Data streaming applications to existing many-core platforms. Among these, online reinforcement learning methods learn how to adapt, at run-time, the throughput and resources allocated to the various streaming tasks depending on dynamically changing data stream characteristics and the desired application performance (e.g., accuracy). However, most state-of-the-art techniques consider only a single input stream in their application model and assume that the system knows the amount of resources to allocate to each task to achieve a desired performance. To address these limitations, we propose a new systematic and efficient methodology and associated algorithms for online learning and energy-efficient scheduling of Big-Data streaming applications with multiple streams on many-core systems with resource constraints. We formalize the problem of multi-stream scheduling as a staged decision problem in which the performance obtained for various resource allocations is unknown. The proposed scheduling methodology uses a novel class of online adaptive learning techniques, which we refer to as staged multi-armed bandits (S-MAB). Our scheduler learns online, at run-time, which processing method to assign to each stream and how to allocate its resources over time in order to maximize performance, without access to any offline information. Applied to a face detection streaming application, and without using any offline information, the proposed scheduler achieves performance similar to that of an optimal semi-online solution that has full knowledge of the input stream: the differences in throughput, observed quality, resource usage, and energy efficiency are less than 1, 0.3, 0.2, and 4 percent, respectively.
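The staged decision idea in the abstract can be illustrated with a small sketch. This is not the paper's S-MAB algorithm: it swaps in plain UCB1 at each stage, ignores resource constraints and energy, and handles a single stream. The class names, the reward model, and the `quality` table are all hypothetical, chosen only to show how a scheduler might first pick a processing method and then a resource level while learning from observed rewards alone.

```python
import math
import random


class UCB1:
    """Plain UCB1 over a fixed set of arms (an assumption, not the paper's S-MAB)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        # Play every arm once before using confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # Incremental update of the empirical mean reward of the arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


class TwoStageScheduler:
    """Stage 1 picks a processing method; stage 2 picks a resource level for it."""

    def __init__(self, n_methods, n_levels):
        self.method_bandit = UCB1(n_methods)
        self.level_bandits = [UCB1(n_levels) for _ in range(n_methods)]

    def choose(self):
        method = self.method_bandit.select()
        level = self.level_bandits[method].select()
        return method, level

    def feedback(self, method, level, reward):
        # The same observed reward updates both stages.
        self.method_bandit.update(method, reward)
        self.level_bandits[method].update(level, reward)


def simulate(rounds=5000, seed=0):
    rng = random.Random(seed)
    # Hypothetical success probabilities, indexed as quality[method][level].
    quality = [[0.2, 0.3, 0.4], [0.3, 0.5, 0.9]]
    sched = TwoStageScheduler(n_methods=2, n_levels=3)
    for _ in range(rounds):
        method, level = sched.choose()
        reward = 1.0 if rng.random() < quality[method][level] else 0.0
        sched.feedback(method, level, reward)
    return sched
```

Calling `simulate()` returns the scheduler after 5000 simulated rounds; its `counts` lists show how often each method and resource level was tried. In this toy setting the scheduler starts with no knowledge of `quality` and concentrates its choices on the best method and level as evidence accumulates.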
Keywords
data mining
machine learning
many-core platforms
multiple streams processing
Scheduling
Artificial intelligence
Computer architecture
Data reduction
E-learning
Embedded systems
Energy efficiency
Face recognition
Learning systems
Online systems
Processing
Reinforcement learning
Concept drifts
Energy-Efficient Scheduling
Many core
Multiple streams
Reinforcement learning method
Resource Constraint
State-of-the-art techniques
Streaming applications
Big data
Permalink
http://hdl.handle.net/11693/36508
Published Version (Please cite this version)
http://dx.doi.org/10.1109/TC.2016.2550454
Collections
Related items
Showing items related by title, author, creator and subject.
-
Adaptive ensemble learning with confidence bounds for personalized diagnosis
Tekin, Cem; Yoon, J.; Van Der Schaar, M. (AAAI Press, 2016)
With the advances in the field of medical informatics, automated clinical decision support systems are becoming the de facto standard in personalized diagnosis. In order to establish high accuracy and confidence in ...
-
The effect of uncertainty on learning in game-like environments
Ozcelik, E.; Cagiltay, N. E.; Ozcelik, N. S. (Pergamon Press, 2013)
Considering the role of games for educational purposes, there has been an increase in interest among educators in applying strategies used in popular games to create more engaging learning environments. Learning is more fun and ...
-
Weakly supervised object localization with multi-fold multiple instance learning
Cinbis, R. G.; Verbeek, J.; Schmid, C. (IEEE Computer Society, 2017)
Object category localization is a challenging problem in computer vision. Standard supervised training requires bounding box annotations of object instances. This time-consuming annotation process is sidestepped in weakly ...