Browsing by Subject "Contextual multi-armed bandits"

Open Access
Context-aware hierarchical online learning for performance maximization in mobile crowdsourcing
(Institute of Electrical and Electronics Engineers, 2018) Muller, S. K.; Tekin, Cem; Schaar, M.; Klein, A.
In mobile crowdsourcing (MCS), mobile users accomplish outsourced human intelligence tasks. MCS requires an appropriate task assignment strategy, since different workers may have different performance in terms of acceptance rate and quality. Task assignment is challenging, since a worker's performance 1) may fluctuate, depending on both the worker's current personal context and the task context and 2) is not known a priori, but has to be learned over time. Moreover, learning context-specific worker performance requires access to context information, which may not be available at a central entity due to communication overhead or privacy concerns. In addition, evaluating worker performance might require costly quality assessments. In this paper, we propose a context-aware hierarchical online learning algorithm addressing the problem of performance maximization in MCS. In our algorithm, a local controller (LC) in the mobile device of a worker regularly observes the worker's context, her/his decisions to accept or decline tasks and the quality in completing tasks. Based on these observations, the LC regularly estimates the worker's context-specific performance. The mobile crowdsourcing platform (MCSP) then selects workers based on performance estimates received from the LCs. This hierarchical approach enables the LCs to learn context-specific worker performance and it enables the MCSP to select suitable workers. In addition, our algorithm preserves worker context locally, and it keeps the number of required quality assessments low. We prove that our algorithm converges to the optimal task assignment strategy. Moreover, the algorithm outperforms simpler task assignment strategies in experiments based on synthetic and real data.
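The division of labor described in this abstract (local controllers estimate context-specific performance, the platform selects workers from the reported estimates) can be illustrated with a minimal sketch. It is not the authors' algorithm: the uniform context bins, the running-mean estimator, the scalar performance value in [0, 1], and the names `LocalController` and `select_workers` are all assumptions made for illustration. The exploration steps that a convergence guarantee would require are omitted.

```python
from collections import defaultdict


class LocalController:
    """Hypothetical local controller (LC) running on a worker's device.

    Raw context stays inside this object; only a scalar estimate is reported.
    """

    def __init__(self, num_bins=4):
        self.num_bins = num_bins            # assumed uniform bins per context dimension
        self.counts = defaultdict(int)      # observations per context bin
        self.means = defaultdict(float)     # running mean performance per context bin

    def _bin(self, context):
        # Map a context tuple with entries in [0, 1] to a tuple of bin indices.
        return tuple(min(int(c * self.num_bins), self.num_bins - 1) for c in context)

    def estimate(self, context):
        # Estimate reported to the platform; 0.0 for a bin never observed.
        return self.means.get(self._bin(context), 0.0)

    def update(self, context, performance):
        # performance: assumed scalar in [0, 1], e.g. 0 if the task was declined.
        b = self._bin(context)
        self.counts[b] += 1
        self.means[b] += (performance - self.means[b]) / self.counts[b]


def select_workers(estimates, budget):
    """Hypothetical platform step: pick the `budget` workers with the highest estimates."""
    ranked = sorted(estimates, key=estimates.get, reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    controllers = {"w1": LocalController(), "w2": LocalController()}
    controllers["w1"].update((0.2, 0.7), 1.0)
    controllers["w2"].update((0.2, 0.7), 0.4)
    # Each LC reports only its estimate for the current context, never the context.
    reported = {w: lc.estimate((0.2, 0.7)) for w, lc in controllers.items()}
    print(select_workers(reported, budget=1))
```

Because the platform sees only the reported estimates, worker context remains on the device, which mirrors the privacy argument made in the abstract.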
Open Access
Online context-aware task assignment in mobile crowdsourcing via adaptive discretization
(IEEE, 2022-09-22) Elahi, Sepehr; Nika, Andi; Tekin, Cem
Mobile crowdsourcing is rapidly boosting the Internet of Things revolution. Its natural development leads to an adaptation to various real-world scenarios, thus imposing a need for wide generality on data-processing and task-assigning methods. We consider the task assignment problem in mobile crowdsourcing while taking into consideration the following: (i) we assume that additional information is available for both tasks and workers, such as location, device parameters, or task parameters, and make use of such information; (ii) as an important consequence of the worker-location factor, we assume that some workers may not be available for selection at given times; (iii) the workers' characteristics may change over time. To solve the task assignment problem in this setting, we propose Adaptive Optimistic Matching for Mobile Crowdsourcing (AOM-MC), an online learning algorithm that incurs $\tilde{O}\big(T^{(\bar{D}+1)/(\bar{D}+2)+\epsilon}\big)$ regret in $T$ rounds, for any $\epsilon > 0$, under mild continuity assumptions. Here, $\bar{D}$ is a notion of dimensionality which captures the structure of the problem. We also present extensive simulations that illustrate the advantage of adaptive discretization when compared with uniform discretization, and a time- and location-dependent crowdsourcing simulation using a real-world dataset, clearly demonstrating our algorithm's superiority to the current state-of-the-art and baseline algorithms.
Open Access
Personalizing treatments via contextual multi-armed bandits by identifying relevance
(2019-08) Bulucu, Cem
Personalized medicine offers specialized treatment options for individuals, which is vital as every patient is different. One-size-fits-all approaches are often not effective, and most patients require personalized care when dealing with diseases such as cancer, heart disease, or diabetes. As vast amounts of data have become available in medicine (and in other fields, including web-based recommender systems and intelligent radio networks), online learning approaches are gaining popularity due to their ability to learn fast in uncertain environments. Contextual multi-armed bandit algorithms provide reliable sequential decision-making options in such applications. In medical settings (and in the other aforementioned settings), data (contexts) and actions (arms) are often high-dimensional, and the performance of traditional contextual multi-armed bandit approaches is almost as bad as random selection, due to the curse of dimensionality. Fortunately, in many cases the information relevant to the decision-making task does not depend on all dimensions but rather on a small subset of dimensions, called the relevant dimensions. In this thesis, we aim to provide personalized treatments for patients sequentially arriving over time by using contextual multi-armed bandit approaches when the expected rewards related to patient outcomes vary only on a small subset of context and arm dimensions. For this purpose, first we make use of the contextual multi-armed bandit with relevance learning (CMAB-RL) algorithm, which learns the relevance by employing a novel partitioning strategy on the context-arm space and forming a set of candidate relevant dimension tuples. In this model, the set of relevant patient traits is allowed to be different for different bolus insulin dosages. Next, we consider an environment where the expected reward function defined over the context-arm space is sampled from a Gaussian process.
For this setting, we propose an extension to the contextual Gaussian process upper confidence bound (CGP-UCB) algorithm, called CGP-UCB with relevance learning (CGP-UCB-RL), that learns the relevance by integrating kernels that allow weights to be associated with each dimension and optimizing the negative log marginal likelihood. Then, we investigate the suitability of this approach in the blood glucose regulation problem. Aside from applying both algorithms to the bolus insulin administration problem, we also evaluate their performance in synthetically generated environments as benchmarks.
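The idea of learning relevance through per-dimension kernel weights and the negative log marginal likelihood can be sketched as follows. This is a hedged illustration, not the thesis's CGP-UCB-RL implementation: the squared-exponential kernel form, the fixed noise variance, the toy data, and the function names are assumptions. In place of a numerical optimizer, the sketch simply evaluates the objective on a few candidate weight vectors.

```python
import math


def ard_rbf(x, z, weights):
    """Squared-exponential kernel with one non-negative weight per dimension.

    A weight of 0 makes the kernel ignore that dimension entirely.
    """
    return math.exp(-0.5 * sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z)))


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def neg_log_marginal_likelihood(X, y, weights, noise_var=0.01):
    """GP negative log marginal likelihood for zero-mean data under the weighted kernel."""
    n = len(y)
    K = [[ard_rbf(X[i], X[j], weights) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    # Forward substitution: solve L v = y, so that y^T K^{-1} y = v^T v.
    v = []
    for i in range(n):
        v.append((y[i] - sum(L[i][k] * v[k] for k in range(i))) / L[i][i])
    data_fit = 0.5 * sum(t * t for t in v)
    half_log_det = sum(math.log(L[i][i]) for i in range(n))  # equals 0.5 * log|K|
    return data_fit + half_log_det + 0.5 * n * math.log(2 * math.pi)


if __name__ == "__main__":
    # Toy data (assumed): the reward depends only on the first of two dimensions.
    X = [(i / 9, ((7 * i) % 10) / 9) for i in range(10)]
    y = [math.sin(3 * x0) for x0, _ in X]
    candidates = [(4.0, 4.0), (4.0, 0.0), (0.0, 4.0)]
    scores = {w: neg_log_marginal_likelihood(X, y, w) for w in candidates}
    best = min(scores, key=scores.get)
    print(best, scores[best])
```

A dimension whose selected weight is close to zero would be treated as irrelevant. A practical implementation would likely optimize the weights with a gradient-based method instead of comparing a fixed candidate list.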