Browsing by Subject "Gaussian processes"

Open Access
Contextual combinatorial volatile multi-armed bandits in compact context spaces
(2021-07) Nika, Andi
We consider the contextual combinatorial volatile multi-armed bandit (CCV-MAB) problem in compact context spaces, simultaneously taking into consideration all of its individual features, thus providing a general framework for solving a wide range of practical problems. We solve CCV-MAB using two approaches. First, we use the so-called adaptive discretization technique, which sequentially partitions the context space $\mathcal{X}$ into "regions of similarity" and stores similar statistics corresponding to such regions. Under monotonicity of the expected reward and mild continuity assumptions, for both the expected reward and the expected base arm outcomes, we propose Adaptive Contextual Combinatorial Upper Confidence Bound (ACC-UCB), an online learning algorithm that uses adaptive discretization and incurs $\tilde{O}(T^{(\bar{D}+1)/(\bar{D}+2)+\epsilon})$ regret for any $\epsilon > 0$, where $\bar{D}$ represents the approximate optimality dimension related to $\mathcal{X}$. This dimension captures both the benignness of the base arm arrivals and the structure of the expected reward. Second, we impose a Gaussian process (GP) structure on the expected base arm outcomes and thus, using the smoothness of the GP posterior, eliminate the need for adaptive discretization. We propose Optimistic Combinatorial Learning and Optimization with Kernel Upper Confidence Bounds (O'CLOK-UCB), which incurs $\tilde{O}(K\sqrt{T\bar{\gamma}_T})$ regret, where $\bar{\gamma}_T$ is the maximum information gain associated with the set of base arm contexts that appeared in the first $T$ rounds and $K$ is the maximum cardinality of any feasible super arm over all rounds. For both methods, we provide experimental results that demonstrate the superiority of ACC-UCB over the previous state of the art and of O'CLOK-UCB over ACC-UCB.
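As a rough illustration of the second approach, the sketch below computes a GP posterior for each available base arm context, forms an optimistic index (posterior mean plus a scaled posterior standard deviation), and selects a super arm. It is not the thesis's O'CLOK-UCB: the squared-exponential kernel, the fixed exploration weight `beta`, the noise level, and the simple "take the K largest indices" oracle are all assumptions made for the example, and every function name is hypothetical.

```python
import math

def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel between two context vectors (assumed kernel)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def gp_posterior(X, y, x_new, noise=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    if not X:
        return 0.0, math.sqrt(rbf(x_new, x_new))
    K = [[rbf(a, b) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k = [rbf(a, x_new) for a in X]
    alpha = solve(K, y)   # K^{-1} y
    v = solve(K, k)       # K^{-1} k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v))
    return mean, math.sqrt(max(var, 0.0))

def select_super_arm(X, y, available, K, beta=2.0):
    """Return indices of the K available base arms with the largest UCB index.

    `available` may differ from round to round (volatile arms); the top-K
    rule stands in for a general combinatorial oracle.
    """
    scored = []
    for idx, ctx in enumerate(available):
        mean, std = gp_posterior(X, y, ctx)
        scored.append((mean + math.sqrt(beta) * std, idx))
    scored.sort(reverse=True)
    return sorted(idx for _, idx in scored[:K])
```

With observed contexts `[[0.0], [1.0]]` and outcomes `[0.0, 1.0]`, an unexplored context far from the data receives a large index because its posterior standard deviation is close to the prior value, which is the optimism that drives exploration in UCB-style methods.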
Open Access
Diabetes management via Gaussian process bandits
(2021-10) Çelik, Ahmet Alparslan
Management of chronic diseases such as diabetes mellitus requires adaptation of treatment regimes based on patient characteristics and response. There is no single treatment that fits all patients in all contexts; moreover, the set of admissible treatments usually varies over the course of the disease. In this thesis, we address the problem of optimizing treatment regimes under time-varying constraints by using volatile contextual Gaussian process bandits. In particular, we propose a variant of GP-UCB with volatile arms, which takes into account the patient's context together with the set of admissible treatments when recommending new treatments. Our Bayesian approach is able to provide treatment recommendations to patients along with confidence scores, which can be used for risk assessment. We use our algorithm to recommend bolus insulin doses for type 1 diabetes mellitus patients. We test our algorithm on in-silico subjects that come with the open-source implementation of the FDA-approved UVa/Padova type 1 diabetes mellitus simulator. We also compare its performance against a clinician. Moreover, we present a pilot study with a few clinicians and patients, in which we design interfaces through which they can interact with the model. We also address issues regarding privacy, safety, and ethics. Simulation studies show that our algorithm compares favorably with traditional blood glucose regulation methods.
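The volatile-arm idea can be sketched as follows: in each round only the currently admissible doses are scored, and the recommendation is returned together with the posterior mean and standard deviation that could serve as a confidence score. This is a hedged illustration rather than the thesis's algorithm. The function `recommend_dose`, the `posterior` callable interface, the weight `beta`, and the toy posterior in the usage lines are all hypothetical, and the toy numbers carry no clinical meaning.

```python
import math
from typing import Callable, Sequence, Tuple

# Hypothetical interface: maps (context, dose) to a posterior (mean, std)
# of the reward, e.g. produced by a fitted Gaussian process.
Posterior = Callable[[float, float], Tuple[float, float]]

def recommend_dose(
    context: float,
    admissible: Sequence[float],
    posterior: Posterior,
    beta: float = 2.0,
) -> Tuple[float, float, float]:
    """Return (dose, posterior mean, posterior std) for the admissible dose
    with the largest upper confidence bound in this round."""
    if not admissible:
        raise ValueError("no admissible treatment in this round")

    def ucb(dose: float) -> float:
        mean, std = posterior(context, dose)
        return mean + math.sqrt(beta) * std

    best = max(admissible, key=ucb)
    mean, std = posterior(context, best)
    return best, mean, std

def toy_posterior(context: float, dose: float) -> Tuple[float, float]:
    """Toy stand-in for a GP posterior: reward peaks at dose = context / 10."""
    return -(dose - context / 10.0) ** 2, 0.1

# Round 1: three doses are admissible.
dose1, mean1, std1 = recommend_dose(50.0, [2.0, 4.5, 7.0], toy_posterior)
# Round 2: the set of admissible doses has changed.
dose2, mean2, std2 = recommend_dose(50.0, [2.0, 7.0], toy_posterior)
```

Restricting the maximization to the admissible set is what distinguishes this from selecting over a fixed arm set; the returned standard deviation is one possible basis for the risk assessment mentioned in the abstract.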
Open Access
Driver modeling using a continuous policy space: theory and traffic data validation
(Institute of Electrical and Electronics Engineers, 2023-11-16) Yaldiz, C. O.; Yıldız, Yıldıray
In this article, we present a continuous-policy-space game-theoretical method for modeling human driver interactions in highway traffic. The proposed method is based on Gaussian processes and is developed as a refinement of the hierarchical decision-making concept called “level-k reasoning”, which conventionally assigns discrete levels of behavior to agents. The conventional level-k reasoning approach may impose undesired constraints on predicting human decision making because of the limited number (usually 2 or 3) of driver policies it provides. To fill this gap in the literature, we expand the framework to a continuous domain, enabling a continuous policy space that consists of infinitely many driver policies. Through the approach detailed in this article, more accurate and realistic driver models can be obtained and employed for creating high-fidelity simulation platforms for the validation of autonomous vehicle control algorithms. We validate the proposed method on a traffic dataset and compare it with the conventional level-k approach to demonstrate its contributions and implications.