Browsing by Subject "Regret"

Now showing 1 - 9 of 9
  • Item, Open Access
    Adaptive ensemble learning with confidence bounds
    (Institute of Electrical and Electronics Engineers Inc., 2017) Tekin, C.; Yoon, J.; Schaar, M. V. D.
    Extracting actionable intelligence from distributed, heterogeneous, correlated, and high-dimensional data sources requires run-time processing and learning both locally and globally. In the last decade, a large number of meta-learning techniques have been proposed in which local learners make online predictions based on their locally collected data instances, and feed these predictions to an ensemble learner, which fuses them and issues a global prediction. However, most of these works do not provide performance guarantees or, when they do, these guarantees are asymptotic. None of these existing works provide confidence estimates about the issued predictions or rate of learning guarantees for the ensemble learner. In this paper, we provide a systematic ensemble learning method called Hedged Bandits, which comes with both long-run (asymptotic) and short-run (rate of learning) performance guarantees. Moreover, our approach yields performance guarantees with respect to the optimal local prediction strategy, and is also able to adapt its predictions in a data-driven manner. We illustrate the performance of Hedged Bandits in the context of medical informatics and show that it outperforms numerous online and offline ensemble learning methods.
  • Item, Open Access
    Aging wireless bandits: regret analysis and order-optimal learning algorithm
    (IEEE, 2021-11-13) Atay, Eray Unsal; Kadota, Igor; Modiano, Eytan
    We consider a single-hop wireless network with sources transmitting time-sensitive information to the destination over multiple unreliable channels. Packets from each source are generated according to a stochastic process with known statistics and the state of each wireless channel (ON/OFF) varies according to a stochastic process with unknown statistics. The reliability of the wireless channels is to be learned through observation. At every time-slot, the learning algorithm selects a single pair (source, channel) and the selected source attempts to transmit its packet via the selected channel. The probability of a successful transmission to the destination depends on the reliability of the selected channel. The goal of the learning algorithm is to minimize the Age-of-Information (AoI) in the network over T time-slots. To analyze its performance, we introduce the notion of AoI-regret, which is the difference between the expected cumulative AoI of the learning algorithm under consideration and the expected cumulative AoI of a genie algorithm that knows the reliability of the channels a priori. The AoI-regret captures the penalty incurred by having to learn the statistics of the channels over the T time-slots. The results are two-fold: first, we consider learning algorithms that employ well-known solutions to the stochastic multi-armed bandit problem (such as ϵ-Greedy, Upper Confidence Bound, and Thompson Sampling) and show that their AoI-regret scales as Θ(log T); second, we develop a novel learning algorithm and show that it has O(1) regret. To the best of our knowledge, this is the first learning algorithm with bounded AoI-regret.
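    The Θ(log T) result above concerns standard stochastic bandit solutions such as Upper Confidence Bound. As a rough illustration only, the sketch below runs textbook UCB1 on Bernoulli (ON/OFF) channels and accumulates the ordinary reward pseudo-regret. It does not model sources, packet generation, or AoI, and it is not the paper's algorithm; the function name `ucb1_pseudo_regret` and all parameter values are hypothetical.

```python
import math
import random


def ucb1_pseudo_regret(reliabilities, horizon, seed=0):
    """Run textbook UCB1 on Bernoulli channels; return (pseudo-regret, pulls).

    `reliabilities` holds the ON probabilities, which the learner never
    reads directly; they are used only to simulate outcomes and to score
    the regret against a genie that always picks the best channel.
    """
    rng = random.Random(seed)
    n_channels = len(reliabilities)
    pulls = [0] * n_channels      # times each channel was selected
    successes = [0] * n_channels  # successful transmissions per channel
    best = max(reliabilities)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= n_channels:
            # Initialisation: try every channel once.
            choice = t - 1
        else:
            # Select the channel with the largest upper confidence bound.
            choice = max(
                range(n_channels),
                key=lambda k: successes[k] / pulls[k]
                + math.sqrt(2.0 * math.log(t) / pulls[k]),
            )
        pulls[choice] += 1
        if rng.random() < reliabilities[choice]:
            successes[choice] += 1
        # Pseudo-regret: expected loss relative to the genie's choice.
        regret += best - reliabilities[choice]
    return regret, pulls


if __name__ == "__main__":
    total_regret, counts = ucb1_pseudo_regret([0.3, 0.5, 0.9], horizon=2000)
    print(total_regret, counts)
```

    Running the function for increasing horizons and plotting the regret against log T is one way to see the logarithmic growth that the abstract contrasts with its bounded-regret algorithm.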
  • Item, Open Access
    Divided agency, manipulation, and regret
    (Universitaet Wien * Institut fuer Philosophie, 2024-11-30) Payton, Jonathan D.
    Saba Bazargan-Forward (2022, Authority, Cooperation, and Accountability) conceives of agency as divided into two functions: a deliberative function (deciding what to do) and an executive function (acting on that decision). He claims that these two functions can be distributed across multiple agents, and that this has important moral consequences: if you outsource the executive function to me, then the practical reasons you take there to be for A-ing are relevant to whether I can permissibly A and to how my A-ing reflects on my character. However, the natural way of understanding the 'divided agency' model (i.e. that in cases of divided agency the executor literally acts on the deliberator's reasons) is problematic and doesn't seem to reflect Bazargan-Forward's considered view, while his considered view doesn't seem to support his moral judgments, either about the permissibility of the executor's behaviour or about their character. I suggest an alternative to Bazargan-Forward's 'divided agency' model and consider what moral judgments it supports.
  • Item, Open Access
    Jamming bandits: a novel learning method for optimal jamming
    (Institute of Electrical and Electronics Engineers Inc., 2016) Amuru, S.; Tekin, C.; Van Der Schaar, M.; Buehrer, R.M.
    Can an intelligent jammer learn and adapt to unknown environments in an electronic warfare-type scenario? In this paper, we answer this question in the positive, by developing a cognitive jammer that adaptively and optimally disrupts the communication between a victim transmitter-receiver pair. We formalize the problem using a multiarmed bandit framework where the jammer can choose various physical layer parameters such as the signaling scheme, power level and the on-off/pulsing duration in an attempt to obtain power efficient jamming strategies. We first present online learning algorithms to maximize the jamming efficacy against static transmitter-receiver pairs and prove that these algorithms converge to the optimal (in terms of the error rate inflicted at the victim and the energy used) jamming strategy. Even more importantly, we prove that the rate of convergence to the optimal jamming strategy is sublinear, i.e., the learning is fast in comparison to existing reinforcement learning algorithms, which is particularly important in dynamically changing wireless environments. Also, we characterize the performance of the proposed bandit-based learning algorithm against multiple static and adaptive transmitter-receiver pairs.
  • Item, Open Access
    Logarithmic regret bound over diffusion based distributed estimation
    (IEEE, 2014) Sayın, Muhammed O.; Vanlı, Nuri Denizcan; Kozat, Süleyman Serdar
    We provide a logarithmic upper bound on the regret of the diffusion implementation of distributed estimation. For certain learning rates, the bound guarantees that the performance of the distributed least mean square (DLMS) algorithms converges to that of the best estimator chosen with hindsight of the spatial and temporal data. We use a new cost definition for distributed estimation, based on widely used statistical performance measures, and the corresponding global regret function. Then, for certain learning rates, we provide an upper bound on the global regret function without any statistical assumptions.
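    For readers unfamiliar with diffusion-based distributed LMS, a minimal sketch follows. It assumes the common adapt-then-combine form with uniform combination weights, a fixed step size `mu`, and synthetic Gaussian data; the abstract does not specify which variant, weights, or learning rates the paper analyses, so every name and value here is hypothetical.

```python
import random


def diffusion_lms(neighbors, w_true, steps, mu=0.05, noise_std=0.1, seed=0):
    """Adapt-then-combine diffusion LMS with uniform combination weights.

    `neighbors[k]` lists the nodes whose estimates node k averages
    (it should include k itself). Returns one weight vector per node.
    """
    rng = random.Random(seed)
    dim = len(w_true)
    n_nodes = len(neighbors)
    w = [[0.0] * dim for _ in range(n_nodes)]
    for _ in range(steps):
        # Adapt: each node takes one LMS step on its own noisy observation.
        psi = []
        for k in range(n_nodes):
            u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            d = sum(ui * wi for ui, wi in zip(u, w_true)) + rng.gauss(0.0, noise_std)
            err = d - sum(ui * wi for ui, wi in zip(u, w[k]))
            psi.append([wi + mu * err * ui for wi, ui in zip(w[k], u)])
        # Combine: each node averages the intermediate estimates
        # of its neighbourhood.
        w = [
            [sum(psi[j][i] for j in neighbors[k]) / len(neighbors[k])
             for i in range(dim)]
            for k in range(n_nodes)
        ]
    return w


if __name__ == "__main__":
    # Three nodes on a line graph: 0 - 1 - 2.
    line_graph = [[0, 1], [0, 1, 2], [1, 2]]
    for estimate in diffusion_lms(line_graph, [1.0, -2.0], steps=2000):
        print(estimate)
```

    In this toy setting every node ends up close to the common parameter vector even though each node only exchanges estimates with its immediate neighbours.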
  • Item, Open Access
    Multiagent systems: learning, strategic behavior, cooperation, and network formation
    (Elsevier, 2018) Tekin, Cem; Zhang, S.; Xu, J.; Schaar, M. van der; Djurić, P. M.; Richard, C.
    Many applications ranging from crowdsourcing to recommender systems involve informationally decentralized agents repeatedly interacting with each other in order to reach their goals. These networked agents base their decisions on incomplete information, which they gather through interactions with their neighbors or through cooperation, which is often costly. This chapter presents a discussion on decentralized learning algorithms that enable the agents to achieve their goals through repeated interaction. First, we discuss cooperative online learning algorithms that help the agents to discover beneficial connections with each other and exploit these connections to maximize the reward. For this case, we explain the relation between the learning speed, network topology, and cooperation cost. Then, we focus on how informationally decentralized agents form cooperation networks through learning. We explain how learning features prominently in many real-world interactions, and greatly affects the evolution of social networks. Links that otherwise would not have formed may now appear, and a much greater variety of network configurations can be reached. We show that the impact of learning on efficiency and social welfare can be either positive or negative. We also demonstrate the use of the aforementioned methods in popularity prediction, recommender systems, expert selection, and multimedia content aggregation.
  • Item, Open Access
    Online cross-layer learning in heterogeneous cognitive radio networks without CSI
    (IEEE, 2018) Qureshi, Muhammad Anjum; Tekin, Cem
    We propose a contextual multi-armed bandit (CMAB) model for cross-layer learning in heterogeneous cognitive radio networks (CRNs). We consider the scenario where application adaptive modulation (AAM) is implemented in the physical (PHY) layer for heterogeneous applications in the application (APP) layer, each having a dynamic packet error rate (PER) requirement. We consider the bit error rate (BER) constraint as the context to the mode selector, determined by the PHY layer based on the PER requirement, and propose a learning algorithm that learns the modulation with the highest expected reward online over an unknown dynamic wireless channel without channel state information (CSI), where the reward is taken as the Quality of Service (QoS) provided by the PHY layer to upper layers. We show numerically that the proposed algorithm's expected cumulative loss with respect to an oracle which knows the channel distribution perfectly grows sublinearly in time, and hence the average loss asymptotically approaches zero, which in turn yields optimal performance.
  • Item, Open Access
    RELEAF: an algorithm for learning and exploiting relevance
    (Cornell University, 2015-02) Tekin, C.; Schaar, Mihaela van der
    Recommender systems, medical diagnosis, network security, etc., require on-going learning and decision-making in real time. These, and many others, represent perfect examples of the opportunities and difficulties presented by Big Data: the available information often arrives from a variety of sources and has diverse features so that learning from all the sources may be valuable but integrating what is learned is subject to the curse of dimensionality. This paper develops and analyzes algorithms that allow efficient learning and decision-making while avoiding the curse of dimensionality. We formalize the information available to the learner/decision-maker at a particular time as a context vector which the learner should consider when taking actions. In general the context vector is very high dimensional, but in many settings, the most relevant information is embedded into only a few relevant dimensions. If these relevant dimensions were known in advance, the problem would be simple, but they are not. Moreover, the relevant dimensions may be different for different actions. Our algorithm learns the relevant dimensions for each action, and makes decisions based on what it has learned. Formally, we build on the structure of a contextual multi-armed bandit by adding and exploiting a relevance relation. We prove a general regret bound for our algorithm whose time order depends only on the maximum number of relevant dimensions among all the actions, which in the special case where the relevance relation is single-valued (a function), reduces to $\tilde{O}(T^{2(\sqrt{2}-1)})$; in the absence of a relevance relation, the best known contextual bandit algorithms achieve regret $\tilde{O}(T^{(D+1)/(D+2)})$, where D is the full dimension of the context vector.
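    To make the comparison of the two time orders concrete, the short calculation below reads the quoted bounds as T raised to 2(√2 − 1) and T raised to (D+1)/(D+2), and finds the smallest context dimension D at which the second exponent exceeds the first. It compares exponents only, ignoring logarithmic factors and constants, and the function names are hypothetical.

```python
import math


def relevance_exponent():
    """Time exponent 2(sqrt(2) - 1) quoted for the single-valued relevance case."""
    return 2.0 * (math.sqrt(2.0) - 1.0)


def full_context_exponent(dim):
    """Time exponent (D + 1) / (D + 2) quoted for bandits using all D dimensions."""
    return (dim + 1) / (dim + 2)


def smallest_dim_where_relevance_wins(max_dim=1000):
    """Smallest D whose full-context exponent exceeds the relevance exponent."""
    target = relevance_exponent()
    for dim in range(1, max_dim + 1):
        if full_context_exponent(dim) > target:
            return dim
    return None


if __name__ == "__main__":
    print(round(relevance_exponent(), 4))       # 0.8284
    print(smallest_dim_where_relevance_wins())  # 4
```

    Under this reading, the relevance-based exponent is already the smaller of the two once the context has four or more dimensions, and the gap widens as D grows because (D+1)/(D+2) tends to 1.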
  • Item, Open Access
    Robust least squares methods under bounded data uncertainties
    (Academic Press, 2015) Vanli, N. D.; Donmez, M. A.; Kozat, S. S.
    We study the problem of estimating an unknown deterministic signal that is observed through an unknown deterministic data matrix under additive noise. In particular, we present a minimax optimization framework for least squares problems, where the estimator has imperfect data matrix and output vector information. We define the performance of an estimator relative to the performance of the optimal least squares (LS) estimator tuned to the underlying unknown data matrix and output vector, which is defined as the regret of the estimator. We then introduce an efficient robust LS estimation approach that minimizes this regret for the worst possible data matrix and output vector, where we refrain from any structural assumptions on the data. We demonstrate that minimizing this worst-case regret can be cast as a semi-definite programming (SDP) problem. We then consider the regularized and structured LS problems and present novel robust estimation methods by demonstrating that these problems can also be cast as SDP problems. We illustrate the merits of the proposed algorithms with respect to the well-known alternatives in the literature through our simulations.
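    One way to write the worst-case regret criterion described above is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $H$ and $y$ denote the available (imperfect) data matrix and output vector, $\Delta H$ and $\Delta y$ the unknown perturbations with assumed norm bounds $\delta_H$ and $\delta_y$, $x$ the estimate, and $w$ the LS solution tuned to the true data.

```latex
\min_{x} \;
\max_{\|\Delta H\| \le \delta_H,\; \|\Delta y\| \le \delta_y}
\Big\{
  \big\| (y + \Delta y) - (H + \Delta H)\, x \big\|^2
  \;-\;
  \min_{w} \big\| (y + \Delta y) - (H + \Delta H)\, w \big\|^2
\Big\}
```

    The inner difference is the regret of $x$ for one realisation of the true data, the maximisation picks the worst realisation within the uncertainty bounds, and the outer minimisation selects the estimate with the smallest worst-case regret.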

Contact

  • Bilkent University
  • Main Campus Library
  • Phone: +90(312) 290-1298
  • Email: dspace@bilkent.edu.tr

Bilkent University Library © 2015-2025 BUIR