Browsing by Author "Qureshi, Muhammad Anjum"
Item Open Access
Contextual multi-armed bandits with structured payoffs (2020-09)
Qureshi, Muhammad Anjum
Multi-Armed Bandit (MAB) problems model sequential decision making under uncertainty. In traditional MAB, the learner selects an arm in each round and then observes a random reward from the arm’s unknown reward distribution. The goal is to maximize the cumulative reward by learning to select optimal arms as often as possible. In the contextual MAB, an extension of MAB, the learner observes a context (side-information) at the beginning of each round, selects an arm, and then observes a random reward whose distribution depends on both the arriving context and the chosen arm. Another MAB variant, called unimodal MAB, assumes that the expected reward exhibits a unimodal structure over the arms, and tries to locate the arm with the “peak” reward by learning the direction of increase of the expected reward. In this thesis, we consider an extension to unimodal MAB called contextual unimodal MAB, and demonstrate that it is a powerful tool for designing Artificial Intelligence (AI)-enabled radios by utilizing the special structure of the dependence of the reward on the contexts and arms of the wireless environment. While AI-enabled radios are expected to enhance the spectral efficiency of 5th generation (5G) millimeter wave (mmWave) networks by learning to optimize network resources, allocating resources over the mmWave band is extremely challenging due to rapidly-varying channel conditions. We consider several resource allocation problems in this thesis under various design possibilities for mmWave radio networks under unknown channel statistics and without any channel state information (CSI) feedback: i) dynamic rate selection for an energy harvesting transmitter, ii) dynamic power allocation for heterogeneous applications, and iii) distributed resource allocation in a multi-user network.
All of these problems exhibit structured payoffs which are unimodal functions over partially ordered arms (transmission parameters) as well as unimodal or monotone functions over partially ordered contexts (side-information). Structure over arms helps in reducing the number of arms to be explored, while structure over contexts helps in using past information from nearby contexts to make better selections. We formalize dynamic adaptation of transmission parameters as a structured MAB, and propose frequentist and Bayesian online learning algorithms. We show that both approaches yield regret that is logarithmic in time. We also investigate dynamic rate and channel adaptation in a cognitive radio network serving heterogeneous applications under dynamically varying channel availability and rate constraints. We formalize the problem as a Bayesian learning problem, and propose a novel learning algorithm which considers each rate-channel pair as a two-dimensional action. The set of available actions varies dynamically over time due to variations in primary user activity and rate requirements of the applications served by the users. Additionally, we extend the work to the scenario in which the arms, as well as the contexts, belong to a continuous interval. Finally, we show via simulations that our algorithms significantly improve the performance in the aforementioned radio resource allocation problems.

Item Open Access
Decentralized dynamic rate and channel selection over a shared spectrum (IEEE, 2021-03-15)
Javanmardi, Alireza; Qureshi, Muhammad Anjum; Tekin, Cem
We consider the problem of distributed dynamic rate and channel selection in a multi-user network, in which each user selects a wireless channel and a modulation and coding scheme (corresponding to a transmission rate) in order to maximize the network throughput. We assume that the users are cooperative; however, there is no coordination or communication among them, and the number of users in the system is unknown.
We formulate this problem as a multi-player multi-armed bandit problem and propose a decentralized learning algorithm that performs almost optimal exploration of the transmission rates to learn fast. We prove that the regret of our learning algorithm with respect to the optimal allocation increases logarithmically over rounds with a leading term that is logarithmic in the number of transmission rates. Finally, we compare the performance of our learning algorithm with the state-of-the-art via simulations and show that it substantially improves the throughput and minimizes the number of collisions.

Item Open Access
Expert advice ensemble for thyroid disease diagnosis (IEEE, 2017)
Qureshi, Muhammad Anjum; Ekşioğlu, Kubilay
The thyroid gland influences the metabolic processes of the human body through the hormones it produces. Hyperthyroidism is caused by an increase in the production of thyroid hormones. In this paper, a methodology using an online ensemble of decision trees to detect thyroid-related diseases is proposed. The aim of this work is to improve the diagnostic accuracy of thyroid disease. Initially, a feature rejection method is applied to discard 10 irrelevant and redundant features from 29 features. Then, it is shown that the offline ensemble of decision trees provides higher performance than state-of-the-art methodologies. Afterwards, the exponential-weights-based online ensemble method is implemented, which reaches classification performance comparable to the offline methodology. The proposed system consists of three stages: feature rejection, training decision trees with different cost schemes, and the online classification stage, where each classifier is weighted using an exponential-weight-based algorithm. The performance of the online algorithm increases as the number of samples increases, because it continuously updates the weights to improve accuracy.
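The online weighting stage described in this abstract can be illustrated with a minimal sketch of a generic exponential-weights forecaster over binary expert predictions. This is not the authors' implementation: the function name `exponential_weights`, the learning rate `eta`, and the 0/1 loss are assumptions made only for illustration, and the experts here are plain prediction lists rather than trained decision trees.

```python
import math


def exponential_weights(expert_predictions, labels, eta=0.5):
    """Weighted vote over experts with exponential weight updates.

    expert_predictions: list of rounds; each round is a list of 0/1
        predictions, one per expert (hypothetical input format).
    labels: list of true 0/1 labels, one per round.
    Returns (ensemble predictions, final normalized weights).
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    outputs = []
    for preds, y in zip(expert_predictions, labels):
        total = sum(weights)
        # Fraction of the total weight voting for class 1.
        vote_for_one = sum(w for w, p in zip(weights, preds) if p == 1) / total
        outputs.append(1 if vote_for_one >= 0.5 else 0)
        # Experts that erred in this round are down-weighted by exp(-eta).
        weights = [w * math.exp(-eta * (p != y)) for w, p in zip(weights, preds)]
    total = sum(weights)
    return outputs, [w / total for w in weights]
```

Under these assumptions, an expert that is always correct keeps its weight while the others decay, so the ensemble vote gradually follows the more reliable experts as more samples arrive.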
The achieved classification accuracy demonstrates the robustness and effectiveness of the online version of the proposed system in thyroid disease diagnosis.

Item Open Access
Fast learning for dynamic resource allocation in AI-Enabled radio networks (IEEE, 2020)
Qureshi, Muhammad Anjum; Tekin, Cem
Artificial Intelligence (AI)-enabled radios are expected to enhance the spectral efficiency of 5th generation (5G) millimeter wave (mmWave) networks by learning to optimize network resources. However, allocating resources over the mmWave band is extremely challenging due to rapidly-varying channel conditions. We consider several resource allocation problems for mmWave radio networks under unknown channel statistics and without any channel state information (CSI) feedback: i) dynamic rate selection for an energy harvesting transmitter, ii) dynamic power allocation for heterogeneous applications, and iii) distributed resource allocation in a multi-user network. All of these problems exhibit structured payoffs which are unimodal functions over partially ordered arms (transmission parameters) as well as over partially ordered contexts (side-information). Unimodality over arms helps in reducing the number of arms to be explored, while unimodality over contexts helps in using past information from nearby contexts to make better selections. We model this as a structured reinforcement learning problem, called contextual unimodal multi-armed bandit (MAB), and propose an online learning algorithm that exploits unimodality to optimize the resource allocation over time, and prove that it achieves regret that is logarithmic in time. Our algorithm's regret scales sublinearly in both the number of arms and the number of contexts for a wide range of scenarios.
We also show via simulations that our algorithm significantly improves the performance in the aforementioned resource allocation problems.

Item Open Access
Multi-user small base station association via contextual combinatorial volatile bandits (IEEE, 2021-03-09)
Qureshi, Muhammad Anjum; Nika, Andi; Tekin, Cem
We propose an efficient mobility management solution to the problem of assigning small base stations (SBSs) to multiple mobile data users in a heterogeneous setting. We formalize the problem using a novel sequential decision-making model named contextual combinatorial volatile multi-armed bandits (MABs), in which each association is considered as an arm, volatility of an arm is imposed by the dynamic arrivals of the users, and context is the additional information linked with the user and the SBS such as user/SBS distance and the transmission frequency. As the next-generation communications are envisioned to take place over highly dynamic links such as the millimeter wave (mmWave) frequency band, we consider the association problem over an unknown channel distribution with limited feedback in the form of acknowledgments and in the absence of channel state information (CSI). As the links are unknown and dynamically varying, the assignment problem cannot be solved offline. Thus, we propose an online algorithm which is able to solve the user-SBS association problem in a multi-user and time-varying environment, where the number of users dynamically varies over time. Our algorithm strikes a balance between exploration and exploitation and achieves regret that is sublinear in time, with an optimal dependence on the problem structure and the dynamics of user arrivals and departures.
In addition, we demonstrate via numerical experiments that our algorithm achieves significant performance gains compared to several benchmark algorithms.

Item Open Access
Online Bayesian learning for rate selection in millimeter wave cognitive radio networks (Institute of Electrical and Electronics Engineers, 2020)
Qureshi, Muhammad Anjum; Tekin, Cem
We consider the problem of dynamic rate selection in a cognitive radio network (CRN) over the millimeter wave (mmWave) spectrum. Specifically, we focus on the scenario in which the transmit power is time-varying, as motivated by the following applications: i) an energy harvesting CRN, in which the system solely relies on the harvested energy source, and ii) an underlay CRN, in which a secondary user (SU) restricts its transmission power based on a dynamically changing interference temperature limit (ITL) such that the primary user (PU) remains unharmed. Since the channel quality fluctuates very rapidly in mmWave networks and costly channel state information (CSI) is of limited use, we consider rate adaptation over an mmWave channel as an online stochastic optimization problem, and propose a Thompson Sampling (TS) based Bayesian method. Our method utilizes the unimodality and monotonicity of the throughput with respect to rates and transmit powers and achieves regret that is logarithmic in time, with a leading term that is independent of the number of available rates. Our regret bound holds for any sequence of transmit powers and captures the dependence of the regret on the arrival pattern.
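As a rough illustration of the Bayesian approach named in this abstract, the following is a minimal sketch of plain Thompson Sampling for rate selection with acknowledgment (success/failure) feedback. It deliberately omits the unimodality and monotonicity structure that the paper's method exploits, so it is not the authors' algorithm. The function name `thompson_rate_selection`, the Beta(1, 1) priors, and the simulated `success_prob` values are all assumptions introduced for the example.

```python
import random


def thompson_rate_selection(rates, success_prob, horizon, seed=0):
    """Plain Thompson Sampling over transmission rates.

    rates: list of candidate rates (known to the learner).
    success_prob: true success probability per rate, used only to
        simulate ACK/NACK feedback (unknown to the learner).
    Returns the number of times each rate was selected.
    """
    rng = random.Random(seed)
    alpha = [1.0] * len(rates)  # Beta posterior: 1 + observed successes
    beta = [1.0] * len(rates)   # Beta posterior: 1 + observed failures
    pulls = [0] * len(rates)
    for _ in range(horizon):
        # Sample a success probability per rate and pick the rate with
        # the highest sampled throughput (rate times success probability).
        samples = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
        k = max(range(len(rates)), key=lambda i: rates[i] * samples[i])
        ack = rng.random() < success_prob[k]
        if ack:
            alpha[k] += 1.0
        else:
            beta[k] += 1.0
        pulls[k] += 1
    return pulls
```

For example, with hypothetical rates `[1, 2, 4, 8]` and success probabilities `[0.95, 0.9, 0.6, 0.1]`, the expected throughputs are 0.95, 1.8, 2.4, and 0.8, so the sampler should concentrate its selections on the third rate over time.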
We also show via simulations that the performance of the proposed algorithm is superior to that of the state-of-the-art algorithms, especially when the arrivals are favorable.

Item Open Access
Online classification with contextual exponential weights for disease diagnostics (IEEE, 2017)
Ekşioğlu, Kubilay; Qureshi, Muhammad Anjum; Tekin, Cem
In this paper, a novel online scheme for classification, based on a contextual variant of the Weighted Average Forecaster algorithm, is proposed. The proposed method adaptively partitions the data space based on contexts, and trades off exploration and exploitation when fusing the predictions of the experts. The proposed algorithm is verified on disease data available in UCI Online Machine Learning Repository. The results demonstrate the robustness, effectiveness, and versatility of the proposed system, in terms of performance and low computational cost, in the field of medical diagnostics.

Item Open Access
Online cross-layer learning in heterogeneous cognitive radio networks without CSI (IEEE, 2018)
Qureshi, Muhammad Anjum; Tekin, Cem
We propose a contextual multi-armed bandit (CMAB) model for cross-layer learning in heterogeneous cognitive radio networks (CRNs). We consider the scenario where application adaptive modulation (AAM) is implemented in the physical (PHY) layer for heterogeneous applications in the application (APP) layer, each having a dynamic packet error rate (PER) requirement. We consider the bit error rate (BER) constraint as the context to the mode selector, determined by the PHY layer based on the PER requirement, and propose a learning algorithm that learns the modulation with the highest expected reward online over an unknown dynamic wireless channel without channel state information (CSI), where the reward is taken as the Quality of Service (QoS) provided by the PHY layer to upper layers.
We show numerically that the proposed algorithm's expected cumulative loss with respect to an oracle which knows the channel distribution perfectly grows sublinearly in time, and hence, the average loss asymptotically approaches zero, which in turn yields optimal performance.

Item Open Access
Online optimization of wireless sensors selection over an unknown stochastic environment (Institute of Electrical and Electronics Engineers, 2018)
Qureshi, Muhammad Anjum; Sarmad, Wardah; Noor, Hira; Mirza, Ali Hassan
Wireless communication is considered to be more challenging than typical wired communication due to unpredictable channel conditions. In this paper, we target the coverage area problem, where a group of sensors is selected from a set of sensors placed in a particular area to maximize the coverage provided to that area. The constraints of this optimization are the battery power of the sensors and the number of sensors that are active at a given time. We consider a variant in which the coverage related to a particular sensor is an unknown stochastic variable, and hence, we need to learn the best subset of sensors in real time. We propose an online combinatorial optimization algorithm based on the multi-armed bandit framework that learns the expected best subset of sensors, and the regret of the proposed online algorithm is sublinear in time. The achieved performance demonstrates the robustness and effectiveness of the proposed online algorithm in wireless sensor selection over an unknown stochastic environment.

Item Open Access
Prediction, classification and recommendation in e-health via contextual partitioning (IEEE, 2021-07-19)
Qureshi, Muhammad Anjum
In this paper, we propose a multipurpose contextual partitioning based estimation algorithm.
Exploiting the similarities between contexts (side information such as age, gender, etc.) related to patient data in a healthcare repository or database, multidimensional spheres are generated over Euclidean space. Then, conditional first- and second-order characteristics are predicted using the sample-based mean and covariance. These conditional statistics of a particular patient data subset (sphere) serve the following purposes: i) prediction of missing values (conditional mean), ii) partitioned principal components for better classification (conditional covariance), and iii) recommendation of a medical test or physician (conditional covariance). The proposed approach uniformly partitions the context space into spheres, and then, for each sphere, estimates the conditional mean and covariance using only the data (excluding the context data) in the selected sphere. Hence, it provides a three-in-one solution, i.e., prediction, classification, and recommendation, for healthcare data using conditional probabilistic characteristics. The overall error is decomposed into estimation and approximation errors. In a particular sphere, estimation error is dependent on the number of instances, while approximation error is dependent on the dissimilarity of instances.

Item Open Access
Rate and channel adaptation in cognitive radio networks under time-varying constraints (IEEE, 2020)
Qureshi, Muhammad Anjum; Tekin, Cem
We consider dynamic rate and channel adaptation in a cognitive radio network serving heterogeneous applications under dynamically varying channel availability and rate constraints. We formalize it as a Bayesian learning problem, and propose a novel learning algorithm, called Volatile Constrained Thompson Sampling (V-CoTS), which considers each rate-channel pair as a two-dimensional action. The set of available actions varies dynamically over time due to variations in primary user activity and rate requirements of the applications served by the users.
Our algorithm learns to adapt its rate and opportunistically exploit spectrum holes when the channel conditions are unknown and channel state information is absent, by using acknowledgment-only feedback. It uses the monotonicity of the transmission success probability in the transmission rate to optimally trade off exploration and exploitation of the actions. Numerical results demonstrate that V-CoTS achieves significant gains in throughput compared to the state-of-the-art methods.

Item Open Access
Reinforcement learning for link adaptation and channel selection in LEO satellite cognitive communications (Institute of Electrical and Electronics Engineers, 2023-01-27)
Qureshi, Muhammad Anjum; Lagunas, E.; Kaddoum, G.
In this letter, we solve the link adaptation and channel selection problem in next generation satellite cognitive networks under dynamically varying channel availability and time-varying channel statistics. Primary user (PU) activity in Low Earth Orbit (LEO) satellite cognitive communications forces the set of available transmission channels for a secondary user (SU) to vary dynamically over time. We consider the scenario where the channel state varies in a piecewise-stationary mode, referred to as quasi-static (block-fading) channels. We formalize the problem as a reinforcement learning problem, and propose Discounted Structured and Sleeping Thompson Sampling (dSTS), which maximizes the SU’s throughput by selecting the optimum modulation and coding scheme (MCS) and the transmission channel under volatile and piecewise-stationary settings. When channel characteristics are unknown as well as piecewise-stationary, the proposed algorithm adapts the SU’s link rate by exploiting the structure of the transmission success probability in transmission rates over the selected available channel. Furthermore, channel state information (CSI) is absent and feedback is limited to 1 bit (success/failure).
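The two ingredients named in the last abstract, discounting for piecewise-stationary statistics and "sleeping" arms for time-varying availability, can be sketched generically as follows. This is a simplified illustration, not the dSTS algorithm itself: it ignores the structure over transmission rates, and the function name `discounted_sleeping_ts`, the discount factor `gamma`, and the two callbacks are hypothetical names chosen for the example.

```python
import random


def discounted_sleeping_ts(success_prob, available, n_arms, horizon,
                           gamma=0.99, seed=0):
    """Discounted Thompson Sampling over a time-varying set of arms.

    success_prob(t, k): hypothetical callback giving the true success
        probability of arm k in round t (used only to simulate feedback).
    available(t): hypothetical callback returning the arms usable in round t.
    Returns the list of arms chosen in each round.
    """
    rng = random.Random(seed)
    successes = [0.0] * n_arms  # discounted success counts
    failures = [0.0] * n_arms   # discounted failure counts
    choices = []
    for t in range(horizon):
        # Sample only among the arms that are currently available.
        samples = {k: rng.betavariate(successes[k] + 1.0, failures[k] + 1.0)
                   for k in available(t)}
        chosen = max(samples, key=samples.get)
        ack = rng.random() < success_prob(t, chosen)  # 1-bit feedback
        # Discount all past observations so old statistics fade out.
        successes = [gamma * s for s in successes]
        failures = [gamma * f for f in failures]
        if ack:
            successes[chosen] += 1.0
        else:
            failures[chosen] += 1.0
        choices.append(chosen)
    return choices
```

With `gamma` below 1, the posterior of each arm forgets old feedback at a geometric rate, which lets the sampler recover after an abrupt change in the success probabilities; setting `gamma=1.0` reduces the sketch to ordinary Thompson Sampling over the available arms.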