Authors: Qureshi, Muhammad Anjum; Tekin, Cem
Record dates: 2019-02-21, 2019-02-21
Year: 2018
ISBN: 9781538615010
URI: http://hdl.handle.net/11693/50240
Date of Conference: 2-5 May 2018
Abstract: We propose a contextual multi-armed bandit (CMAB) model for cross-layer learning in heterogeneous cognitive radio networks (CRNs). We consider the scenario where application adaptive modulation (AAM) is implemented in the physical (PHY) layer for heterogeneous applications in the application (APP) layer, each having a dynamic packet error rate (PER) requirement. We take the bit error rate (BER) constraint, which the PHY layer determines from the PER requirement, as the context for the mode selector. We propose a learning algorithm that learns, online, the modulation with the highest expected reward over an unknown dynamic wireless channel without channel state information (CSI), where the reward is the Quality of Service (QoS) that the PHY layer provides to the upper layers. We show numerically that the proposed algorithm's expected cumulative loss with respect to an oracle that knows the channel distribution perfectly grows sublinearly in time. Hence the average loss asymptotically approaches zero, which in turn yields optimal performance.
Language: English
Keywords: AAM; BER; Feedback; Mode selector; No CSI; PHY layer; Regret; SNR
Title: Online cross-layer learning in heterogeneous cognitive radio networks without CSI
Type: Conference Paper
DOI: 10.1109/SIU.2018.8404793
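
The abstract describes the learning loop only at a high level, so the following is a minimal sketch of the general idea and not the paper's algorithm. It assumes a small finite set of BER-constraint contexts, a small set of candidate modulations (arms), and a 0/1 reward standing in for QoS feedback. It uses a per-context UCB1 rule as a stand-in learner. The class name `ContextualUCB`, the function `simulate`, and the success probabilities are all hypothetical. The regret is measured against an oracle that knows the true success probabilities, mirroring the "expected cumulative loss" in the abstract.

```python
import math
import random


class ContextualUCB:
    """Per-context UCB1 learner (illustrative stand-in, not the paper's method)."""

    def __init__(self, n_contexts, n_arms):
        # Pull counts and running mean rewards, kept separately for each context.
        self.counts = [[0] * n_arms for _ in range(n_contexts)]
        self.means = [[0.0] * n_arms for _ in range(n_contexts)]
        self.rounds = [0] * n_contexts

    def select(self, ctx):
        counts = self.counts[ctx]
        # Try every modulation once in this context before using the index.
        for arm, n in enumerate(counts):
            if n == 0:
                return arm
        t = self.rounds[ctx]

        def index(arm):
            bonus = math.sqrt(2.0 * math.log(t) / counts[arm])
            return self.means[ctx][arm] + bonus

        return max(range(len(counts)), key=index)

    def update(self, ctx, arm, reward):
        self.rounds[ctx] += 1
        self.counts[ctx][arm] += 1
        n = self.counts[ctx][arm]
        # Incremental update of the sample mean.
        self.means[ctx][arm] += (reward - self.means[ctx][arm]) / n


def simulate(horizon=5000, seed=0):
    """Run the learner on a toy channel and return (cumulative regret, learner)."""
    rng = random.Random(seed)
    # Hypothetical success probabilities, unknown to the learner:
    # rows are BER-constraint contexts, columns are modulations.
    success = [[0.9, 0.6, 0.2],
               [0.5, 0.8, 0.3]]
    learner = ContextualUCB(n_contexts=len(success), n_arms=len(success[0]))
    regret = 0.0
    for _ in range(horizon):
        ctx = rng.randrange(len(success))  # context revealed each round
        arm = learner.select(ctx)
        reward = 1.0 if rng.random() < success[ctx][arm] else 0.0
        learner.update(ctx, arm, reward)
        # Loss relative to an oracle that knows the success probabilities.
        regret += max(success[ctx]) - success[ctx][arm]
    return regret, learner


if __name__ == "__main__":
    for horizon in (500, 5000, 50000):
        total, _ = simulate(horizon=horizon)
        print(horizon, round(total, 1), round(total / horizon, 4))
```

Running the script prints the cumulative regret and the per-round average for several horizons. Under these assumptions the average should shrink as the horizon grows, which is what sublinear cumulative loss means in practice.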