Contextual combinatorial volatile multi-armed bandits in compact context spaces

buir.advisor: Tekin, Cem
dc.contributor.author: Nika, Andi
dc.date.accessioned: 2021-08-17T06:36:25Z
dc.date.available: 2021-08-17T06:36:25Z
dc.date.copyright: 2021-07
dc.date.issued: 2021-07
dc.date.submitted: 2021-08-06
dc.description: Cataloged from PDF version of article. [en_US]
dc.description: Thesis (Master's): Bilkent University, Department of Electrical and Electronics Engineering, İhsan Doğramacı Bilkent University, 2021. [en_US]
dc.description: Includes bibliographical references (leaves 78-83). [en_US]
dc.description.abstract: We consider the contextual combinatorial volatile multi-armed bandit (CCV-MAB) problem in compact context spaces, taking all of its individual features into account simultaneously and thus providing a general framework for solving a wide range of practical problems. We solve CCV-MAB using two approaches. First, we use the so-called adaptive discretization technique, which sequentially partitions the context space $\mathcal{X}$ into 'regions of similarity' and stores similar statistics corresponding to such regions. Under monotonicity of the expected reward and mild continuity assumptions on both the expected reward and the expected base arm outcomes, we propose Adaptive Contextual Combinatorial Upper Confidence Bound (ACC-UCB), an online learning algorithm that uses adaptive discretization and incurs $\tilde{O}(T^{(\bar{D}+1)/(\bar{D}+2)+\epsilon})$ regret for any $\epsilon > 0$, where $\bar{D}$ represents the approximate optimality dimension related to $\mathcal{X}$. This dimension captures both the benignness of the base arm arrivals and the structure of the expected reward. Second, we impose a Gaussian process (GP) structure on the expected base arm outcomes and thus, using the smoothness of the GP posterior, eliminate the need for adaptive discretization. We propose Optimistic Combinatorial Learning and Optimization with Kernel Upper Confidence Bounds (O'CLOK-UCB), which incurs $\tilde{O}(K\sqrt{T\bar{\gamma}_T})$ regret, where $\bar{\gamma}_T$ is the maximum information gain associated with the set of base arm contexts that appeared in the first $T$ rounds and $K$ is the maximum cardinality of any feasible super arm over all rounds. For both methods, we provide experimental results that show the superiority of ACC-UCB over the previous state-of-the-art and of O'CLOK-UCB over ACC-UCB. [en_US]
dc.description.provenance: Submitted by Betül Özen (ozen@bilkent.edu.tr) on 2021-08-17T06:36:25Z No. of bitstreams: 1 10411062.pdf: 1292773 bytes, checksum: a22463a4cb44d8ebcb2aaa24abf3a140 (MD5) [en]
dc.description.provenance: Made available in DSpace on 2021-08-17T06:36:25Z (GMT). No. of bitstreams: 1 10411062.pdf: 1292773 bytes, checksum: a22463a4cb44d8ebcb2aaa24abf3a140 (MD5) Previous issue date: 2021-07 [en]
dc.description.statementofresponsibility: by Andi Nika [en_US]
dc.embargo.release: 2021-12-01
dc.format.extent: viii, 83 leaves : illustrations (some color), charts (some color) ; 30 cm. [en_US]
dc.identifier.itemid: B130105
dc.identifier.uri: http://hdl.handle.net/11693/76440
dc.language.iso: English [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: Multi-armed bandit [en_US]
dc.subject: Contextual combinatorial bandit [en_US]
dc.subject: Volatile bandit [en_US]
dc.subject: Adaptive discretization [en_US]
dc.subject: Gaussian processes [en_US]
dc.title: Contextual combinatorial volatile multi-armed bandits in compact context spaces [en_US]
dc.title.alternative: Tıkız bağlam uzaylarında bağlamsal birleşimsel değişken çok-kollu haydut [en_US]
dc.type: Thesis [en_US]
thesis.degree.discipline: Electrical and Electronic Engineering
thesis.degree.grantor: Bilkent University
thesis.degree.level: Master's
thesis.degree.name: MS (Master of Science)

Files

Original bundle

Name: 10411062.pdf
Size: 1.23 MB
Format: Adobe Portable Document Format
Description: Full printable version

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission
Description: