Reward-rate maximization in sequential identification under a stochastic deadline

dc.citation.epage: 2948 [en_US]
dc.citation.issueNumber: 4 [en_US]
dc.citation.spage: 2922 [en_US]
dc.citation.volumeNumber: 51 [en_US]
dc.contributor.author: Dayanık, S. [en_US]
dc.contributor.author: Yu, A. J. [en_US]
dc.date.accessioned: 2016-02-08T11:03:43Z
dc.date.available: 2016-02-08T11:03:43Z
dc.date.issued: 2013 [en_US]
dc.department: Department of Industrial Engineering [en_US]
dc.department: Department of Mathematics [en_US]
dc.description.abstract: Any intelligent system performing evidence-based decision making under time pressure must negotiate a speed-accuracy trade-off. In computer science and engineering, this is typically modeled as minimizing a Bayes-risk functional that is a linear combination of expected decision delay and expected terminal decision loss. In neuroscience and psychology, however, it is often modeled as maximizing the long-term reward rate, or the ratio of expected terminal reward and expected decision delay. The two approaches have opposing advantages and disadvantages. While Bayes-risk minimization, unlike reward-rate maximization, can be solved with powerful dynamic programming techniques, it also requires the explicit specification of the relative costs of decision delay and error, which is obviated by reward-rate maximization. Here, we demonstrate that, for a large class of sequential multihypothesis identification problems under a stochastic deadline, reward-rate maximization is equivalent to a special case of Bayes-risk minimization, in which the optimal policy that attains the minimal risk when the unit sampling cost is exactly the maximal reward rate is also the policy that attains the maximal reward rate. We show that the maximum reward rate is the unique unit sampling cost for which the expected total observation cost and expected terminal reward break even under every Bayes-risk optimal decision rule. This interplay between the reward-rate maximization and Bayes-risk minimization formulations allows us to show that the maximum reward rate is always attained. We can compute the policy that maximizes reward rate by solving an inverse Bayes-risk minimization problem, whereby we know the Bayes risk of the optimal policy and need to find the associated unit sampling cost parameter. Leveraging this equivalence, we derive an iterative dynamic programming procedure for solving the reward-rate maximization problem exponentially fast, thus incorporating the advantages of both the reward-rate maximization and Bayes-risk minimization formulations. As an illustration, we will apply the procedure to a two-hypothesis identification example. [en_US]
dc.description.provenance: Made available in DSpace on 2016-02-08T11:03:43Z (GMT). No. of bitstreams: 1 bilkent-research-paper.pdf: 70227 bytes, checksum: 26e812c6f5156f83f0e77b261a471b5a (MD5) Previous issue date: 2013 [en]
dc.identifier.doi: 10.1137/100818005 [en_US]
dc.identifier.eissn: 1095-7138
dc.identifier.issn: 0363-0129
dc.identifier.uri: http://hdl.handle.net/11693/26706
dc.language.iso: English [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1137/100818005 [en_US]
dc.source.title: SIAM Journal on Control and Optimization [en_US]
dc.subject: Bayes-risk minimization [en_US]
dc.subject: Dynamic programming [en_US]
dc.subject: Reward-rate maximization [en_US]
dc.subject: Sequential multihypothesis testing [en_US]
dc.subject: Speed-accuracy trade off [en_US]
dc.subject: Computer science and engineerings [en_US]
dc.subject: Dynamic programming techniques [en_US]
dc.subject: Identification problem [en_US]
dc.subject: Iterative Dynamic Programming [en_US]
dc.subject: Multi-hypothesis testing [en_US]
dc.subject: Optimal decision-rule [en_US]
dc.subject: Trade off [en_US]
dc.subject: Costs [en_US]
dc.subject: Economic and social effects [en_US]
dc.subject: Equivalence classes [en_US]
dc.subject: Intelligent systems [en_US]
dc.subject: Inverse problems [en_US]
dc.subject: Iterative methods [en_US]
dc.subject: Optimization [en_US]
dc.subject: Stochastic systems [en_US]
dc.subject: Decision making [en_US]
dc.title: Reward-rate maximization in sequential identification under a stochastic deadline [en_US]
dc.type: Article [en_US]
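
Illustrative sketch (not part of the record and not the authors' code): the abstract describes finding the unit sampling cost at which expected observation cost and expected terminal reward break even under a Bayes-risk optimal rule. The Python below shows one way such an iteration could look for a toy two-hypothesis problem. Everything specific here is an assumption made for illustration: Bernoulli observations with success probabilities p0 and p1, a geometric deadline with per-step hazard, zero reward if the deadline arrives first, unit reward for a correct guess, a hard cap on samples (horizon), and a fixed inter-trial time t0 added to the delay so that the rate is well defined. The function names solve_bayes_risk and max_reward_rate are hypothetical.

```python
def posterior(n, k, p0, p1, prior):
    """Posterior probability of hypothesis 1 after k successes in n samples."""
    like1 = prior * p1**k * (1 - p1) ** (n - k)
    like0 = (1 - prior) * p0**k * (1 - p0) ** (n - k)
    return like1 / (like1 + like0)


def solve_bayes_risk(c, p0=0.3, p1=0.7, prior=0.5, hazard=0.02, horizon=50):
    """Backward induction for a given unit sampling cost c (toy model).

    Returns (expected terminal reward, expected number of samples) of a rule
    that maximizes expected reward minus c times the number of samples.
    """
    # At the sample cap the rule must stop and guess the likelier hypothesis.
    R_next = []
    for k in range(horizon + 1):
        pi = posterior(horizon, k, p0, p1, prior)
        R_next.append(max(pi, 1 - pi))
    T_next = [0.0] * (horizon + 1)

    for n in range(horizon - 1, -1, -1):
        R_cur, T_cur = [], []
        for k in range(n + 1):
            pi = posterior(n, k, p0, p1, prior)
            stop = max(pi, 1 - pi)          # reward for guessing now
            q = pi * p1 + (1 - pi) * p0     # predictive P(next sample = 1)
            # Sampling takes one time step; the deadline then strikes with
            # probability `hazard`, ending the trial with zero reward.
            cont_R = (1 - hazard) * (q * R_next[k + 1] + (1 - q) * R_next[k])
            cont_T = 1.0 + (1 - hazard) * (q * T_next[k + 1] + (1 - q) * T_next[k])
            if cont_R - c * cont_T > stop:
                R_cur.append(cont_R)
                T_cur.append(cont_T)
            else:
                R_cur.append(stop)
                T_cur.append(0.0)
        R_next, T_next = R_cur, T_cur
    return R_next[0], T_next[0]


def max_reward_rate(t0=5.0, tol=1e-12, max_iter=100):
    """Iterate c <- E[reward] / (E[samples] + t0) until c stops changing."""
    c, history = 0.0, []
    for _ in range(max_iter):
        R, T = solve_bayes_risk(c)
        c_new = R / (T + t0)
        history.append(c_new)
        if abs(c_new - c) < tol:
            return c_new, history
        c = c_new
    return c, history


if __name__ == "__main__":
    rate, history = max_reward_rate()
    R, T = solve_bayes_risk(rate)
    print("estimated maximal reward rate:", rate)
    print("break-even gap:", R - rate * (T + 5.0))
```

In this sketch each outer step solves one Bayes-risk problem by dynamic programming and then resets the sampling cost to the reward rate of the resulting rule. This is a standard fractional-programming update, used here as a stand-in; the article itself should be consulted for the actual procedure and its convergence guarantees.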

Files

Original bundle
Name: Reward-rate_maximization_in_sequential_identification_under_a_stochastic_deadline.pdf
Size: 2.82 MB
Format: Adobe Portable Document Format
Description: Full printable version