Browsing by Subject "Stochastic control"
Now showing 1 - 2 of 2
Item (Open Access): Heterogeneity and strategic sophistication in multi-agent reinforcement learning (2024-08)
Arslantaş, Yüksel

Abstract: Decision-making powered by artificial intelligence (AI) is becoming increasingly prevalent in socio-technical systems such as finance, smart transportation, security, and robotics. There is therefore a critical need to develop the theoretical foundations of how multiple AI decision-makers interact with each other and with humans, in order to ensure their reliable use in these systems. Since multiple AI decision-makers act autonomously without central coordination, heterogeneity of their algorithms is inevitable. We establish a theoretical framework for the impact of heterogeneity on multi-agent sequential decision-making under uncertainty. First, we examine the potential heterogeneity of independent learning algorithms under the assumption that opponents play according to some stationary strategy. To this end, we present a broad family of algorithms that encompasses widely studied dynamics such as fictitious play and Q-learning. While existing convergence results consider only homogeneous cases, where every agent uses the same algorithm, we show that play can still converge to equilibrium when agents follow any two different members of this family. This strengthens the predictive power of game-theoretic equilibrium analysis for heterogeneous systems. We then analyze how a strategically sophisticated agent can manipulate independent learning algorithms, revealing the vulnerability of such independent reinforcement learning schemes. Finally, we demonstrate the practical implications of our findings by applying our results to stochastic security games, highlighting their potential for real-life applications, and we explore the impact of strategic AI on human-AI interactions in cyber-physical systems. (A minimal code sketch of heterogeneous learning dynamics appears after this listing.)

Item (Open Access): Q-Learning for MDPs with general spaces: convergence and near optimality via quantization under weak continuity (Journal of Machine Learning Research, 2023-07-12)
Kara, A. D.; Saldı, Naci; Yüksel, S.

Abstract: Reinforcement learning algorithms often require finiteness of the state and action spaces of Markov decision processes (MDPs, also called controlled Markov chains), and various efforts have been made in the literature to extend such algorithms to continuous state and action spaces. In this paper, we show that, under very mild regularity conditions (in particular, involving only weak continuity of the transition kernel of an MDP), Q-learning for standard Borel MDPs via quantization of states and actions (called Quantized Q-Learning) converges to a limit, and furthermore this limit satisfies an optimality equation that leads to near optimality, either with explicit performance bounds or with guarantees of asymptotic optimality. Our approach builds on (i) viewing quantization as a measurement kernel, and thus a quantized MDP as a partially observed Markov decision process (POMDP); (ii) utilizing near-optimality and convergence results of Q-learning for POMDPs; and (iii) near-optimality of finite-state model approximations for MDPs with weakly continuous kernels, which we show to correspond to the fixed point of the constructed POMDP. Thus, our paper presents a very general convergence and approximation result for the applicability of Q-learning to continuous MDPs.
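As a rough illustration of the heterogeneous-dynamics setting in the first item above, the sketch below pairs smoothed fictitious play with independent Q-learning in a repeated 2x2 identical-interest game. The game, step sizes, and softmax temperature are illustrative assumptions only and do not reproduce the thesis's formal algorithm family or its convergence conditions.

```python
import numpy as np

# Illustrative 2x2 identical-interest game (payoffs are assumptions, not from the thesis).
# Row player: smoothed fictitious play. Column player: Q-learning with softmax exploration.
# Each agent learns independently from realized play, without coordination.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])

rng = np.random.default_rng(0)

def softmax(x, tau=0.1):
    z = (x - x.max()) / tau
    e = np.exp(z)
    return e / e.sum()

belief = np.ones(2) / 2   # row player's empirical belief about column player's actions
q = np.zeros(2)           # column player's Q-values for its two actions

for t in range(1, 20001):
    # Row player: smoothed best response to its empirical belief.
    row_policy = softmax(A @ belief)
    a_row = rng.choice(2, p=row_policy)

    # Column player: softmax policy over its Q-values.
    col_policy = softmax(q)
    a_col = rng.choice(2, p=col_policy)

    # Fictitious-play update: running empirical frequency of the opponent's actions.
    belief += (np.eye(2)[a_col] - belief) / t

    # Stateless Q-learning update from the realized payoff.
    reward = A[a_row, a_col]
    q[a_col] += 0.05 * (reward - q[a_col])

print("row policy:", np.round(softmax(A @ belief), 3))
print("col policy:", np.round(softmax(q), 3))
```

Despite using two different learning rules, both agents here adapt only to observed play, which is the kind of heterogeneous pairing the thesis analyzes in a far more general family of dynamics.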
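As a rough illustration of the quantized Q-learning idea in the second item, the sketch below discretizes a scalar continuous state space and action set and runs ordinary tabular Q-learning on the quantized model. The toy dynamics, quadratic cost, bin counts, and step-size rule are assumptions for illustration only; they do not follow the paper's measurement-kernel/POMDP construction or its optimality analysis.

```python
import numpy as np

# Minimal sketch: quantized Q-learning on a toy scalar MDP (cost minimization).
rng = np.random.default_rng(1)

n_s, n_a = 20, 5                           # number of state bins / action grid points
s_bins = np.linspace(-1.0, 1.0, n_s + 1)   # state quantizer cells over [-1, 1]
a_grid = np.linspace(-1.0, 1.0, n_a)       # finite action grid

def quantize_state(x):
    """Map a continuous state to the index of its quantizer cell."""
    return int(np.clip(np.digitize(x, s_bins) - 1, 0, n_s - 1))

def step(x, u):
    """Toy weakly-continuous dynamics with additive noise and a quadratic stage cost."""
    x_next = np.clip(0.9 * x + 0.5 * u + 0.1 * rng.normal(), -1.0, 1.0)
    return x_next, x**2 + 0.1 * u**2

Q = np.zeros((n_s, n_a))
counts = np.zeros((n_s, n_a))
gamma, eps = 0.95, 0.1
x = 0.0
for _ in range(100_000):
    s = quantize_state(x)
    a = rng.integers(n_a) if rng.random() < eps else int(Q[s].argmin())
    x_next, cost = step(x, a_grid[a])
    s_next = quantize_state(x_next)
    # Tabular Q-learning on the quantized model with a 1/n step size per (state, action).
    counts[s, a] += 1
    alpha = 1.0 / counts[s, a]
    Q[s, a] += alpha * (cost + gamma * Q[s_next].min() - Q[s, a])
    x = x_next

s0 = quantize_state(0.5)
print("greedy action at x ≈ 0.5:", a_grid[int(Q[s0].argmin())])
```

The sketch only demonstrates the mechanics of learning on a quantized model; the paper's contribution is showing that, under weak continuity of the transition kernel, such quantized Q-learning converges and the resulting policies are near optimal for the original Borel MDP.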