Title: Strategizing against Q-learners: a control-theoretical approach
Authors: Arslantaş, Yüksel; Yüceel, Ege; Sayın, Muhammed O.
Type: Article
Date issued: 2024-06-18
Date accessioned: 2025-02-18
Date available: 2025-02-18
URI: https://hdl.handle.net/11693/116383
DOI: 10.1109/LCSYS.2024.3416240
ISSN: 2475-1456
Language: English
Keywords: Reinforcement learning; Game theory; Markov processes
License: CC BY 4.0 DEED (Attribution 4.0 International), https://creativecommons.org/licenses/by/4.0/deed.tr

Abstract: In this letter, we explore the susceptibility of independent Q-learning algorithms (a classical and widely used multi-agent reinforcement learning method) to strategic manipulation by sophisticated opponents in repeatedly played normal-form games. We quantify how much strategically sophisticated agents can exploit naive Q-learners when they know the opponents' Q-learning algorithm. To this end, we formulate the strategic actors' interactions as a stochastic game, whose state encompasses the Q-function estimates of the Q-learners, by treating the Q-learning algorithms as the underlying dynamical system. We also present a quantization-based approximation scheme to tackle the continuum state space, and we analyze its performance, both analytically and numerically, for a single strategic actor and for two competing strategic actors.
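A minimal sketch of the dynamical-system view the abstract describes: a naive Q-learner in a repeated 2x2 normal-form game whose Q-estimates act as the state, with transitions driven by the strategic opponent's actions. The payoff matrix A, step size alpha, softmax temperature tau, and the random placeholder opponent are illustrative assumptions, not the letter's exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative learner payoffs (rows: learner's action, cols: opponent's action).
A = np.array([[3.0, 0.0], [5.0, 1.0]])

alpha = 0.1  # learner's step size (assumed constant)
tau = 0.5    # temperature of the learner's smoothed (softmax) response

Q = np.zeros(2)  # the learner's Q-estimates over its own actions: the "state"

def learner_policy(Q):
    """Softmax (smoothed best response) over the learner's Q-estimates."""
    z = np.exp(Q / tau)
    return z / z.sum()

def q_update(Q, a_learner, a_opponent):
    """Stateless (normal-form) Q-update: move Q[a] toward the realized payoff."""
    r = A[a_learner, a_opponent]
    Q = Q.copy()
    Q[a_learner] += alpha * (r - Q[a_learner])
    return Q

def step(Q, a_opponent):
    """One step of the induced dynamical system: given the opponent's action,
    the learner's Q-vector transitions stochastically via its sampled action."""
    a_learner = rng.choice(2, p=learner_policy(Q))
    return q_update(Q, a_learner, a_opponent), a_learner

for t in range(1000):
    Q, _ = step(Q, a_opponent=rng.integers(2))  # placeholder opponent policy
print("learner's Q-estimates after 1000 steps:", Q)
```

From the strategic actor's perspective, choosing a_opponent as a function of Q is exactly a control problem on this system, which motivates the stochastic-game formulation in the letter.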
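The quantization idea can be illustrated in the same spirit for a single strategic actor: discretize the learner's continuum Q-vector onto a finite grid and run value iteration on the resulting finite MDP. The grid bounds, resolution, payoff matrix B, and discount factor gamma below are illustrative assumptions, not the letter's exact construction.

```python
import itertools
import numpy as np

A = np.array([[3.0, 0.0], [5.0, 1.0]])  # learner's payoffs (illustrative)
B = np.array([[3.0, 5.0], [0.0, 1.0]])  # strategic actor's payoffs (illustrative)
alpha, tau, gamma = 0.1, 0.5, 0.95
q_min, q_max, n_bins = 0.0, 5.0, 9
grid = np.linspace(q_min, q_max, n_bins)

def softmax(Q):
    z = np.exp(Q / tau)
    return z / z.sum()

def nearest(q):
    """Quantize one Q-component to the index of the nearest grid point."""
    return int(np.clip(round((q - q_min) / (q_max - q_min) * (n_bins - 1)), 0, n_bins - 1))

# Finite state space: quantized (Q[0], Q[1]) pairs.
states = list(itertools.product(range(n_bins), repeat=2))
V = {s: 0.0 for s in states}

for _ in range(200):  # value iteration on the quantized MDP
    V_new = {}
    for s in states:
        Q = np.array([grid[s[0]], grid[s[1]]])
        p = softmax(Q)  # learner's smoothed response at this state
        vals = []
        for b in range(2):      # strategic actor's action
            v = 0.0
            for a in range(2):  # learner's sampled action, with probability p[a]
                Q2 = Q.copy()
                Q2[a] += alpha * (A[a, b] - Q2[a])     # learner's Q-update
                s2 = (nearest(Q2[0]), nearest(Q2[1]))  # re-quantize the next state
                v += p[a] * (B[a, b] + gamma * V[s2])
            vals.append(v)
        V_new[s] = max(vals)
    V = V_new

print("value at the all-zeros initial state:", V[(nearest(0.0), nearest(0.0))])
```

The re-quantization after each Q-update is where the approximation error enters; refining the grid trades computation for a closer match to the continuum-state stochastic game, which is the trade-off the letter's performance analysis addresses.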