Fictitious play in zero-sum stochastic games

Date

2022

Editor(s)

Advisor

Supervisor

Co-Advisor

Co-Supervisor

Instructor

BUIR Usage Stats
1
views
38
downloads

Citation Stats

Series

Abstract

We present a novel variant of fictitious play dynamics combining classical fictitiousplay with Q-learning for stochastic games and analyze its convergence properties in two-player zero-sum stochastic games. Our dynamics involves players forming beliefs on the opponent strategyand their own continuation payoff (Q-function), and playing a greedy best response by using theestimated continuation payoffs. Players update their beliefs from observations of opponent actions.A key property of the learning dynamics is that update of the beliefs onQ-functions occurs at aslower timescale than update of the beliefs on strategies. We show that in both the model-based andmodel-free cases (without knowledge of player payoff functions and state transition probabilities),the beliefs on strategies converge to a stationary mixed Nash equilibrium of the zero-sum stochasticgame.

Source Title

SIAM Journal on Control and Optimization

Publisher

Society for Industrial and Applied Mathematics

Course

Other identifiers

Book Title

Degree Discipline

Degree Level

Degree Name

Citation

Published Version (Please cite this version)

Language

English