LNMB

Conference 2023

Rayadurgam Srikant:
Part I: Introduction to Average-Cost MDPs
Part II: Approximate Policy Iteration in Average-Cost MDPs

Abstract: In the first part, we will present an introduction to discounted and average-cost MDPs. Discounted-cost MDPs are more commonly studied in the reinforcement learning and approximate dynamic programming literature. We will present some reasons why average-cost MDPs are harder to study in this context. In the second part, we will present some recent results on approximate dynamic programming and reinforcement learning for average-cost MDPs. Specifically, we will present some new results on approximate policy iteration and soft policy iteration.

This is joint work with Yashaswini Murthy and Mehrdad Moharrami.