Landelijk Netwerk Mathematische Besliskunde

Course MDP: Markov Decision Processes

Time: Monday 13.15 - 15.00 (September 12 - November 14)
Location: From September 1, 2022, all LNMB courses can again be attended on the Campus Utrecht Science Park. Details about lecture rooms, as well as online facilities for students and lecturers, will follow upon registration.
Lecturers: Dr. O. Kanavetas (UL), Dr. F.M. Spieksma (UL)

Course description:
(for participants of this course: see the lecturers' website)
The theory of Markov decision processes (MDPs), also known as sequential decision theory, stochastic control, or stochastic dynamic programming, studies the sequential optimization of stochastic systems by controlling their transition mechanism over time. Each control policy defines a stochastic process and the values of the objective functions associated with this process. The goal is to select a control policy that optimizes a function of these objective values.
In real life, decisions usually have two types of impact. First, they cost or save resources, such as money or time. Second, by influencing the dynamics of the system, they affect the future as well. As a result, the decision with the largest immediate profit is often not the best one once future rewards are taken into account. MDPs capture this trade-off and can be used to model many important practical applications. In this course we present results on the structure and existence of good policies and on methods for computing optimal policies, and we illustrate them with applications.
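The gap between immediate and long-run reward can be made concrete with a small sketch. The two-state model below is entirely hypothetical: the states, the action names, the rewards, and the discount factor 0.9 are assumptions chosen for illustration and are not taken from the course material. It applies value iteration, one standard solution method for discounted MDPs, and compares the resulting policy with a myopic one that looks only at the immediate reward.

```python
# Hypothetical two-state MDP; every number here is invented for illustration.
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    0: {
        "cash_in": [(1.0, 0, 1.0)],  # reward 1 now, remain in state 0
        "invest": [(1.0, 1, 0.0)],   # no reward now, move to state 1
    },
    1: {
        "harvest": [(1.0, 1, 2.0)],  # reward 2 in every later period
    },
}
GAMMA = 0.9  # assumed discount factor


def q_value(state, action, v):
    """Expected immediate reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * v[nxt]) for p, nxt, r in P[state][action])


def value_iteration(tol=1e-10):
    """Repeat the Bellman update until successive iterates differ by < tol."""
    v = {s: 0.0 for s in P}
    while True:
        new_v = {s: max(q_value(s, a, v) for a in P[s]) for s in P}
        if max(abs(new_v[s] - v[s]) for s in P) < tol:
            return new_v
        v = new_v


def greedy_policy(v):
    """Pick, in each state, the action that maximizes the q-value under v."""
    return {s: max(P[s], key=lambda a: q_value(s, a, v)) for s in P}


def myopic_policy():
    """Pick, in each state, the action with the largest immediate reward."""
    return {
        s: max(P[s], key=lambda a: sum(p * r for p, _, r in P[s][a]))
        for s in P
    }


values = value_iteration()
optimal = greedy_policy(values)
myopic = myopic_policy()
```

In this toy model the myopic rule chooses `cash_in` in state 0, which is worth 1 / (1 - 0.9) = 10 in discounted terms, whereas value iteration selects `invest`, worth 0.9 * 20 = 18, because state 1 pays 2 per period forever.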

Contents of the lectures:
1. Model formulation, policies, optimality criteria, the finite horizon.
2. Average rewards: optimality equation and solution methods.
3. Discounted rewards: optimality equation and solution methods.
4. Structural properties.
5. Applications of MDPs.
6. Further topics in MDPs.

Literature:
Lecture notes will be provided.

Prerequisites:
- Elementary knowledge of linear programming (e.g. K.G. Murty, Linear programming, Wiley, 1983).
- Elementary knowledge of probability theory (e.g. S.M. Ross, A first course in probability, Macmillan, New York, 1976).
- Elementary knowledge of (numerical) analysis (e.g. Banach space; contraction mappings; Newton’s method; Laurent series).

Examination:
Take-home problems.

Address of the lecturers:
Dr. O. Kanavetas
Mathematical Institute, Leiden University
P.O. Box 9512, 2300 RA Leiden
Phone: 071 - 5277126
E-mail: o.kanavetas@math.leidenuniv.nl

Dr. F.M. Spieksma
Mathematical Institute, Leiden University
P.O. Box 9512, 2300 RA Leiden
Phone: 071 - 5277128
E-mail: spieksma@math.leidenuniv.nl