Course ATS: Advanced Topics in Stochastic Operations Research

Time:	Monday 15:15 – 17:00
Period:	17 November 2025 – 15 December 2025 and 19 January 2026 – 16 February 2026
Location:	All LNMB courses take place on the Campus Utrecht Science Park. Room HFG 611, Hans-Freudenthal building, Budapestlaan 6, 3584 CD Utrecht
Lecturers:	Dr. A.V. den Boer (UvA) and Dr. O. Kanavetas (UL)

Course description

In the first part of this course we will study data-driven decision problems: optimization problems for which the relation between decision and outcome is unknown up front, and thus has to be learned on the fly from accumulating data. This type of problem has an intrinsic tension between statistical goals and optimization goals: learning how the system behaves (the statistical goal) is accelerated by experimenting with different actions, while for taking good decisions (the optimization goal), one would like to limit experimentation and instead use estimated optimal decisions. We will study this 'exploration-exploitation' trade-off for so-called 'multi-armed bandit problems', the paradigmatic framework for dynamic optimization problems with incomplete information.
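
To make the trade-off concrete, the following is a minimal sketch of one simple decision rule, epsilon-greedy, on a two-armed Bernoulli bandit. It is an illustration only and is not taken from the course material; the function name epsilon_greedy, the success probabilities, the horizon, and the value of epsilon are all assumptions chosen for the example.

```python
import random


def epsilon_greedy(success_probs, horizon, epsilon, seed=0):
    """Play a Bernoulli bandit for `horizon` rounds.

    Returns the number of pulls per arm and the total reward collected.
    All names and parameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    pulls = [0] * n_arms      # number of times each arm was played
    means = [0.0] * n_arms    # running average reward per arm
    total_reward = 0

    for _ in range(horizon):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated mean
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: means[a])
        # The true success probability is hidden from the decision rule;
        # it is only used here to simulate the observed reward.
        reward = 1 if rng.random() < success_probs[arm] else 0
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]
        total_reward += reward

    return pulls, total_reward


pulls, total_reward = epsilon_greedy([0.2, 0.8], horizon=5000, epsilon=0.1)
print("pulls per arm:", pulls)
print("average reward:", total_reward / 5000)
```

With epsilon set to 0 the rule never experiments: in this sketch it keeps playing the first arm forever, because the estimate for the second arm is never updated. A small positive epsilon trades some reward for the data needed to find the better arm.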

In the second part of this course we will study reinforcement learning, which has become one of the most active research areas within machine learning, artificial intelligence, and neural network research. Research in reinforcement learning has two core objectives: designing effective learning algorithms, and understanding the capabilities and limitations of these algorithms. In this part of the course we aim to give a concise and accessible presentation of the key concepts and algorithms in reinforcement learning. We will explore a range of learning problems, clarify the foundational principles, present several state-of-the-art algorithms, and discuss their properties and limitations in detail.

Literature

First part: Bandit Algorithms by Tor Lattimore and Csaba Szepesvari (available online)
Second part: Algorithms for Reinforcement Learning by Csaba Szepesvari (available online)

Prerequisites

Probability theory and statistics, and some coding skills (Python/Matlab).

Examination

To be determined.

Addresses of the lecturers

Dr. A.V. den Boer
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Universiteit van Amsterdam
Postbus 94248, 1090 GE Amsterdam
Phone: 020-5252497
E-mail: A.V.denBoer@uva.nl

Dr. O. Kanavetas
Mathematical Institute, Leiden University
P.O. Box 9512, 2300 RA Leiden
Phone: 071-5277126
E-mail: o.kanavetas@math.leidenuniv.nl