Course ATS: Advanced Topics in Stochastic Operations Research: Multi-armed bandit theory and applications

Time: Monday 15.15 – 17.00 (November 18 – December 16 and January 20 – February 17).
Location: Hans Freudenthalgebouw, Room 611AB, Budapestlaan, Utrecht (De Uithof).
Lecturer: Dr. A.V. den Boer (UvA)

Course description:
In this course we will study data-driven decision problems: optimization problems for which the relation between decision and outcome is unknown upfront and thus has to be learned on the fly from accumulating data. Problems of this type have an intrinsic tension between statistical goals and optimization goals: learning how the system behaves (the statistical goal) is accelerated by experimenting with different actions, whereas taking good decisions (the optimization goal) calls for limiting experimentation and using the estimated optimal decisions instead. We will study this "exploration-exploitation" trade-off for so-called "multi-armed bandit problems", the paradigmatic framework for dynamic optimization problems with incomplete information. We will discuss the standard building blocks of the theory and focus on applications in operations research, such as dynamic pricing and assortment optimization.
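To give a flavour of the exploration-exploitation trade-off, the following is a minimal sketch of an epsilon-greedy strategy on a Bernoulli bandit. It is an illustration only and not part of the official course material: the function name `epsilon_greedy`, its parameters, and the choice of epsilon-greedy itself are assumptions made for this example.

```python
import random


def epsilon_greedy(means, horizon, epsilon, seed=0):
    """Illustrative epsilon-greedy on a Bernoulli bandit (hypothetical helper).

    means   -- true success probability of each arm (unknown to the learner)
    horizon -- total number of rounds
    epsilon -- probability of exploring a uniformly random arm in a round
    Returns the number of pulls per arm and the pseudo-regret.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k      # how often each arm has been played
    totals = [0.0] * k   # sum of observed rewards per arm

    for t in range(horizon):
        if t < k:
            arm = t  # initial phase: play every arm once
        elif rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest empirical mean reward
            arm = max(range(k), key=lambda a: totals[a] / pulls[a])

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

    # Pseudo-regret: expected reward lost compared with always playing the best arm.
    best = max(means)
    regret = sum(n * (best - m) for n, m in zip(pulls, means))
    return pulls, regret


if __name__ == "__main__":
    pulls, regret = epsilon_greedy([0.3, 0.5, 0.7], horizon=10_000, epsilon=0.1)
    print("pulls per arm:", pulls)
    print("pseudo-regret:", regret)
```

With epsilon set to 0 the strategy never explores after the initial phase and may lock onto a suboptimal arm; with a large epsilon it learns the arms well but keeps paying for experimentation. Algorithms that balance the two more carefully are the subject of the course.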

Literature:
Will be provided during the course, including Bandit Algorithms by Tor Lattimore and Csaba Szepesvari (available online).

Prerequisites:
Probability theory and statistics, and some coding skills (Python/Matlab).

Examination:
Take-home problems.

Address of the lecturer:
Dr. A.V. den Boer
Faculteit der Natuurwetenschappen, Wiskunde en Informatica
Universiteit van Amsterdam
Postbus 94248, 1090 GE Amsterdam
Phone: 020-5252497 E-mail: