SDM Speakers

Sequential Decision Making – Information

Anne Zander

Personal website

Dr. Anne Zander is an Assistant Professor in the Stochastic Operations Research group at the Department of Applied Mathematics at the University of Twente and a member of CHOIR (Center for Healthcare Operations Improvement & Research). Her research focuses on Sequential Decision-Making, applying methods such as Stochastic Programming and Reinforcement Learning to healthcare logistics challenges. She is involved in several national (ZonMw) and international (European Horizon, Interreg) research projects related to capacity allocation and patient steering, e.g., in a cross-border context or during infectious disease outbreaks. In addition, she co-initiated a Strategic Research Initiative within 4TU.AMI (the joint initiative of the mathematics departments of the four technical universities in the Netherlands) to set up a Dutch mathematical community for Sequential Decision-Making. In 2021, Dr. Zander earned her PhD from the Karlsruhe Institute of Technology, Germany, where she also completed her studies in Mathematics.

Stella Kapodistria

Personal website

Stella Kapodistria is an Associate Professor at Eindhoven University of Technology (TU/e), leading the Data-Driven Stochastic Operations Research group. She specializes in the design of algorithms for large-scale, data-driven systems, with a focus on reinforcement learning and Markov Decision Processes. Her research spans both fundamental and applied domains, aiming to improve system and network performance through real-time decision-making. She has collaborated closely with industrial partners (ASML, NS, Philips, Fokker, among others), which has shaped her vision for impactful and application-oriented research. Through her interdisciplinary, collaborative approach, Stella connects academic innovation with societal and industrial relevance.

Stella has a strong record in interdisciplinary collaboration and research leadership. She is the scientific director of the 4TU Applied Mathematics Institute (4TU.AMI), and she was instrumental in establishing the 4TU Resilience Engineering center. She has co-led major national research consortia (NWA-ORC, NWO Big Data, TKI WoZ, among others). Her academic contributions are further reflected in her service to the community through editorial roles (e.g., MCAP, PEIS, ANOR) and technical program committees of flagship conferences such as Sigmetrics and Performance.

Mentorship is a central element of Stella's academic life. She has supervised numerous students across all levels, with her PhD mentees receiving prestigious awards such as the Willem R. van Zwet Award and the Beta PhD Award (C. Drent, 2022), and the World Class Maintenance MSc thesis award (P. Verlijsdonk, 2020; C. Suijkerbuijk, 2017).

Fenghui Yu

Personal website

Fenghui Yu is an Assistant Professor in Stochastic and Mathematical Finance at TU Delft, where she has been since 2022. Prior to that, she was a postdoctoral researcher at RiskLab and the Department of Mathematics at ETH Zürich. She obtained her PhD in Financial Mathematics from the University of Hong Kong. Her research interests lie broadly in stochastic modeling, optimal control, and risk management, with applications in finance. Specific areas of recent interest include algorithmic trading, market microstructure, and data-driven approaches, such as reinforcement learning, for sequential decision-making in financial systems.

Xiaodong Cheng

Personal website

I am an assistant professor in the Mathematical and Statistical Methods group (Biometris) at Wageningen University & Research (WUR). My main research interests cover various topics in control systems, optimization, and machine learning. I obtained my Ph.D. degree with honors (cum laude) from the University of Groningen, the Netherlands. Before joining WUR, I was a research associate in the Department of Engineering at the University of Cambridge from 2020 to 2022 and a postdoctoral researcher in the Department of Electrical Engineering at Eindhoven University of Technology from 2019 to 2020. I am the recipient of the Paper Prize Award from the IFAC Journal Automatica for the triennium 2017–2019 and the Outstanding Paper Award from IEEE Transactions on Control Systems Technology in 2020.

George van Voorn

Citations

My work involves the development and application of simulation models in the life sciences. My particular focus is on the interface between human actors as autonomous decision makers and their environment, mainly applied in the agri-food and ecological domains. This is relevant for multiple current societal challenges, such as the resilience of food systems against climate change or market fluctuations, or the ability to transform food systems. I work on different modelling methodologies, including agent-based modelling, dynamic equation modelling, and hybrid machine learning.

In the context of this SRI, I will be working on the explicit codification of rules on individual interactions and decision making in agent-based models, and how stochasticity and uncertainty at the individual level translate to stochasticity and uncertainty at higher organizational levels like agricultural communities and food supply chains.

Abstracts

Anne Zander

Lecture 1: The unified framework for sequential decisions

In this lecture, students will learn how real-world sequential decision problems can be modeled using the unified framework for sequential decisions (sketched below). Subsequently, we cover the four main methods, referred to as meta-policies, for solving sequential decision problems, and establish connections between the types of problems and the most suitable methods.

Background information
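
For orientation, the backbone of the framework can be written compactly. A minimal sketch in the usual notation (state $S_t$, decision $x_t = X^\pi(S_t)$ chosen by a policy, exogenous information $W_{t+1}$, transition model $S^M$, contribution $C$):

```latex
% Unified-framework objective (sketch): find the policy \pi that maximizes
% the expected cumulative contribution under the transition model S^M.
\max_{\pi}\; \mathbb{E}\left\{ \sum_{t=0}^{T} C\bigl(S_t, X^{\pi}(S_t)\bigr) \;\Big|\; S_0 \right\},
\qquad
S_{t+1} = S^{M}\bigl(S_t, X^{\pi}(S_t), W_{t+1}\bigr).
```

In Powell's taxonomy, the four meta-policy classes differ in how $X^\pi$ is constructed: policy function approximations (PFAs), cost function approximations (CFAs), value function approximations (VFAs), and direct lookahead approximations (DLAs).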

Lecture 2: Policy evaluation and tuning

In this second lecture, we will design and implement a simple policy to solve a real-world sequential decision problem motivated by managing an inventory. Using this example, the students will learn how to evaluate and tune a given policy.
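
To give a flavour of what evaluating and tuning a policy looks like in code, here is a minimal Python sketch. The lost-sales, zero-lead-time inventory model and all parameter values are illustrative assumptions, not the lecture's exact setup; the point is that evaluation is simulation and tuning is a search over the policy's parameter:

```python
import numpy as np

def avg_cost(S, days=100_000, demand_mean=5.0, h=1.0, p=9.0, seed=0):
    """Average daily cost of an order-up-to-S policy under Poisson demand
    (zero lead time, lost sales). h: holding cost per leftover unit,
    p: penalty per unit of unmet demand."""
    rng = np.random.default_rng(seed)             # fixed seed: common random
    demand = rng.poisson(demand_mean, size=days)  # numbers across S values
    leftover = np.maximum(S - demand, 0)          # on hand at end of day
    shortage = np.maximum(demand - S, 0)          # demand that went unmet
    return float((h * leftover + p * shortage).mean())

# Tuning = optimizing the single policy parameter S by grid search
best_S = min(range(1, 16), key=avg_cost)
print(best_S, avg_cost(best_S))
```

Reusing the same seed for every candidate S is a deliberate choice (common random numbers): it removes simulation noise from the comparison between policies.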

Stella Kapodistria

Lecture 1: Optimal Decision-Making under Parameter Uncertainty: A POMDP Approach

In this session, we explore a class of Markov Decision Processes (MDPs) where the underlying stochastic model is only partially known, commonly referred to in the literature as Partially Observable Markov Decision Processes (POMDPs). Specifically, we consider a stylized stochastic process (e.g., a compound Poisson process) where the system's evolution is fully observable, but the process parameters must be learned over time. As data is collected, it serves two simultaneous purposes: 1) to estimate the unknown model parameters, and 2) to compute the optimal decision policy for the MDP. We demonstrate how this setting can be naturally framed as a POMDP, enabling a unified approach to learning and decision-making. By collapsing the information state space to a sufficient statistic representation, we make the computation and analysis of optimal policies tractable.
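
To make the sufficient-statistic idea concrete, consider the simplest instance: an unknown Poisson rate with a conjugate Gamma prior. This is an illustrative reduction of the lecture's richer compound Poisson setting; the whole belief is carried by two numbers that are updated as data arrives and feed both estimation and control:

```python
class PoissonRateBelief:
    """Belief over an unknown Poisson rate, collapsed to the sufficient
    statistics (alpha, beta) of a Gamma(alpha, beta) conjugate prior."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha, self.beta = alpha, beta

    def update(self, count, exposure=1.0):
        """Bayes update after observing `count` events in `exposure` time."""
        self.alpha += count
        self.beta += exposure

    @property
    def mean(self):
        return self.alpha / self.beta   # posterior mean of the rate

belief = PoissonRateBelief()
for count in [3, 5, 4]:                 # one observation per period
    belief.update(count)
print(belief.mean)                      # estimate reused by the MDP policy
```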

We show how to prove structural properties of the optimal policy. We begin with the case of a one-dimensional stochastic process (e.g., a compound Poisson process) and then extend the approach to multi-dimensional processes, i.e., a network. As the computational complexity increases, we leverage approximation techniques from Deep Reinforcement Learning to address the curse of dimensionality inherent in such problems.

Lecture 2: Optimal Decision-Making under Parameter Uncertainty in Practice

The practical significance of the POMDP framework presented in the theoretical session lies in its application to maintenance decision-making. In maintenance settings, costly and unexpected failures can be mitigated through preventive replacements (instead of corrective replacements) based on real-time degradation signals and associated cost structures. The POMDP formulation provides a principled way to determine the cost-optimal timing for such replacements. In this second part, we begin by presenting numerical results that highlight the competitiveness of our algorithm. We demonstrate how the integration of learning and decision-making can yield substantial performance gains. For multi-dimensional systems, we develop a deep reinforcement learning algorithm informed by structural properties of the problem, allowing us to efficiently solve high-dimensional POMDPs. We illustrate the real-world relevance of our approach through a comprehensive case study on X-ray systems, where our method leads to significant reductions in maintenance costs.
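
The policies computed in the lecture account for the full belief dynamics; purely as a toy illustration of the trade-off being optimized, a hypothetical one-step-lookahead heuristic might read:

```python
# Hypothetical heuristic (not the lecture's algorithm): replace preventively
# once the expected cost of waiting one more epoch exceeds the cost of a
# planned replacement.
def replace_now(p_fail_next, c_preventive, c_corrective):
    """p_fail_next: probability of failure before the next decision epoch,
    estimated from the degradation signal and the current belief."""
    return c_preventive < p_fail_next * c_corrective

# 0.3 * 500 = 150 expected corrective cost > 100 preventive cost -> True
print(replace_now(p_fail_next=0.3, c_preventive=100.0, c_corrective=500.0))
```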

By the end of these two sessions, students will gain a clear understanding of how to integrate real-time data, probabilistic modeling, and learning algorithms to develop effective and scalable MDP-based solutions for complex networks.

Fenghui Yu

Lecture 1: Foundations of Stochastic Optimal Control and Connections to Reinforcement Learning

This lecture introduces core concepts of stochastic optimal control, with a focus on the dynamic programming principle and the associated Hamilton-Jacobi-Bellman (HJB) equation. We will then explore how these concepts connect to reinforcement learning, and how both frameworks can be applied to sequential decision-making problems in finance.
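
For reference, a minimal one-dimensional statement of the objects involved, for a controlled diffusion $dX_s = b(X_s, u_s)\,ds + \sigma(X_s, u_s)\,dW_s$ with running reward $f$ and terminal reward $g$:

```latex
% Value function:
%   V(t,x) = \sup_{u} \mathbb{E}\Bigl[\int_t^T f(X_s, u_s)\,ds + g(X_T) \,\Big|\, X_t = x\Bigr].
% Dynamic programming then yields the HJB equation:
\frac{\partial V}{\partial t}
+ \sup_{u}\left\{ b(x,u)\,\frac{\partial V}{\partial x}
+ \tfrac{1}{2}\,\sigma^{2}(x,u)\,\frac{\partial^{2} V}{\partial x^{2}}
+ f(x,u) \right\} = 0,
\qquad V(T,x) = g(x).
```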

Lecture 2: Applications in Finance – From Stochastic Control to Algorithmic Trading

This session builds on the theoretical groundwork of the first lecture and focuses on applications in quantitative finance. Through examples from algorithmic trading, we will illustrate how HJB-based methods and data-driven control approaches can be used to solve real-world financial decision problems.
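
One classical stylized example of the kind of problem in scope, in the spirit of Almgren–Chriss optimal execution (the quadratic cost terms are modelling assumptions): liquidate $X$ shares by time $T$, choosing the trading rate $v_t$ so that the remaining inventory $x_t$ balances temporary price impact against inventory risk:

```latex
% Inventory dynamics: dx_t = -v_t\,dt, with x_0 = X and x_T = 0.
% \eta: temporary impact coefficient, \lambda: risk aversion, \sigma: volatility.
\min_{v}\; \mathbb{E}\left[\int_{0}^{T}\bigl(\eta\, v_t^{2} + \lambda\,\sigma^{2} x_t^{2}\bigr)\,dt\right]
\quad\Longrightarrow\quad
x_t^{*} = X\,\frac{\sinh\bigl(\kappa\,(T-t)\bigr)}{\sinh(\kappa T)},
\qquad \kappa = \sqrt{\lambda\sigma^{2}/\eta}.
```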

Xiaodong Cheng

Lecture 1: Introduction to Control Theory and Model Predictive Control

This lecture will provide an introduction to control theory and Model Predictive Control (MPC), with a practical application to greenhouse climate control. The session is divided into two parts. Part 1 begins with an overview of control theory, introducing key concepts such as feedback, stability, and system dynamics, using a greenhouse as an illustrative example to contextualize these principles. This will be followed by an introduction to optimal control, focusing on the formulation and objectives of optimizing system performance. From there, we will look into MPC, starting with linear systems to explain the core methodology, including prediction models, cost functions, and constraints. A brief discussion of nonlinear MPC will highlight its relevance and challenges without excessive technical detail, ensuring accessibility for students new to the topic.
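
For reference, the optimization solved at every sampling instant in linear MPC, in a standard textbook form (the quadratic weights $Q$, $R$, $P$, the horizon $N$, and the constraint sets $\mathcal{X}$, $\mathcal{U}$ are design choices):

```latex
% Receding horizon: measure the current state x(t), solve the problem below,
% apply only the first input u_0, then repeat at the next sampling instant.
\begin{aligned}
\min_{u_0,\dots,u_{N-1}}\;\; & \sum_{k=0}^{N-1}\bigl( x_k^{\top} Q\, x_k + u_k^{\top} R\, u_k \bigr) + x_N^{\top} P\, x_N \\
\text{s.t.}\;\; & x_{k+1} = A x_k + B u_k, \qquad x_0 = x(t), \\
& x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}.
\end{aligned}
```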

Lecture 2: Applications in Agriculture — Indoor Climate Control in Greenhouses

Part 2 addresses the greenhouse control problem with hands-on practice. The control objectives and constraints are defined, followed by a step-by-step formulation of an MPC problem tailored to greenhouse climate control. An introduction to the software tools used for implementation will be provided, which students can use to apply MPC to a simple greenhouse model and analyze the simulation results.
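
As a preview of the hands-on part, here is a minimal Python sketch of a single MPC solve for a toy one-state greenhouse temperature model. The model, all parameter values, and the choice of the cvxpy solver are illustrative assumptions, not the course's actual tooling:

```python
import cvxpy as cp

# Toy thermal model: T[k+1] = T[k] + a*(T_out - T[k]) + b*u[k]
a, b = 0.1, 0.05                     # heat-loss and heater coefficients (made up)
T_out, T_ref, T0 = 8.0, 20.0, 15.0   # outdoor, target, current temp (deg C)
N, r, u_max = 24, 0.01, 100.0        # horizon, input weight, heater limit

T = cp.Variable(N + 1)               # predicted indoor temperature trajectory
u = cp.Variable(N)                   # heating input over the horizon

constraints = [T[0] == T0, u >= 0, u <= u_max]
for k in range(N):
    constraints.append(T[k + 1] == T[k] + a * (T_out - T[k]) + b * u[k])

cost = cp.sum_squares(T - T_ref) + r * cp.sum_squares(u)
cp.Problem(cp.Minimize(cost), constraints).solve()

# Receding horizon: apply only the first input, then re-solve next step
print("heating action to apply now:", u.value[0])
```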

George van Voorn

Lecture 1: Building Agent-Based Models — A Hands-On NetLogo Workshop

This interactive lecture introduces the basics of agent-based modelling, for which we will use NetLogo (see download link below). Agent-based models explicitly codify rules around the interaction, decision making, and action taking of autonomous agents (such as humans or software agents) and their environment. Through a series of hands-on examples that you will rebuild on the spot, we look at codified actions like agent movement, linking, and reproduction, and at primitives like agent and patch properties. I also discuss how agent-based models are reported following good modelling practice, and how NetLogo allows for basic sensitivity analysis of your built simulation model (a small illustrative sketch of such agent rules follows below).

Participants will need to install NetLogo 6.4.
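
The workshop itself is in NetLogo; purely to illustrate the flavour of codified agent rules (movement, death, reproduction) in a language-neutral way, here is a minimal Python sketch with made-up parameters:

```python
import random

class Agent:
    """Minimal agent: a grid position plus an energy level."""
    def __init__(self, x, y, energy=10):
        self.x, self.y, self.energy = x, y, energy

    def step(self, size):
        # Random walk on a wrapping (torus) grid, as in NetLogo worlds
        self.x = (self.x + random.choice([-1, 0, 1])) % size
        self.y = (self.y + random.choice([-1, 0, 1])) % size
        self.energy -= 1                 # moving costs energy
        if random.random() < 0.4:        # chance the patch holds food
            self.energy += 3

SIZE, STEPS = 20, 50
agents = [Agent(random.randrange(SIZE), random.randrange(SIZE))
          for _ in range(30)]

for _ in range(STEPS):
    for a in list(agents):               # iterate over a copy: list mutates
        a.step(SIZE)
        if a.energy <= 0:
            agents.remove(a)             # death rule
        elif a.energy > 15:
            a.energy //= 2               # reproduction rule: split energy
            agents.append(Agent(a.x, a.y, a.energy))

print("agents alive after", STEPS, "steps:", len(agents))
```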

Lecture 2: From Rules to Behaviour — Advanced Simulation of Human and Agent Decisions

In the second lecture, we will use examples to construct more advanced models that simulate semi-stochastic decision-making in heterogeneous agents, i.e., decision-making that follows rules grounded in, for example, economic, biological, or social theory, but that includes uncertainty and stochastic elements. We will also consider some models that represent human behaviour, like the Consumat.
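
A common device for such semi-stochastic rules (an illustrative choice, not necessarily how the Consumat is implemented) is a logit or softmax decision: options are scored by a theory-based utility, and higher-scoring options are chosen more often but not deterministically. A minimal Python sketch:

```python
import math
import random

def logit_choice(utilities, beta=1.0):
    """Pick an option with probability proportional to exp(beta * utility).
    beta tunes determinism: large beta ~ near-greedy, beta = 0 ~ uniform."""
    weights = [math.exp(beta * u) for u in utilities]
    r = random.random() * sum(weights)
    for option, w in enumerate(weights):
        r -= w
        if r <= 0:
            return option
    return len(weights) - 1  # guard against floating-point round-off

# e.g., a farmer choosing among three crops scored by expected profit
print(logit_choice([2.0, 1.5, 0.5], beta=0.8))
```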