19208111 Masterseminar Stochastics

FU-Students only need to register via Campus Management.
Non-FU-students are required to register via Whiteboard.

Winter Term 2025/2026

Lecturers: Dr. Dave Jacobi, Dr. Guilherme de Lima Feltes

Time and place: Thursdays, 16-18h, SR 119, Arnimallee 3

Prerequisites: Stochastics I and II. Desirable: Stochastics III.
Target Group: BMS Students, Master students of Mathematics and advanced Bachelor students of Mathematics.

Contents: The seminar covers advanced topics of stochastics.

Reinforcement learning lies at the core of many state-of-the-art artificial intelligence algorithms, enabling agents to solve complex optimal control tasks in robotics, finance, physical AI, drug discovery, computer games and many other applications.
This seminar offers a rigorous treatment of reinforcement learning, focusing on the mathematical principles that make reinforcement learning algorithms work. We will develop a mathematically sound understanding of Markov decision processes, value-function-based methods and their connections to stochastic optimal control, and policy gradient methods. The emphasis is on convergence properties of classical reinforcement learning algorithms, established through stochastic approximation and stochastic gradient descent. We will also explore continuous-time reinforcement learning in the framework of stochastic differential equations.
The seminar aims to provide a rigorous foundational perspective for students interested in current research related to reinforcement learning and artificial intelligence. Participants should have a strong background in mathematics, specifically in probability theory.

Talks

Date	Subject	Speaker
16.10.	Preliminary discussion	Dave Jacobi
23.10.	Basics of discrete stochastic optimal control	Sichen Jiang
30.10.	Principles of stochastic approximation	Chijie Zhou
06.11.	Stochastic gradient descent methods (smooth case)	Wiktoria Krawczyk
13.11.	Stochastic gradient descent methods (non-smooth case)	Bolkar Eren
20.11.	The stochastic fixed point theorem	Lucie Knop
27.11.	Policy gradient methods	Gauri Kshetry
04.12.	Value-function-based methods	Dilara Kus
11.12.	Actor-critic methods	Anna Yuan
18.12.	Monte Carlo tree search (canceled)	Jakob Zimmermann
2026
08.01.	Convergence of value-function-based methods	Erva Yurtbas
15.01.	Reinforcement learning in continuous time	Romain Akinlami
29.01.	Policy evaluation and TD-learning in continuous time	Sichen Jiang
12.02.	Q-learning in continuous time	Khairullah Yusefi

Literature: To be announced in the preliminary discussion.
