19208111 Masterseminar Stochastics
- FU students only need to register via Campus Management.
- Non-FU students are required to register via Whiteboard.
Winter Term 2025/2026
Lecturers: Dr. Dave Jacobi, Dr. Guilherme de Lima Feltes
Time and place: Thursdays, 16-18h, SR 119, Arnimallee 3
Prerequisites: Stochastics I and II. Desirable: Stochastics III.
Target Group: BMS Students, Master students of Mathematics and advanced Bachelor students of Mathematics.
Contents: The seminar covers advanced topics of stochastics.
Reinforcement learning lies at the core of many state-of-the-art artificial intelligence algorithms, enabling agents to solve complex optimal control tasks in robotics, finance, physical AI, drug discovery, computer games and many other applications.
This seminar offers a rigorous treatment of reinforcement learning, focusing on the mathematical principles that make reinforcement learning algorithms work. We will develop a mathematically sound understanding of Markov decision processes, of value-function-based methods and their connections to stochastic optimal control, and of policy gradient methods. The emphasis is on the convergence properties of classical reinforcement learning algorithms, established through stochastic approximation and stochastic gradient descent. We will also explore continuous-time reinforcement learning in the framework of stochastic differential equations.
The seminar aims to provide a rigorous foundational perspective for students interested in current research related to reinforcement learning and artificial intelligence. Participants should have a strong background in mathematics, specifically in probability theory.
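As an illustration of the kind of statement studied in the seminar, the following is a minimal sketch of tabular Q-learning viewed as a stochastic approximation scheme for a finite Markov decision process. The notation (states $S_n$, actions $A_n$, rewards $R_n$, step sizes $\alpha_n$, discount factor $\gamma$) is assumed here for illustration only and is not taken from the seminar material.

```latex
% Assumed setting: finite state and action spaces, discount factor \gamma \in [0,1).
% Bellman optimality equation for the optimal action-value function Q^*:
\[
  Q^*(s,a) = \mathbb{E}\bigl[\, R_0 + \gamma \max_{a'} Q^*(S_1,a') \,\big|\, S_0 = s,\ A_0 = a \,\bigr]
\]
% Tabular Q-learning: a noisy fixed-point iteration for the equation above,
% updating only the entry of the currently visited state-action pair.
\[
  Q_{n+1}(S_n,A_n) = Q_n(S_n,A_n)
    + \alpha_n \bigl( R_n + \gamma \max_{a'} Q_n(S_{n+1},a') - Q_n(S_n,A_n) \bigr)
\]
% Robbins-Monro step-size conditions:
\[
  \sum_{n} \alpha_n = \infty, \qquad \sum_{n} \alpha_n^2 < \infty
\]
```

Under suitable conditions (for instance bounded rewards, every state-action pair being updated infinitely often, and the step-size conditions holding for each state-action pair), the classical result is that $Q_n$ converges almost surely to $Q^*$.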
Talks
| Date | Subject | Speaker |
| --- | --- | --- |
| 16.10. | Preliminary discussion | Dave Jacobi |
| 23.10. | Basics of discrete stochastic optimal control | Sichen Jiang |
| 30.10. | Principles of stochastic approximation | Chijie Zhou |
| 06.11. | Stochastic gradient descent methods (smooth case) | Wiktoria Krawczyk |
| 13.11. | Stochastic gradient descent methods (non-smooth case) | Bolkar Eren |
| 20.11. | The stochastic fixed point theorem | Lucie Knop |
| 27.11. | Policy gradient methods | Gauri Kshetry |
| 04.12. | Value function based methods | Dilara Kus |
| 11.12. | Actor-critic methods | Anna Yuan |
| 18.12. | Monte Carlo tree search | Jakob Zimmermann |
| 2026 | | |
| 08.01. | Convergence of value function based methods | Erva Yurtbas |
| 15.01. | Reinforcement learning in continuous time | Romain Akinlami |
| 22.01. | Policy evaluation and TD-learning in continuous time | N.N. |
| 29.01. | Policy gradient and actor-critic in continuous time | N.N. |
| 05.02. | Q-learning in continuous time | N.N. |
| 12.02. | Gradient flow for regularized stochastic optimal control | N.N. |
Literature: To be announced in the preliminary discussion.