Mathematics of machine learning: from linear models to neural networks - part 2
PD Dr. Pavel Gurevich, Dr. Hannes Stuke
Schedule, Winter 2017/18
Seminar:
Monday 10.00-12.00, Arnimallee 6 (Pi-Building), SR 007/008
Language: English
There will be an additional seminar on
January 31 (Wednesday), 14:00-16:00
Room A7/SR 031
Description
Machine learning or, more generally, artificial intelligence is nowadays ubiquitous. Explicitly or implicitly, it surrounds us, underlying everything from smartphones and social networks to self-driving vehicles. Machine learning deals with searching for and generating patterns in data. Although it is traditionally considered a branch of computer science, it relies heavily on mathematical foundations. The primary goal of our seminar is therefore to understand these mathematical foundations. In doing so, we will mainly follow the classical monograph [1] and combine two complementary viewpoints: the deterministic and the probabilistic. This semester, we will focus on artificial neural networks, graphical models, and latent-variable approaches. All of these machine learning methods are widely used today and remain areas of active research.
Doing exercises (which [1] provides in abundance) and programming is beyond the seminar's scope. However, students are very much encouraged to do both on their own.
Interested students are expected to be acquainted with the basics of probability theory and with linear deterministic and probabilistic models for regression and classification; see, e.g., [1, Chapters 2-4, 6, 7].
The language of the seminar is English.
Topics
The topics will be assigned during the first seminar.
For most of the topics listed below, the numbers in brackets refer to the corresponding sections in [1]. As a complement, the monographs [2, 3, 4] are recommended.
Artificial neural networks:
- Feed-forward neural networks (5.1, 5.2, 5.3)
- Regularization in neural networks (5.5)
- Bayesian neural networks (5.7)
- Convolutional networks ([3, Chap. 9])
- Recurrent networks ([3, Chap. 10])
Graphical models:
- Directed graphical models: Bayesian networks I (8.1)
- Directed graphical models: Bayesian networks II (8.2)
- Undirected graphical models: Markov random fields (8.3)
- Inference: chains, trees, factor graphs, the sum-product algorithm (8.4.0-8.4.4)
- Inference: the max-sum algorithm and algorithms for general graphs (8.4.5-8.4.8)
Discrete latent variables: clustering, mixture models and expectation maximization (EM)
- K-means clustering and EM for mixtures of Gaussians (9.1, 2.3.9, 9.2)
- Latent variables in the EM algorithm (9.3)
- General EM algorithm (9.4)
Continuous latent variables: principal component analysis (PCA)
- Deterministic PCA (12.1)
- Probabilistic PCA (12.2)
Literature
- [1] Christopher M. Bishop, Pattern recognition and machine learning, 2006.
- [2] Kevin P. Murphy, Machine Learning: A Probabilistic Perspective, 2012.
- [3] Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016.
- [4] Daphne Koller, Nir Friedman, Probabilistic Graphical Models, 2009.
See also the homepage of Hysteresis Dynamics.