Mathematics of machine learning: from linear models to neural networks - part 2

PD Dr. Pavel Gurevich 

Schedule, Winter 2017/18


Monday 10.00-12.00, Arnimallee 6 (Pi-Building), SR 007/008

Language: English


Machine learning or, more generally, artificial intelligence is nowadays ubiquitous. Explicitly or implicitly, it surrounds us, hiding behind everything from smartphones and social networks to self-driving vehicles. Machine learning deals with searching for and generating patterns in data. Although it is traditionally considered a branch of computer science, it relies heavily on mathematical foundations. The primary goal of our seminar is therefore to understand these foundations. In doing so, we will mainly follow the classical monograph [1] and combine two complementary viewpoints: deterministic and probabilistic. This semester, we will focus on artificial neural networks, graphical models, and latent variable approaches. All of these methods are widely used today and remain areas of active research.

Doing exercises (of which [1] contains plenty) and programming are beyond the seminar’s scope. However, students are strongly encouraged to do both on their own.

Interested students are expected to be familiar with the basics of probability theory and with linear deterministic and probabilistic models for regression and classification; see, e.g., [1, Chapters 2-4, 6, 7].


The topics will be assigned during the first seminar.

In the list of (most) topics below, the numbers in brackets refer to the corresponding sections in [1]. As a complement, the monographs [2,3,4] are recommended.


Artificial neural networks:

  1. Feed-forward neural networks (5.1, 5.2, 5.3)
  2. Regularization in neural networks (5.5)
  3. Bayesian neural networks (5.7)
  4. Convolutional networks ([3, Chap. 9])
  5. Recurrent networks ([3, Chap. 10])


Graphical models:

  1. Directed graphical models: Bayesian networks I (8.1)
  2. Directed graphical models: Bayesian networks II (8.2)
  3. Undirected graphical models: Markov random fields (8.3)
  4. Inference: chains, trees, factor graphs, the sum-product algorithm (8.4.0-8.4.4)
  5. Inference: the max-sum algorithm and algorithms for general graphs (8.4.5-8.4.8)


Discrete latent variables: clustering, mixture models and expectation maximization (EM)

  1. K-means clustering and EM for mixtures of Gaussians (9.1, 2.3.9, 9.2)
  2. Latent variables in the EM algorithm (9.3)
  3. General EM algorithm (9.4)


Continuous latent variables: principal component analysis (PCA)

  1. Deterministic PCA (12.1)
  2. Probabilistic PCA (12.2)