# Mathematics of machine learning: from linear models to neural networks

**PD Dr. Pavel Gurevich, Dr. Hannes Stuke**

**Schedule, Summer 2017**

**Seminar**:

Monday 10.00-12.00, 1.3.21 Seminarraum T1 / Arnimallee 14 (Physikgebaeude)

Language: English

**Description**

Machine learning or, more generally, artificial intelligence is nowadays ubiquitous. Explicitly or implicitly, it surrounds us, at work behind everything from smartphones and social networks to self-driving vehicles. Essentially, machine learning deals with searching for and generating patterns in data. Although it is traditionally considered a branch of computer science, it relies heavily on mathematical foundations. The primary goal of our seminar is therefore to understand these foundations. In doing so, we will mainly follow the classical monograph [1] and combine two complementary viewpoints: deterministic and probabilistic. In the list of topics below, the numbers in parentheses refer to the corresponding sections in [1]. As a complement, the monographs [2] and [3, section 5] are recommended. More sections in [3] will be used in the next semester.

Doing exercises (of which [1] contains plenty) and programming are beyond the seminar’s scope; however, students are welcome, and indeed strongly encouraged, to do both.

In this semester, we will focus on linear models and their straightforward nonlinear generalizations. In the next semester, we plan to elaborate on genuinely nonlinear models such as neural networks and graphical models.

Interested students are expected to be familiar with the basics of probability theory, which can be refreshed, e.g., by reading sections 1.2.0-1.2.4 and section 2 of the book [1].

## Topics

1. Decision theory (1.5). Linear regression: maximum likelihood and least squares (3.1, 3.1.1, 3.1.2)

2. The bias-variance decomposition (3.2). Bayesian linear regression (3.3)

3. Bayesian model comparison (3.4) and the evidence approximation (3.5)

4. Linear classification: Fisher’s linear discriminant (4.1.0, 4.1.1, 4.1.4)

5. Linear classification: probabilistic generative models (4.2.0, 4.2.1, 4.2.2)

6. Linear classification: probabilistic discriminative models (4.3.2, 4.3.5)

7. Bayesian logistic regression (4.4, 4.5)

8. Kernel methods: deterministic viewpoint (6.1-6.3)

9. Kernel methods: probabilistic approach (6.4.0-6.4.6)

10. Sparse kernel machines. Support vector machines: deterministic approach (7.1)

11. Sparse kernel machines. Relevance vector machines: probabilistic approach (7.2)

## Participants

## Literature

- [1] Christopher M. Bishop, Pattern Recognition and Machine Learning, 2006.
- [2] Kevin P. Murphy, Machine Learning: A Probabilistic Perspective, 2012.
- [3] Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016.

**Transcribed from the partner homepage of Prof. Bernold Fiedler's group, Nonlinear Dynamics. See also the homepage of Hysteresis Dynamics.**