Statistics for Data Science (Wintersemester 2024/25)
News
- During the exam feedback session, additional exercises on the course contents were requested. You can have a look here or here. These materials are not part of this course, but they cover similar topics and may be helpful for exam preparation.
- The make-up exam will be held on Monday, February 24, in room A6/032. Please note that the exam starts at 10:00, earlier than the usual 10:15 lecture slot. You are allowed to bring a single handwritten DIN A4 page with personal notes (you may use both sides of the sheet). Other aids (e.g., lecture notes, calculators, cell phones, or other electronics) are not permitted. Please bring your student ID card and an official photo ID (identity card, passport, or driving license) so that we can verify your identity. You do not need to register for the exam. For further information, please see the exam guide.
- A mock exam and a "bonus" exercise sheet have been published. Please note that the mock exam and the bonus exercises will not be graded and do not need to be returned.
Dates
Lectures | Mon 10:15-11:45 | A6/032 | Dr. Vesa Kaarnioja |
Exercises | Tue 10:15-11:45 | A7/031 | Dr. Vesa Kaarnioja |
Course exam | Mon February 10, 2025, 10:00-12:00 | A6/032 | |
Make-up exam | Mon February 24, 2025, 10:00-12:00 | A6/032 | |
General Information
Description
This course serves as an introduction to foundational aspects of modern statistical data analysis. Frequentist and Bayesian inference are presented from the perspective of probabilistic modeling. The course will consist of three main parts:
- Probability foundations: probability spaces, random variables, distribution of a random variable, expectation and covariance, important limit theorems and inequalities.
- Frequentist inference: point estimators, confidence intervals, hypothesis testing.
- Bayesian inference: conjugate inference, numerical models, data assimilation (a small worked example follows this list).
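As a small taste of what conjugate inference means in practice, here is a minimal sketch of a Beta-Binomial model in Python. The prior parameters and data below are made up for illustration and are not part of the course materials.

```python
from scipy import stats

# Conjugate Bayesian inference for a success probability theta:
# Beta(a, b) prior + Binomial likelihood  =>  Beta(a + k, b + n - k) posterior,
# where k is the number of successes observed in n trials.
a, b = 1.0, 1.0            # uniform Beta(1, 1) prior (illustrative choice)
n, k = 20, 14              # hypothetical data: 14 successes in 20 trials

posterior = stats.beta(a + k, b + n - k)

print("posterior mean:", posterior.mean())              # (a + k) / (a + b + n)
print("95% credible interval:", posterior.interval(0.95))
```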
Prerequisites
Basic set theory (inclusion, union, intersection, difference of sets), basic analysis (infinite series, calculus), and matrix algebra. Some knowledge of probabilistic foundations (discrete probability, Gaussian distributions) is helpful.
Completing the course
To complete this course, you must (1) successfully complete at least 60% of the course's exercises and (2) pass the course exam.
Registration
Please register for the course via Campus Management (CM); you will then be automatically registered in MyCampus/Whiteboard as well. Please note the deadlines indicated there. For further information and in case of any problems, please consult Campus Management's Help for Students.
Lecture notes
Lecture notes will be published here after each week's lecture.
- Week 1: Introduction, probability space, conditional probability, independence of events
- Week 2: Random variables
- Week 3: Joint distributions, independent random variables, conditional distribution, transformations of random variables
- Week 4: Expected value and covariance
- Week 5: Inequalities and limits
- Week 6: Introduction to statistical inference, descriptive statistics, confidence interval
- Week 7: Hypothesis testing: t-tests, variance tests, nonparametric tests
- Week 8: Proportion tests, normality tests, chi-squared tests
- Week 9: Analysis of variance and the Kruskal-Wallis test
- Week 10: Correlation and dependence in statistics, linear regression
- Week 11: Tests and confidence intervals for linear regression, multivariate linear regression, maximum likelihood estimator
- Week 12: Brief overview of Bayesian inference (files: source.py)
- Week 13: Linear Gaussian setting and the Kalman filter (files: deconv.py)
- Week 14: Markov chain Monte Carlo (files: mh.py, gibbs.py; a small illustrative sketch follows this list)
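The Week 14 files mh.py and gibbs.py are linked above. Purely as an illustration of the kind of algorithm covered there, the following is a minimal random-walk Metropolis-Hastings sketch for a one-dimensional target density; it is not the course's mh.py, and the standard normal target is only an example.

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_samples=10000, step=1.0, rng=None):
    """Random-walk Metropolis-Hastings with Gaussian proposals.

    log_target: log of the (possibly unnormalized) target density.
    x0: starting point; step: proposal standard deviation.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = float(x0)
    log_p = log_target(x)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        proposal = x + step * rng.standard_normal()
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)),
        # computed on the log scale to avoid numerical underflow.
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples[i] = x
    return samples

if __name__ == "__main__":
    # Example: sample from a standard normal target (log density up to a constant).
    chain = metropolis_hastings(lambda x: -0.5 * x**2, x0=0.0, step=2.0)
    print(chain.mean(), chain.std())  # should be roughly 0 and 1
```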
Exercise sheets
Weekly exercises will be published here after each lecture.
- Assignment Submission Guidelines
- Exercise 1 (model solutions, w1e2.py)
- Exercise 2 (model solutions, w2e1.py, w2e2.py, w2e3.py)
- Exercise 3 (model solutions)
- Exercise 4 (model solutions, w4e2.py, w4e3.py, note on least squares regression)
- Exercise 5 (files: mtcars.txt, mtcars.xlsx, HW.txt, FT.txt, BP2.txt, model solutions: w5e1.py, w5e2.py, w5e3.py, w5e4.py)
- Exercise 6 (files: FT2.txt, iris.txt, mtcars.txt, model solutions: w6e1.py, w6e2.py, w6e3.py, w6e4.py)
- Exercise 7 (files: patients.txt, model solutions: w7e1.py, w7e2.py, w7e3.py, w7e4.py)
- Exercise 8 (files: group.txt, model solutions: w8e1.pdf, w8e2.py, w8e3.py, w8e4.py)
- Exercise 9 (files: shoeheight.txt, model solutions: w9e2.py, w9e3.py, w9e4.py)
- Exercise 10 (model solutions, w10e1.py, w10e2.py)
- Exercise 11 (model solutions)
- Exercise 12 (model solutions)
- Exercise 13 (model solutions)
Additional materials
- Mock exam (model solutions)
- Bonus exercises (files: signal.mat)
Please note that the mock exam and the bonus exercises will not be graded and do not need to be returned.
Contact
Dr. Vesa Kaarnioja | vesa.kaarnioja@fu-berlin.de | Arnimallee 6, room 212 | Consulting hours: by appointment |
Literature
- Larry Wasserman. All of Statistics: A Concise Course in Statistical Inference. Springer Science & Business Media, 2004.
- Morris H. DeGroot and Mark J. Schervish. Probability and Statistics. 4th edition, Pearson Education, 2013.
- José M. Bernardo and Adrian F. M. Smith. Bayesian Theory. 2nd edition, Wiley, 2007.
- Leonhard Held and Daniel Sabanés Bové. Applied Statistical Inference: Likelihood and Bayes. Springer Science & Business Media, 2013.
- Sebastian Reich and Colin Cotter. Probabilistic Forecasting and Bayesian Data Assimilation. Cambridge University Press, 2015.