Fundamentals of inference and learning, EE-411

A Set of Lectures @ EPFL by Prof. Florent Krzakala

This is an introductory course in the theory of statistics, inference, and machine learning, with an emphasis on theoretical understanding & practical exercises. The course alternates between mathematical foundations and practical computational work in Python.

Professor: Florent Krzakala

Teaching Assistants: Davide Ghio, Ortiz Jimenez Guillermo, Dimitriadis Nikolaos, Luca Pesce


The topics will be chosen from the following basic outline:

For students: Moodle Link & videos of the course on TubeSwitch

Discussions: You can discuss the course and ask questions on Slack, which is a great platform for this. Here is the invitation to join the forum on Slack, which is valid until the end of October.

Lecture List:

Short video on introduction and course information

This first class is a recap of probability theory that will serve us well in this course. A good reference, and absolutely recommended reading, for this lecture is Chap. 1-5 in All of Statistics by Wasserman.

This second class focuses on the theory of maximum likelihood estimation. There are many good references on the topic, including for instance Chap. 9 in All of Statistics, or, for the Bayesian point of view, MacKay Chap. 2 and 3.
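As a small illustration of maximum likelihood estimation, here is a minimal sketch for i.i.d. Gaussian data, where the estimates have a closed form. The function names and the toy data are assumptions made for this example and are not taken from the course material.

```python
import math

def gaussian_mle(samples):
    """Closed-form maximum likelihood estimates (mean, variance) for i.i.d. Gaussian data."""
    n = len(samples)
    mu_hat = sum(samples) / n
    # The MLE of the variance divides by n (not n - 1), so it is a biased estimator.
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, var_hat

def gaussian_log_likelihood(samples, mu, var):
    """Log-likelihood of the samples under a Gaussian with mean mu and variance var."""
    n = len(samples)
    squared_error = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)

# Toy data (hypothetical values chosen for illustration).
data = [2.1, 1.9, 2.4, 2.0, 1.6]
mu_hat, var_hat = gaussian_mle(data)
```

Evaluating `gaussian_log_likelihood` at `(mu_hat, var_hat)` and at nearby parameter values is a quick way to check that the closed-form estimates do maximize the likelihood.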

A good read on supervised statistical learning is chapter 2 in An Introduction to Statistical Learning by James, Witten, Hastie and Tibshirani. They also discuss K-nearest neighbors in detail.
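To make K-nearest neighbors concrete, here is a minimal sketch of a majority-vote classifier in plain Python. The function name, the Euclidean metric, and the toy points are assumptions for illustration; in practice a library implementation would usually be preferred.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    top_k_labels = [label for _, label in neighbors[:k]]
    return Counter(top_k_labels).most_common(1)[0][0]

# Two well-separated toy clusters (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5))
far_corner = knn_predict(points, labels, (5.5, 5.5))
```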

Gradient descent is the workhorse of all modern machine learning methods. There are many resources on gradient descent, from pedagogical ones to technical ones. Proximal operators are very powerful and are well described in this set of lectures: Tibshirani1, Tibshirani2, Tibshirani3.
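The two ideas above can be sketched in a few lines for scalar problems: plain gradient descent, and proximal gradient descent where the proximal operator of the absolute value is soft-thresholding. The function names, step sizes, and test objectives are assumptions chosen for this illustration.

```python
def gradient_descent(grad, x0, step=0.1, n_steps=200):
    """Minimize a smooth scalar function given its gradient."""
    x = x0
    for _ in range(n_steps):
        x = x - step * grad(x)
    return x

def soft_threshold(v, lam):
    """Proximal operator of lam * |x|: shrink v toward zero by lam."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

def proximal_gradient(grad, x0, lam, step=0.1, n_steps=200):
    """Minimize f(x) + lam * |x|, where grad is the gradient of the smooth part f."""
    x = x0
    for _ in range(n_steps):
        # Gradient step on the smooth part, then proximal step on the penalty.
        x = soft_threshold(x - step * grad(x), step * lam)
    return x

# Minimize (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3.0), x0=0.0)

# Minimize 0.5 * (x - 0.5)^2 + |x|: the penalty dominates, so the minimizer is 0.
x_sparse = proximal_gradient(lambda x: x - 0.5, x0=2.0, lam=1.0)
```

The second problem shows why the L1 penalty yields sparse solutions: the proximal step sets the iterate to exactly zero instead of merely making it small.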

Linear methods are the simplest among all parametric methods, but are still extremely useful! A good discussion of OLS, Ridge and LASSO can be found in Chap. 6, section 2 in An Introduction to Statistical Learning. Another good reference is this one. Linear classification methods are also at the center of machine learning techniques, and are discussed in detail in chapter 4 of An Introduction to Statistical Learning.
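As a minimal sketch of how ridge regularization shrinks the least-squares solution, consider a single feature and no intercept. Minimizing sum_i (y_i - w x_i)^2 + lam * w^2 gives w = sum(x y) / (sum(x^2) + lam), and lam = 0 recovers OLS. The function name and the toy data are assumptions for illustration.

```python
def fit_ridge_1d(xs, ys, lam=0.0):
    """Closed-form ridge estimate for a one-feature model y = w * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # Setting the derivative of the penalized loss to zero gives this ratio.
    return sxy / (sxx + lam)

# Noise-free toy data generated with slope 2 (hypothetical values).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_ols = fit_ridge_1d(xs, ys)              # lam = 0 is ordinary least squares
w_ridge = fit_ridge_1d(xs, ys, lam=14.0)  # a positive lam shrinks the slope toward 0
```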

Feature maps richer than linear ones, and kernel methods, are among the most important aspects of supervised machine learning. Michael Jordan’s notes on kernels are a good reference. The review from Hofmann, Scholkopf and Smola is also very complete. Scikit-learn has a detailed and very efficient implementation.
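To illustrate what a kernel buys over a linear model, here is a minimal sketch of a kernel perceptron with a Gaussian (RBF) kernel on the XOR problem, which no linear classifier can separate. The kernel perceptron is used here only as a simple stand-in for kernel methods in general; the function names and parameter values are assumptions.

```python
import math

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kernel_score(points, labels, alphas, kernel, x):
    """Decision function: weighted sum of kernel evaluations against the training set."""
    return sum(a * y * kernel(p, x) for a, p, y in zip(alphas, points, labels))

def kernel_perceptron_fit(points, labels, kernel, n_epochs=10):
    """Dual perceptron: alphas[i] counts the mistakes made on training point i."""
    alphas = [0] * len(points)
    for _ in range(n_epochs):
        for i, (x, y) in enumerate(zip(points, labels)):
            if y * kernel_score(points, labels, alphas, kernel, x) <= 0:
                alphas[i] += 1
    return alphas

# XOR labels in {-1, +1}: not linearly separable in the input space.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]

alphas = kernel_perceptron_fit(points, labels, rbf_kernel)
predictions = [
    1 if kernel_score(points, labels, alphas, rbf_kernel, x) > 0 else -1
    for x in points
]
```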

Over the last decades, neural networks have made quite an impact; one might even say that they are at the origin of a revolution in machine learning and artificial intelligence. This simple website allows you to get intuition on how they actually work for simple datasets: Tensorflow playground. The universal approximation theorem is discussed in many references (see for instance here). Although backpropagation is a rather simple application of the chain rule of derivatives, dating back to the notes of Newton and Leibniz, it is the cornerstone of learning in neural networks. A good summary of gradient descent algorithms is here. Convnets have made quite an impact, and have revolutionized computer vision; see the nice introduction by Yann Lecun.
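The claim that backpropagation is the chain rule can be checked on the smallest possible network. The sketch below uses a scalar model y = w2 * tanh(w1 * x) with a squared loss, computes the gradients by hand with the chain rule, and compares them with finite differences. The architecture, names, and numerical values are assumptions chosen for illustration.

```python
import math

def forward(w1, w2, x):
    """Forward pass of a tiny network: hidden unit h = tanh(w1 * x), output y = w2 * h."""
    h = math.tanh(w1 * x)
    return h, w2 * h

def loss(w1, w2, x, t):
    """Squared loss between the network output and the target t."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - t) ** 2

def backprop(w1, w2, x, t):
    """Gradients of the loss with respect to w1 and w2, obtained with the chain rule."""
    h, y = forward(w1, w2, x)
    dy = y - t                     # dL/dy
    dw2 = dy * h                   # dL/dw2 = dL/dy * dy/dw2
    dh = dy * w2                   # dL/dh  = dL/dy * dy/dh
    dw1 = dh * (1 - h ** 2) * x    # tanh'(z) = 1 - tanh(z)^2
    return dw1, dw2

def numerical_grad(f, w, eps=1e-6):
    """Central finite-difference approximation of df/dw."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)

w1, w2, x, t = 0.5, -1.3, 2.0, 0.7
dw1, dw2 = backprop(w1, w2, x, t)
num_dw1 = numerical_grad(lambda w: loss(w, w2, x, t), w1)
num_dw2 = numerical_grad(lambda w: loss(w1, w, x, t), w2)
```

Comparing analytic and numerical gradients in this way is a common sanity check before trusting a hand-written backward pass.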

There are many resources on the topic online, and many books as well; it would deserve a course of its own. Nevertheless, it is good to have a basic understanding of where we stand theoretically and to have a grasp of the notion of VC dimension.

Principal Component Analysis is (still) one of the most fundamental tools of machine learning. This post has great visual examples that you can play with to get an intuition.
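As a minimal sketch of what PCA computes, the first principal component is the leading eigenvector of the data covariance matrix. The example below finds it with power iteration in plain Python; the function name, the choice of power iteration, and the toy data are assumptions for illustration (a library routine based on the SVD is the usual choice in practice).

```python
import math

def first_principal_component(points, n_iter=100):
    """Leading eigenvector of the sample covariance matrix, found by power iteration."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    cov = [
        [sum(row[a] * row[b] for row in centered) / n for b in range(d)]
        for a in range(d)
    ]
    # Start from a vector assumed not to be orthogonal to the leading eigenvector.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

# Toy points lying exactly on the line y = 2x (hypothetical data).
points = [(-2, -4), (-1, -2), (0, 0), (1, 2), (2, 4)]
direction = first_principal_component(points)
```

For these points the returned direction is proportional to (1, 2), the direction along which the data varies.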

Scikit-learn has a good implementation of k-means. Generative models are a fundamental part of machine learning. The connection between Mixture of Gaussians and k-means clustering is well explained in David MacKay’s book, page 300. The book is a very useful reference on this topic and on probability in general (for instance, Monte-Carlo methods are discussed on page 357). Boltzmann machines are discussed in many places, for instance here and there. Generative Adversarial Networks are very fashionable these days (check out This Person Does Not Exist!). An introduction in PyTorch is available here.
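For intuition about what a k-means routine does internally, here is a minimal sketch of Lloyd's algorithm in plain Python: assign each point to its nearest center, then move each center to the mean of its points. The function name, the fixed iteration count, and the hand-picked initial centers are assumptions for illustration, not a description of the Scikit-learn implementation.

```python
import math

def kmeans(points, centers, n_iter=20):
    """Lloyd's algorithm: alternate nearest-center assignment and center updates."""
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it unchanged if the cluster is empty.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers

# Two well-separated toy clusters (hypothetical data) and hand-picked initial centers.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
final_centers = kmeans(points, centers=[(0, 0), (10, 10)])
```

Replacing the hard nearest-center assignment with soft responsibilities leads to the EM algorithm for a mixture of Gaussians, which is the connection mentioned above.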

Ensembling methods are a really efficient approach to machine learning. Here is a good reference for AdaBoost. The book Introduction au Machine Learning has a great chapter on bagging and boosting.
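To show how boosting combines weak learners, here is a minimal sketch of AdaBoost with one-dimensional decision stumps. The toy labels cannot be fit by any single stump, but three boosted stumps classify them all correctly. The function names, the stump family, and the data are assumptions for illustration.

```python
import math

def stump_predict(x, threshold, sign):
    """Decision stump: predict `sign` to the left of the threshold and `-sign` to the right."""
    return sign if x < threshold else -sign

def adaboost_fit(xs, ys, n_rounds=3):
    """Fit AdaBoost with stumps; returns a list of (alpha, threshold, sign) triples."""
    n = len(xs)
    weights = [1.0 / n] * n
    s = sorted(xs)
    # Candidate thresholds: one below all points, the midpoints, and one above all points.
    thresholds = [s[0] - 1] + [(a + b) / 2 for a, b in zip(s, s[1:])] + [s[-1] + 1]
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for thr in thresholds:
            for sign in (1, -1):
                err = sum(
                    w for w, x, y in zip(weights, xs, ys)
                    if stump_predict(x, thr, sign) != y
                )
                if best is None or err < best[0]:
                    best = (err, thr, sign)
        err, thr, sign = best
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Increase the weight of misclassified points, decrease the others, then renormalize.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, thr, sign))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, thr, sign) for alpha, thr, sign in ensemble)
    return 1 if score >= 0 else -1

# Labels follow a +, -, + pattern along the line, so no single stump fits them.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_fit(xs, ys, n_rounds=3)
predictions = [adaboost_predict(ensemble, x) for x in xs]
```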

RNNs are still very useful (even though these days transformers are taking the lead!). We made extensive use of the following introduction. A simple RNN implementation in Keras for learning to add numbers is given here.

Reinforcement learning is certainly one of the most interesting directions these days. You can find a simple implementation of Q-learning here for Frozen Lake, and of policy gradient for CartPole. The Nature paper on AlphaGo is a fascinating read on the new era of reinforcement learning.
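To give a feel for the Q-learning update, here is a minimal sketch of tabular Q-learning on a toy chain environment, much simpler than Frozen Lake. The environment (five states in a row, reward 1 on reaching the last one), the purely random behavior policy, and all hyperparameters are assumptions made for this illustration.

```python
import random

def q_learning_chain(n_states=5, n_episodes=500, alpha=0.5, gamma=0.9,
                     max_steps=100, seed=0):
    """Tabular Q-learning on a chain: action 0 moves left, action 1 moves right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(n_episodes):
        state = 0
        for _ in range(max_steps):
            # Q-learning is off-policy, so a uniformly random behavior policy suffices here.
            action = rng.randrange(2)
            next_state = max(state - 1, 0) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0
            # Bootstrapped target: no future value once the terminal state is reached.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = q_learning_chain()
# Greedy action in each non-terminal state.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a]) for s in range(4)]
```

After training, the greedy policy moves right in every non-terminal state, and the learned values decay by a factor of gamma per step of distance from the goal.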

Lab classes:


Projects (due before February 5!): List of projects

A list of references

Course Policies


Two good options for running Python online are EPFL Noto & Google Colab. Noto is EPFL’s centralized JupyterLab platform. It allows teachers and students to use notebooks without having to install Python on their computers. Google Colab provides a similar solution, with the added advantage that it gives access to GPUs. For instance, you can open the Jupyter notebook corresponding to the first exercise by a) opening Google Colab in your browser, b) selecting GitHub, and c) writing the path

TP0 provides a short introduction. If you need more and really need to study Python, here is a good Python and NumPy Tutorial.

If you cannot compile LaTeX on your own computer (and even if you can, this is often a good strategy anyway), EPFL provides Overleaf Professional accounts for all students: Overleaf EPFL. With Overleaf you can write and compile LaTeX directly from your web browser.