Fundamentals of inference and learning, EE-411

A Set of Lectures @ EPFL by Prof. Florent Krzakala

Year-end projects

The final project for this course will consist of a Reproducibility Challenge: you will critically reproduce the results of a published machine learning paper. Concretely, we expect you to reproduce a couple of plots or tables from an established paper. Overall, the tasks for your final project are:

  1. Attempt to numerically reproduce the results presented in the paper, using either the same settings described in the work or slightly different ones.
  2. Explain why these results are important and put them into context.
  3. Critically review your attempt at reproduction, and discuss the paper in light of your own results.
  4. Propose new research directions based on your understanding.

Project guidelines

List of projects and papers

[1] Bad global minima exist and SGD can reach them, Shengchao Liu, Dimitris Papailiopoulos, Dimitris Achlioptas. Minima of the empirical risk found by gradient descent in deep learning are usually associated with good generalization properties. In this paper, however, the authors show that this is not always the case: there exist other minima with very poor generalization properties, and they provide a simple practical method to find them. We suggest reproducing Fig. 1, but reproducing any other plot in the paper will be equally interesting.
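To make the two-phase recipe concrete, here is a toy numpy sketch of its structure: first run gradient descent on randomly relabeled data, then continue from that point on the true labels. This uses plain logistic regression in place of the paper's CNNs, with illustrative sizes and learning rates of our own choosing; the bad-generalization effect itself only appears at the scale of the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: labels generated by a ground-truth halfspace.
n, d = 100, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y_true = (X @ w_true > 0).astype(float)
y_rand = rng.integers(0, 2, n).astype(float)   # random labels for phase 1

def sgd(w, y, steps=500, lr=0.5):
    for _ in range(steps):
        z = np.clip(X @ w, -30, 30)            # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))           # sigmoid predictions
        w = w - lr * X.T @ (p - y) / n         # logistic-loss gradient step
    return w

w = sgd(np.zeros(d), y_rand)   # phase 1: fit random labels ("adversarial init")
w = sgd(w, y_true)             # phase 2: continue training on the true labels
train_acc = np.mean((X @ w > 0) == (y_true > 0.5))
```

The point of the sketch is only the procedure: both phases are ordinary gradient descent, and the starting point of phase 2 is what the paper manipulates.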

[2] Reconciling modern machine learning practice and the bias-variance trade-off, Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal. This work showed that the generalization performance of a machine learning method, as a function of its number of parameters, can sometimes look very different from the classical textbook picture. In particular, the authors describe what they call the “double descent” phenomenon, demonstrating that over-parameterization does not necessarily hurt generalization in machine learning. Many plots can be easily reproduced; we suggest Fig. 2 and Fig. 4, but many more are equally interesting.

[3] Understanding deep learning requires rethinking generalization, Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. This paper was critical in showing that the standard wisdom on generalization was wrong at the time: by training neural networks on randomly labeled data, the authors debunked many classical beliefs about generalization and sparked a fruitful new generation of works that critically studied generalization in deep learning. All figures from this paper are interesting and could be reproduced. However, you may find it better to use simpler datasets, e.g., CIFAR-10 or Fashion-MNIST, rather than ImageNet!
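The core randomization test is simple to state: replace the labels with pure noise and check whether the model still drives training error to zero. A minimal numpy version of that logic, with an over-parameterized linear model standing in for the paper's networks, looks like this (our own illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Many more parameters than samples, so the model can interpolate anything.
n, p = 50, 500
X = rng.standard_normal((n, p))
y_random = rng.integers(0, 2, n) * 2.0 - 1.0   # completely random +/-1 labels

# Minimum-norm least-squares fit.
w = np.linalg.pinv(X) @ y_random
train_acc = np.mean(np.sign(X @ w) == y_random)
# The random labels are memorized perfectly, so training accuracy
# alone tells you nothing about generalization.
```

For the actual reproduction you would do the same label shuffling on CIFAR-10 or Fashion-MNIST and track training accuracy over epochs, as in the paper's figures.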

[4] AugMix: A simple data processing method to improve robustness and uncertainty, Dan Hendrycks, Norman Mu, Ekin D. Cubuk, Barret Zoph, Justin Gilmer, Balaji Lakshminarayanan. This paper proposes a lightweight data augmentation scheme that boosts the robustness of a neural network to common corruptions. It is today one of the most popular augmentation schemes in the literature and serves as a standard baseline in many studies. The results on CIFAR-10-C and CIFAR-100-C are of great interest. Table 1 and Figure 12 can be easily reproduced, although probably with different architectures than the ones used in this work. A critical ablation study of the different parameters of AugMix would also be very interesting.
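The mixing structure at the heart of AugMix is easy to sketch: several short chains of random augmentations are combined with Dirichlet weights, then mixed back with the clean image via a Beta-distributed skip connection. The toy numpy version below uses stand-in array operations in place of the paper's image operations; parameter names and the stand-in ops are illustrative, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "augmentations" on a float image in [0, 1].
def flip(img):   return img[:, ::-1]
def roll(img):   return np.roll(img, 3, axis=0)
def invert(img): return 1.0 - img
OPS = [flip, roll, invert]

def augmix(image, width=3, depth=2, alpha=1.0):
    ws = rng.dirichlet([alpha] * width)   # convex weights over the chains
    m = rng.beta(alpha, alpha)            # skip-connection weight
    mix = np.zeros_like(image)
    for i in range(width):
        aug = image.copy()
        for _ in range(depth):
            aug = OPS[rng.integers(len(OPS))](aug)   # random op per step
        mix += ws[i] * aug
    # Skip connection keeps the result close to the clean image.
    return m * image + (1.0 - m) * mix

img = rng.random((8, 8))
out = augmix(img)
```

An ablation over `width`, `depth`, and `alpha` in this structure is exactly the kind of parameter study suggested above.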

[5] Towards learning convolutions from scratch, Behnam Neyshabur. This paper argues that radically changing the optimizer can imbue a simple fully connected neural network with the right inductive biases to classify complex images. Table 2, Fig. 3, and Fig. 4 are of great interest.
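The optimizer change in this paper is sparsity-promoting. As a rough illustration of that idea (the exact update rule is in the paper; this is a generic proximal soft-thresholding step after a gradient step, with hypothetical parameter values):

```python
import numpy as np

def soft_threshold(w, thresh):
    # Shrink each weight toward zero; weights below the threshold become 0.
    return np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)

def sparse_gd_step(w, grad, lr=0.1, lam=0.01, beta=50.0):
    w = w - lr * grad                          # plain gradient step
    return soft_threshold(w, lr * lam * beta)  # then prune small weights

w = np.array([0.5, -0.02, 0.2, 0.01])
w_new = sparse_gd_step(w, np.zeros(4))   # threshold here is 0.05
```

Updates of this kind drive most weights of a fully connected layer to exactly zero, which is how sparse, local, convolution-like connectivity can emerge during training.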