Information-Theoretic Learning (ITL)
Leiden University, Spring Semester 2017
The URL of this webpage is
www.cwi.nl/~pdg/teaching/inflearn. Visit this page regularly for
changes, updates, etc.
This course is on an interesting but complicated subject. It is given
at the master's or advanced bachelor's level. Although the only
background knowledge required is elementary probability theory, the
course does require serious work by the student. The course load is 6
ECTS. See the study guide (studiegids) for a general course description.
Many thanks are due to Steven de Rooij (Leiden University) who prepared a
significant proportion of the exercises.
Lectures and Exercise Sessions
Lectures take place each Tuesday from 13.45 to 15.45 in room 401
of the Snellius Building, Niels Bohrweg 1, Leiden. The lectures
are immediately followed by a mini-exercise session held by Rianne de Heide.
The first lecture will take place February 7th, 2017. There will be no
lectures on March 14th and April 18th. The last
official lecture is scheduled for May 16th, and the final exam is
provisionally scheduled for Monday May 29th, 14.00-17.00.
Weekly Homework: At every lecture on Tuesday except the first, a
homework assignment is handed out. The assignment will also be made
available on this webpage. Homework is obligatory and must be
turned in at the beginning of the next lecture, i.e. one week after
the assignment was handed out. After the lecture, there is a homework
session of approximately 30 minutes, during which the homework
will be explained and discussed by teaching assistant Rianne de
Heide. Turning in complete written homework on time is required; see
the examination form below.
Credit: 6 ECTS points.
Examination form: In order to pass the course, one must obtain
a sufficient grade (6 or higher) on both of the following parts.
The final grade will be determined as the average of the two grades.
- An open-book written examination (to be held Monday May 29th).
- Homework. Each student must hand in solutions to homework
assignments at the beginning of the lecture after the homework was handed out.
Discussing the problems in the group is encouraged, but every participant must
write down her or his answers on her or his own. The final homework
grade will be determined as an average of the weekly grades.
Literature: We will mainly use various chapters of the
following source: P. Grünwald. The Minimum Description Length
Principle, MIT Press, 2007. Some additional hand-outs will be made
available free of charge as we go. For the second week, this is Luckiness
and Regret in Minimum Description Length Inference, by Steven de
Rooij and Peter Grünwald, Handbook of the Philosophy of Science,
Volume 7: Philosophy of Statistics, 2011. This paper gives an overview
of the part of this course that will be concerned with the relation
between statistics, machine learning and data compression, as embodied
in MDL learning.
Lecture contents are subject to change at any time for any reason.
A more precise schedule, with links to exercises, will be
determined as we go.
- February 7: introduction
- General introduction: learning, regularity, data compression. Kolmogorov Complexity; deterministic vs. purely random vs. "stochastic" sequences.
- Literature: Chapter 1 up to Section 1.5.1.
- February 14: data compression without probability
- Learning of context-free grammars from example sentences.
- Basics of Lossless Coding. Prefix Codes.
- Bernoulli distributions, maximum likelihood.
- Literature: Chapter 2, Section 2.1; Chapter 3, Section 3.1; Handout, Section 1.
- First Set of Homework Exercises
- February 21: Codes and Probabilities (the most important lecture!)
- The Kraft inequality. The most important insight of the class:
the correspondence between probability distributions and code length
functions. The information inequality, entropy, relative entropy
(Kullback-Leibler divergence). Shannon's coding theorem.
- Coding integers: probability vs. two-stage coding view.
- Literature: Chapter 3, Sections 3.2, 3.3 and 3.4.
- Second Set of Homework Exercises
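As a rough illustration of the correspondence between probability distributions and code length functions listed for this lecture, the following Python sketch checks the Kraft inequality for the code lengths ceil(-log2 p(x)) and compares the expected code length with the entropy. The distributions `p` and `q` and all function names are illustrative assumptions, not taken from the course material.

```python
import math

def kraft_sum(lengths):
    """Sum of 2^(-L) over all codeword lengths; at most 1 for any prefix code."""
    return sum(2.0 ** -l for l in lengths)

def shannon_fano_lengths(probs):
    """Integer code lengths ceil(-log2 p(x)); these always satisfy Kraft."""
    return [math.ceil(-math.log2(p)) for p in probs]

def entropy(probs):
    """Entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def kl_divergence(p, q):
    """Relative entropy D(p || q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical example distributions on a four-symbol alphabet.
p = [0.5, 0.25, 0.125, 0.125]
q = [0.25, 0.25, 0.25, 0.25]

lengths = shannon_fano_lengths(p)                      # [1, 2, 3, 3]
expected_length = sum(pi * l for pi, l in zip(p, lengths))

# Expected length when data come from p but the code is built for q.
mismatched_length = sum(pi * -math.log2(qi) for pi, qi in zip(p, q))

print("Kraft sum:", kraft_sum(lengths))
print("entropy H(p):", entropy(p))
print("expected length with code for p:", expected_length)
print("expected length with code for q:", mismatched_length)
print("H(p) + D(p||q):", entropy(p) + kl_divergence(p, q))
```

In this example the probabilities are powers of two, so the code for `p` achieves the entropy exactly, and the extra cost of coding with `q` equals the KL divergence, which is one way to read the information inequality.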
- February 28: Preparatory Statistics.
- Maximum Likelihood and Bayesian Inference; Bayes Predictive Distribution
- Literature: Chapter 2, Section 2.2, 2.5.2, Section 4.4, Example 8.1. (!)
- Third Set of Homework Exercises
- March 7th:
- Coding with the help of the Bernoulli model, using index codes.
- Coding with the help of the Bernoulli model, using Shannon-Fano two-part codes.
- Coding with the help of the Bernoulli model, using Shannon-Fano Bayes mixture codes.
- Markov Models (Chains): Definition, Maximum Likelihood.
- Literature: Chapter 5 until 5.6.
- Fourth Set of Homework Exercises
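The three ways of coding with the Bernoulli model named in this lecture can be compared numerically. The Python sketch below is only one possible reading: it assumes the index code first encodes the number of ones with a uniform code and then the index of the sequence among all sequences with that count, that the two-part code uses a uniform code over the n+1 possible maximum likelihood estimates (a hypothetical discretization), and that the Bayes mixture uses a uniform prior. All function names are illustrative.

```python
import math

def ml_codelength(n, k):
    """-log2 of the maximized Bernoulli likelihood for k ones in n outcomes."""
    bits = 0.0
    for count in (k, n - k):
        if count > 0:
            bits -= count * math.log2(count / n)
    return bits

def index_codelength(n, k):
    """Encode k uniformly (n+1 options), then the index among C(n, k) sequences."""
    return math.log2(n + 1) + math.log2(math.comb(n, k))

def two_part_codelength(n, k):
    """Encode the ML estimate k/n uniformly (assumed grid), then the data given it."""
    return math.log2(n + 1) + ml_codelength(n, k)

def bayes_mixture_codelength(n, k):
    """-log2 of the uniform-prior Bayes marginal k! (n-k)! / (n+1)!."""
    log_p = math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)
    return -log_p / math.log(2)

n, k = 20, 5  # hypothetical sample: 5 ones in 20 outcomes
for name, f in [("index", index_codelength),
                ("two-part", two_part_codelength),
                ("Bayes mixture", bayes_mixture_codelength)]:
    print(f"{name:14s} {f(n, k):.3f} bits")
print(f"{'ML (no code)':14s} {ml_codelength(n, k):.3f} bits")
```

Under these assumptions the index code and the uniform-prior Bayes mixture code assign the same lengths, and the two-part code is never shorter than either.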
- March 14th: No Lecture!
- March 21st: Universal Coding
- Now it really gets exciting!
- Regret, Minimax Regret, NML Universal Code for finite and
- Asymptotic expansion of KL divergence
- NML vs. Bayes universal code for parametric models. Jeffreys
prior Part I.
- Literature: Chapter 4, 4.1-4.3; Chapter 6, Section 6.1 and 6.2; Chapter 7, 7.1 and 7.2; Chapter 8, 8.1 and 8.2
- Fifth Set of Homework Exercises
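For the Bernoulli model the minimax regret of the NML code can be computed exactly by summing the maximized likelihood over all sequences of length n. The Python sketch below does this and compares the result with the standard asymptotic expansion (1/2) log2(n / (2 pi)) + log2 of the integral of the square root of the Fisher information, which equals pi for the Bernoulli model. Function names and the sample sizes are illustrative assumptions.

```python
import math

def log_max_likelihood_mass(n, k):
    """Natural log of C(n, k) * (k/n)^k * ((n-k)/n)^(n-k), computed stably."""
    log_term = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    if k > 0:
        log_term += k * math.log(k / n)
    if k < n:
        log_term += (n - k) * math.log((n - k) / n)
    return log_term

def nml_complexity(n):
    """Minimax regret of the Bernoulli NML code in bits (log2 of the normalizer)."""
    total = sum(math.exp(log_max_likelihood_mass(n, k)) for k in range(n + 1))
    return math.log2(total)

def asymptotic_complexity(n):
    """(1/2) log2(n / (2 pi)) + log2(pi), the one-parameter asymptotic formula."""
    return 0.5 * math.log2(n / (2 * math.pi)) + math.log2(math.pi)

for n in (10, 100, 1000):
    print(n, round(nml_complexity(n), 4), round(asymptotic_complexity(n), 4))
```

The gap between the exact and the asymptotic value shrinks as n grows, in line with the o(1) remainder term of the expansion.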
- March 28:
- NML vs. Bayes universal code for parametric models. Jeffreys
prior Part II.
- Jeffreys' prior as a uniform prior on the space of distributions
equipped with the KL divergence.
- Literature: Chapter 6, Section 6.1 and 6.2; Chapter 7, 7.1 until 7.3.1; Chapter 8, 8.1 and 8.2
- Sixth Set of Homework Exercises
- April 4: Simple Refined MDL, Prequential Plugin Codes
- Simple Refined MDL with its many interpretations
- Prequential Interpretation of Simple Refined MDL
- Prequential Plug-in Code
- NML regret, complexity as number of distinguishable distributions
- Literature: Chapter 9; Chapter 14, Section 14.1 and 14.2, esp. the box
on page 426.
- Seventh Set of Homework Exercises.
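A prequential plug-in code encodes each outcome with a prediction based on the outcomes seen so far and sums the resulting code lengths. The Python sketch below is a minimal version for the Bernoulli model, assuming the Laplace rule (k+1)/(t+2) as the slightly smoothed estimator; the choice of estimator, the example sequence and the function names are assumptions made for illustration.

```python
import math

def laplace_predict_one(ones_so_far, t):
    """Predicted probability of a 1 after t outcomes containing ones_so_far ones."""
    return (ones_so_far + 1) / (t + 2)

def prequential_codelength(sequence):
    """Accumulated code length in bits of sequential Laplace-rule predictions."""
    bits = 0.0
    ones = 0
    for t, x in enumerate(sequence):
        p_one = laplace_predict_one(ones, t)
        bits -= math.log2(p_one if x == 1 else 1.0 - p_one)
        ones += x
    return bits

# Hypothetical binary sequence with 7 ones out of 10 outcomes.
x = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
n, k = len(x), sum(x)

sequential_bits = prequential_codelength(x)
# The Laplace rule reproduces the uniform-prior Bayes mixture, whose code
# length is log2(n + 1) + log2 C(n, k).
mixture_bits = math.log2(n + 1) + math.log2(math.comb(n, k))
print(sequential_bits, mixture_bits)
```

With this particular estimator the accumulated code length does not depend on the order of the outcomes; other plug-in estimators generally do not share that property.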
- April 11: General Refined MDL, Prediction with MDL, Issues with Universal Codes/MDL
- General Refined MDL
- MDL Prediction/Model Selection/Estimation/Mixed 1-part/2-part Codes
- p-value interpretation
- Issues: undefined NML or Jeffreys' prior, Horizon (In)Dependence
- Literature: Chapter 6, Section 6.4; Chapter 11, Section 11.4; Chapter 14, Section 14.1, 14.2, 14.3
- Eighth Set of Homework Exercises.
- April 18: No Lecture!
- April 25: Maximum Entropy
- Maximum Entropy Principle
- How to find MaxEnt distributions
- Exponential Families and Maximum Entropy
- Literature: Chapter 18, Section 18.1-18.4; Chapter 19, Section 19.5.1.
- Ninth Set of Homework Exercises.
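One way to see how MaxEnt distributions are found, and why they form an exponential family, is the classical example of a die with a constrained mean. The maximum entropy distribution under a mean constraint has the form p(x) proportional to exp(beta * x), and beta can be found numerically because the mean increases with beta. The Python sketch below uses bisection; the die example, the search interval and the function names are assumptions made for illustration.

```python
import math

def gibbs_distribution(beta, values):
    """Exponential-family distribution with p(x) proportional to exp(beta * x)."""
    weights = [math.exp(beta * v) for v in values]
    z = sum(weights)
    return [w / z for w in weights]

def mean(probs, values):
    return sum(p * v for p, v in zip(probs, values))

def entropy(probs):
    """Entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def maxent_with_mean(target, values, lo=-50.0, hi=50.0, iterations=200):
    """Bisection on beta; assumes min(values) < target < max(values)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mean(gibbs_distribution(mid, values), values) < target:
            lo = mid
        else:
            hi = mid
    return gibbs_distribution((lo + hi) / 2, values)

faces = [1, 2, 3, 4, 5, 6]
fair = maxent_with_mean(3.5, faces)     # no effective constraint: uniform
loaded = maxent_with_mean(4.5, faces)   # mean pushed towards the high faces
print([round(p, 4) for p in fair], round(entropy(fair), 4))
print([round(p, 4) for p in loaded], round(entropy(loaded), 4))
```

With the mean fixed at 3.5 the result is the uniform distribution, and any other mean constraint gives a distribution with strictly lower entropy.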
- May 2nd: Excursion: Sequential Prediction with General Loss Functions
- May 9th: MaxEnt and MDL
- Canonical and Mean-Value Parameterization
- Robustness Property of Exponential Families
- Maximum Entropy and Minimum Description Length. The zero-sum coding game.
- Literature: Chapter 19, 19.1-19.3, 19.5.
- Eleventh Set of Homework Exercises.
- May 16th: Sequential Prediction, Part II and Overview/Wrap-Up.
- TUESDAY May 30, 13:45-16:45: Open-Book Examination in Room 401 of the Snellius building. Note that this is a change from the original plan, which was to hold the exam on May 29th!
Here is a previous exam.
Peter Grünwald’s home