ELEN 6887 - Spring 2010
Statistical Learning Theory
The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. This study serves a two-fold purpose: on the one hand it provides strong guarantees for existing algorithms, and on the other it suggests new, potentially more powerful algorithmic approaches.
In this course we will study the theory and methods of statistical learning in detail, and in particular complexity regularization (i.e., how to choose the complexity of your model when it must be learned from data). This issue is at the heart of today's most successful and popular machine learning algorithms, and is critical to their success. In this course you'll learn some of the techniques needed for the analysis and near-optimal tuning of such algorithms.
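To give a concrete flavor of complexity regularization, the sketch below fits polynomials of increasing degree to noisy samples of a smooth function and selects the degree by minimizing a penalized empirical risk. This is only an illustration: the particular penalty (proportional to the number of parameters times log(n)/n) and its constant are illustrative choices, not taken from the lecture notes.

```python
import numpy as np

# Noisy samples of a smooth function on [0, 1].
rng = np.random.default_rng(0)
n = 100
sigma = 0.3
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + sigma * rng.normal(size=n)

# Candidate models: polynomials of degree 0 through 10.
results = {}
for d in range(11):
    # Least-squares polynomial fit (domain is rescaled internally for stability).
    p = np.polynomial.Polynomial.fit(x, y, d)
    emp_risk = np.mean((y - p(x)) ** 2)  # empirical squared-error risk
    # Illustrative complexity penalty: grows with the number of parameters.
    penalty = 2 * sigma**2 * (d + 1) * np.log(n) / n
    results[d] = emp_risk + penalty

# The penalized criterion balances fit against complexity: the empirical risk
# alone would always prefer the largest degree (overfitting).
best_degree = min(results, key=results.get)
print("selected degree:", best_degree)
```

The empirical risk decreases monotonically in the degree, while the penalty increases; the selected degree sits at the trade-off between the two, which is the basic mechanism the course analyzes rigorously.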
Monday and Wednesday 2:40pm - 3:55pm (S.W. Mudd 644)
lecture 1 - 1/20/2010 - A Probabilistic Approach to Pattern Recognition
lecture 2 - 1/25/2010 - Introduction to Classification and Regression
lecture 3 - 1/27/2010 - Introduction to Complexity Regularization
lecture 4 - 2/1/2010 - Denoising in Smooth Function Spaces
lecture 5 - 2/3/2010 - Plug-in Rules and the Consistency of the Histogram Classifier
lecture 6 - 2/8/2010 - Introduction to PAC learning bounds
lecture 7 - 2/10/2010 - PAC bounds and concentration of measure
lecture 8 - 2/15/2010 - PAC bounds for general bounded losses
- A good survey article, covering a multitude of concentration of measure results.
lecture 9 - part 1 - 2/17/2010 - PAC bounds for uncountably infinite classes
lecture 9 - part 2 - 2/17/2010 - Complexity Regularization Bounds
lecture 10 - 2/22/2010 - Decision Trees and Classification
lecture 11 - 2/24/2010 & 3/1/2010 - Complexity Regularization for the Squared Loss
lecture 12 - 3/7/2010 - Maximum Likelihood Estimation
lecture 13 - 3/10/2010 - Maximum Likelihood Estimation and Complexity Regularization
lecture 14 - 3/24/2010 - Denoising Smooth Functions with Unknown Smoothness
lecture 15 - 3/29/2010 - Approximating piecewise smooth functions
lecture 16 - 3/31/2010 - Denoising in piecewise smooth function spaces
lecture 17 - 4/5/2010 - Introduction to Vapnik-Chervonenkis (VC) Theory
lecture 18 - 4/7/2010 & 4/14/2010 - The Proof of the Vapnik-Chervonenkis Inequality
lecture 19 - 4/21/2010 - Applications of VC Theory
lecture 20 - Minimax Lower Bounds
A .zip file with all the lecture notes and homework assignments is available here.
Homework 1 - Due 2/3/2010 - You'll need the following file regression_dataset.txt
Homework 2 (revised) - Due 2/12/2010 (previous version) - File for the last question
Homework 3 - Due 2/26/2010 - A possible solution
Homework 4 - Due 3/5/2010
Homework 5 - Due 3/31/2010
Homework 6 - Due 4/14/2010
This course is intended for Ph.D. and M.S. students in engineering, statistics and computer science. Students interested in the course should have some mathematical maturity and be comfortable with mathematical proofs. Students are assumed to have a good working knowledge of probability (statistics is a plus, but not needed).
Format and Evaluation
The course is roughly structured as follows: we will begin with several lectures covering the foundational material, and then proceed to discuss recent developments in the area. Evaluation will be based on:
Class participation: (~20% of the grade) Students are expected to come to lecture and actively participate.
In-class presentation(s): (~30% of the grade) Each student will make a presentation in class of a statistical learning theory topic. This could be either a very recent topic, or a more mature topic that was not discussed in class before.
Homework/Project: (~50% of the grade) There will be some assignments through the semester, and possibly a project which might be connected to the in-class presentation above.
Phone: (+1) 212 854 0513
Office: 716 CEPSR
Textbooks and References
There is no formal textbook for this course. Throughout the semester I'll keep posting lecture notes. Some homework assignments might involve reading assignments, where I ask you to read a paper and write a short report about it. Below is a list of relevant books covering some of the material in the course.
- A Probabilistic Theory of Pattern Recognition, Devroye, Gyorfi, Lugosi, Springer
- The Elements of Statistical Learning, Hastie, et al, Springer
- Combinatorial methods in density estimation, Devroye and Lugosi, Springer
- Statistical Learning Theory, Vapnik, Wiley
- An Introduction to Computational Learning Theory, Kearns and Vazirani, MIT Press
Although you will not need in-depth measure-theoretic probability for this course, it does help. Below are two references on the topic.
- Probability and Measure, Billingsley, Wiley
- A Probability Path, Resnick, Birkhäuser
Elements of Statistical Learning Theory
Classification and Regression models
Introduction to Complexity Regularization
Denoising smooth functions
Plug-in rules and the Histogram Classifier
Probably Approximately Correct (PAC) Learning
Concentration of Measure: Chernoff and Hoeffding Inequalities
Error bounds for Classification
Complexity Regularization Revisited: Regression
Maximum Likelihood Estimation and Complexity Regularization