# Statistical Learning Theory

### Course overview

The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. This study serves a twofold purpose: on one hand it provides strong guarantees for existing algorithms, and on the other hand it suggests new algorithmic approaches that are potentially more powerful.

In this course we will go into the theory and methods of statistical learning in detail, and in particular complexity regularization (i.e., how to choose the complexity of your model when you have to learn it from data). This issue is at the heart of the most successful and popular machine learning algorithms today, and it is critical to their success. You'll learn some of the techniques needed for the analysis and near-optimal tuning of such algorithms.
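As a concrete (hypothetical) illustration of complexity regularization, the sketch below fits polynomial models of increasing degree to noisy data and selects the degree that minimizes empirical risk plus a complexity penalty. The Mallows-Cp-style penalty used here is only one possibility; the penalties analyzed in the lectures may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: a smooth target plus Gaussian noise.
n = 200
x = np.sort(rng.uniform(-1, 1, n))
sigma = 0.3
y = np.sin(3 * x) + sigma * rng.normal(size=n)

def empirical_risk(deg):
    """Training mean-squared error of a degree-`deg` least-squares polynomial fit."""
    coefs = np.polyfit(x, y, deg)
    return np.mean((np.polyval(coefs, x) - y) ** 2)

# Complexity-regularized model selection: empirical risk alone always
# favors the richest model, so we add a penalty growing with the number
# of parameters (here a Mallows-Cp-style term, assuming sigma is known).
degrees = range(1, 11)
penalized = {d: empirical_risk(d) + 2 * sigma**2 * (d + 1) / n for d in degrees}
best = min(penalized, key=penalized.get)
print("chosen degree:", best)
```

Note that the unpenalized training error is non-increasing in the degree (the models are nested), which is exactly why some penalty is needed to pick a model of the right complexity.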

## Lectures

Monday and Wednesday 2:40pm - 3:55pm (S.W. Mudd 644)

lecture 1 - 1/20/2010 - A Probabilistic Approach to Pattern Recognition
lecture 2 - 1/25/2010 - Introduction to Classification and Regression
lecture 3 - 1/27/2010 - Introduction to Complexity Regularization
lecture 4 - 2/1/2010 - Denoising in Smooth Function Spaces
lecture 5 - 2/3/2010 - Plug-in Rules and the Consistency of the Histogram Classifier
lecture 6 - 2/8/2010 - Introduction to PAC learning bounds
lecture 7 - 2/10/2010 - PAC bounds and concentration of measure
• A good survey article covering a multitude of concentration-of-measure results.
lecture 8 - 2/15/2010 - PAC bounds for general bounded losses

lecture 9 - part 1 - 2/17/2010 - PAC bounds for uncountably infinite classes
lecture 9 - part 2 - 2/17/2010 - Complexity Regularization Bounds
lecture 10 - 2/22/2010 - Decision Trees and Classification
lecture 11 - 2/24/2010 & 3/1/2010 - Complexity Regularization for the Squared Loss
lecture 12 - 3/7/2010 - Maximum Likelihood Estimation
lecture 13 - 3/10/2010 - Maximum Likelihood Estimation and Complexity Regularization
lecture 14 - 3/24/2010 - Denoising Smooth Functions with Unknown Smoothness
lecture 15 - 3/29/2010 - Approximating piecewise smooth functions
lecture 16 - 3/31/2010 - Denoising in piecewise smooth function spaces
lecture 17 - 4/5/2010 - Introduction to Vapnik-Chervonenkis (VC) Theory
lecture 18 - 4/7/2010 & 4/14/2010 - The Proof of the Vapnik-Chervonenkis Inequality
lecture 19 - 4/21/2010 - Applications of VC Theory
lecture 20 - Minimax Lower Bounds
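Concentration-of-measure results such as Hoeffding's inequality, which underpin the PAC bounds of lectures 6-8, are easy to probe numerically. A minimal sketch (assuming the standard two-sided form of the bound for [0,1]-valued variables) compares the simulated tail probability of a sample mean with the Hoeffding bound:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hoeffding's inequality for i.i.d. X_i in [0,1]:
#   P(|S_n/n - E[X]| >= eps) <= 2 * exp(-2 * n * eps**2)
n, p, eps, trials = 100, 0.5, 0.1, 10000

# Simulate many sample means of n Bernoulli(p) variables.
samples = rng.binomial(1, p, size=(trials, n))
deviations = np.abs(samples.mean(axis=1) - p)

empirical = np.mean(deviations >= eps)
bound = 2 * np.exp(-2 * n * eps**2)
print(f"empirical tail prob: {empirical:.4f}, Hoeffding bound: {bound:.4f}")
```

The empirical tail probability sits well below the bound, as expected: Hoeffding is distribution-free and therefore conservative for any particular distribution.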

A .zip file with all the lecture notes and homework assignments is available here.

## Homework

Homework 1 - Due 2/3/2010 - You'll need the following file regression_dataset.txt
Homework 2 (revised) - Due 2/12/2010 (previous version) - File for the last question
Homework 3 - Due 2/26/2010 - A possible solution
Homework 4 - Due 3/5/2010
Homework 5 - Due 3/31/2010
Homework 6 - Due 4/14/2010

### Prerequisites

This course is intended for Ph.D. and M.S. students in engineering, statistics, and computer science. Students interested in the course should have some mathematical maturity and be comfortable with mathematical proofs. Students are assumed to have a good working knowledge of probability (a background in statistics is a plus, but not needed).

### Format and Evaluation

The course is roughly structured as follows: we will begin with several lectures covering the foundational material, and then proceed to discuss recent developments in the area. Evaluation will be based on:
• Class participation (~20% of the grade): Students are expected to come to lecture and actively participate.
• In-class presentation(s) (~30% of the grade): Each student will give an in-class presentation on a statistical learning theory topic. This could be either a very recent topic, or a more mature topic not previously discussed in class.
• Homework/Project (~50% of the grade): There will be several assignments through the semester, and possibly a project, which might be connected to the in-class presentation above.

### Instructor

Rui Castro
Web: http://www.ee.columbia.edu/~rmcastro
Phone: (+1) 212 854 0513
Office: 716 CEPSR

### Textbooks and References

There is no formal textbook for this course. Throughout the semester I’ll keep posting lecture notes. Some homework might include reading assignments, in which I ask you to read a paper and write a short report about it. Below is a list of relevant books covering some of the material in the course.
• A Probabilistic Theory of Pattern Recognition, Devroye, Gyorfi, Lugosi, Springer
• The Elements of Statistical Learning, Hastie, Tibshirani, and Friedman, Springer
• Combinatorial Methods in Density Estimation, Devroye and Lugosi, Springer
• Statistical Learning Theory, Vapnik, Wiley
• An Introduction to Computational Learning Theory, Kearns and Vazirani, MIT Press

Although you will not need to know measure-theoretic probability in depth for this course, it does help. Below are two references on the topic.

• Probability and Measure, Billingsley, Wiley
• A Probability Path, Resnick, Birkhäuser

### Tentative Syllabus

• Elements of Statistical Learning Theory
• Classification and Regression models
• Introduction to Complexity Regularization
• Denoising smooth functions
• Plug-in rules and the Histogram Classifier
• Probably Approximately Correct (PAC) Learning
• Concentration of Measure – Chernoff and Hoeffding inequalities
• Error bounds for Classification
• Decision Trees
• Complexity Regularization revisited – Regression
• Maximum Likelihood Estimation and Complexity Regularization
• Vapnik-Chervonenkis Theory
• Worst-Case/Minimax bounds


Last change: Fri Dec-10-10 16:07:46
© Rui Castro - Columbia University 2009