Applied Statistics - 2013



The retake of the Final Exam will take place on Monday June 24th, 13:00 to 16:00, in room BBL 007 (Buys Ballot Lab, Utrecht University).

Course overview

The aim of this course is to obtain a broad knowledge of nonparametric methods in statistics. Many methods in statistics are parametric in nature: the probability law of the data is assumed to be parametrized by a finite-dimensional parameter. The basic idea of nonparametric methods is to drop this often restrictive assumption, thereby offering much more flexibility to model the data than classical parametric methods. The topics we will cover in this course form a mix of classical distribution-free methods and more modern topics. The focus is both on the application and the theory of these methods. Examples will be illustrated using statistical computing tools, in particular the statistical computing package R. There are many tutorials on R on the internet, for example R for beginners and R short reference card.

Lectures

The lectures take place on Monday 13:00 - 15:00 in the Buys Ballot Lab, room BBL 023, Utrecht.
After the lecture there is a one-hour slot (15:00 - 16:00) in the same room that will be used for discussing the exercises, the course contents and related materials.

Lecture 1 - Introduction and the Empirical CDF
Lectures 2 and 3 - Goodness-of-Fit (GoF) tests
Lecture 4 - Permutation Methods
Lecture 5 - The Jackknife and the Bootstrap (a short illustrative R sketch of the bootstrap is given after this lecture list)
  • Reading Material: Sections 3.1, 3.2, 3.3 and 3.4 (partially) of the book by Wasserman.
  • Relevant R-Scripts:
Lecture 6 - The Bootstrap continued
Lecture 7 - Smoothing: Introduction, basic concepts, and a simple illustrative example
Lecture 8 - Linear Smoothers - Introduction and Basics
  • Reading Material: Chapter 5 up to section 5.2 (inclusive), and also section 4.2.
Lecture 9 - The Nadaraya-Watson Estimator
Lecture 10 - Linear Smoothers continued
  • Choosing the regularization parameter
  • Validation approaches
  • Local Polynomial Estimators
  • Reading Material: Sections 5.3 and 5.4 of Wasserman.
Lecture 11 - Smoothing Splines, and Variance Estimation
Lecture 12 - Confidence Bands for Linear Smoothers, and Penalized Likelihood Methods
  • On a first reading you can skip the concepts of Kullback-Leibler divergence, Hellinger distance and affinity, and just read and interpret Corollary 1 without proof.
Lecture 13 - Penalized Likelihood Methods and Adaptation to Unknown Smoothness
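
As a small companion to Lectures 5 and 6, the lines below give a minimal R sketch of the nonparametric bootstrap: the standard error of the sample median is estimated by resampling the data with replacement. The data, the statistic and the number of bootstrap replicates are arbitrary choices made purely for illustration.

    # Minimal sketch of the nonparametric bootstrap (illustration only)
    set.seed(1)                              # for reproducibility
    x <- rexp(50, rate = 1)                  # some example data (arbitrary choice)
    B <- 1000                                # number of bootstrap replicates
    theta.star <- replicate(B, median(sample(x, replace = TRUE)))
    se.boot <- sd(theta.star)                # bootstrap estimate of the standard error of the median
    se.boot

Resampling with replacement from the observed data mimics sampling from the underlying distribution, which is the key idea behind the bootstrap.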

Homework

General Rules:
You should hand in your answers at the beginning of class (or, in exceptional cases only, via email). Answers may be handwritten, but please write clearly and keep them organized.
I encourage you to work together in groups of at most two or three students. You can hand in your answers as a group. However, if I notice that a member of the group is not actively taking part in the discussion and writing of the answers, I reserve the right to ask them to write separate reports (copy-paste is not allowed).

Homework handed in on time counts towards the final grade. Let E denote the exam grade and H the combined homework grade (on a 0-10 scale). The final grade F is computed as
F = max(0.75 E + 0.25 H, E)  if E >= 5.0
F = E  if E < 5.0
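For example, with hypothetical grades E = 7.0 and H = 9.0 the final grade would be F = max(0.75*7.0 + 0.25*9.0, 7.0) = max(7.5, 7.0) = 7.5; since the maximum is taken, homework can only raise the grade. With E = 4.5 the homework does not count and F = 4.5.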


Course materials

The contents of the course will be taken from various sources. Part of the material will closely follow the book
  • W: L. Wasserman, "All of Nonparametric Statistics", Springer (ISBN-10: 0387251456)
  • N: Throughout the course there will be various handouts, articles and lecture notes.

Prerequisites:

Basic knowledge of probability and statistics, and a sufficient level of mathematical maturity.

Format and Evaluation:

Throughout the course you will be required to solve some homework exercises. Some of these will be theoretical, while others will be practical and will often require the use of computational tools. It is recommended that you use the statistical computing package R for these, but this is not mandatory, and you can use any other software you have access to and feel comfortable with. Homework exercises that are handed in on time will be graded and provide extra credit points towards the final grade.

At the end of the course there is a final written examination (date is not yet set by MasterMath, but likely to be very early in June).

Instructor:

Rui Manuel Castro
Web: https://www.win.tue.nl/~rmcastro
Phone: (+31) 40 247 2499
Office: TU/e, MetaForum MF4.073
Email: (please include the text AS2013 in the subject of the email, as this makes it easier for me to sort course email from all the rest).

Tentative contents and organization:

The first 5 to 6 weeks are devoted to the following topics:
  • Introduction to nonparametric inference and the empirical distribution function (W, chapter 1 and section 2.1); a short R illustration is given after this list
  • Goodness of fit tests (N)
  • Permutation tests and rank tests (N)
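
The sketch below is a minimal R illustration of the first two topics: it computes and plots the empirical distribution function of a sample and runs a Kolmogorov-Smirnov goodness-of-fit test. The simulated data and the hypothesized standard normal distribution are arbitrary choices used only for illustration.

    # Empirical CDF and a Kolmogorov-Smirnov goodness-of-fit test (illustration only)
    set.seed(2)
    x <- rnorm(100)                  # simulated sample (arbitrary choice)
    Fhat <- ecdf(x)                  # the empirical distribution function, returned as a function
    Fhat(0)                          # estimate of P(X <= 0)
    plot(Fhat, main = "Empirical CDF")
    ks.test(x, "pnorm")              # test H0: the data come from a standard normal distribution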

Weeks 7 through 13 will be used to cover various topics in nonparametric statistics, following roughly chapters 4 and 5 of W. These chapters are on smoothing and nonparametric regression. The simplest classical linear regression model assumes that the relation between a response variable $Y$ and a predictor variable $X$ can be modeled by a straight line. However, this may not be appropriate. Nonparametric regression aims to fit a curve while making as few assumptions as possible. We will discuss various approaches to this problem, such as local regression and penalized regression. Besides being practically relevant, these methods also raise mathematically interesting questions. If the outcome of an experiment is non-normal, for example binary, the same principles can still be used; this leads to nonparametric logistic regression or, more generally, nonparametric generalized linear models. If time permits we will also treat the case of multiple predictors, leading to additive models.
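
As a small preview of this part of the course, the R sketch below fits a curve to simulated data with two of the approaches mentioned above: a local (kernel) regression estimate and a penalized regression estimate (a smoothing spline). The data-generating curve, the noise level and the bandwidth are arbitrary choices made only for illustration.

    # Nonparametric regression on simulated data (illustration only)
    set.seed(3)
    n <- 200
    x <- runif(n)
    y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)                        # the true curve is not a straight line
    fit.local  <- ksmooth(x, y, kernel = "normal", bandwidth = 0.2)  # local (kernel) regression
    fit.spline <- smooth.spline(x, y)                                # penalized regression (smoothing spline)
    plot(x, y, col = "grey")
    lines(fit.local,  col = "blue", lwd = 2)
    lines(fit.spline, col = "red",  lwd = 2)

Both estimators trade bias against variance through a tuning parameter (the bandwidth or the smoothing penalty), a theme that returns throughout the second half of the course.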




Last change: Mon Jun-17-13 09:03:02
© Rui Castro 2013