
Statistical inference with high-dimensional data
Cun-Hui Zhang, Rutgers University
We consider a semi-low-dimensional approach to statistical inference with high-dimensional
data. The approach is best described with the following model statement:
model = low-dimensional component + high-dimensional component.
The main objective of this approach is to develop asymptotically efficient statistical
inference procedures for the low-dimensional component, such as p-values and confidence
regions. Just as in semiparametric inference, a sufficiently accurate estimate of
the high-dimensional component is required in order to carry out the inference for the
low-dimensional component. The feasibility of estimating the high-dimensional component
at the required accuracy depends on the model complexity and ill-posedness,
signal strength, the type of low-dimensional inference problem under consideration, and
sometimes the availability of certain ancillary information. We will consider linear regression
and the Gaussian graphical model as primary examples. We will describe concave penalized
methods that take advantage of partial signal strength, strategies and algorithms for de-biasing
the Lasso and concave penalized estimators, the sample size requirement for the
de-biasing methods to work, and the contributions of unlabeled data in semi-supervised
regression.
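To make the de-biasing idea concrete in the linear regression example, the following is a minimal sketch, not taken from the lectures, of one common form of the de-biased Lasso for a single coefficient. The low-dimensional component is one coefficient beta_j, and the high-dimensional component is the remaining coefficient vector. An initial Lasso estimate is corrected along a score vector z_j, obtained here as the residual of a Lasso regression of column j on the other columns. The function names, the tuning parameters lam and lam_j, the crude noise-level estimate, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Scalar soft-thresholding operator."""
    return np.sign(v) * max(abs(v) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # x_j' x_j / n for each column
    r = np.array(y, dtype=float)        # running residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]         # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]         # put the updated coordinate back
    return b

def debiased_lasso_coef(X, y, j, lam, lam_j):
    """De-biased estimate of beta_j and a plug-in standard error (sketch)."""
    n, p = X.shape
    beta = lasso_cd(X, y, lam)                      # initial Lasso estimate
    others = np.delete(np.arange(p), j)
    gamma = lasso_cd(X[:, others], X[:, j], lam_j)  # regress x_j on the rest
    z = X[:, j] - X[:, others] @ gamma              # score vector z_j
    resid = y - X @ beta
    b_j = beta[j] + z @ resid / (z @ X[:, j])       # bias-corrected estimate
    # Assumption: noise level estimated from Lasso residuals with a
    # degrees-of-freedom adjustment; other estimators could be used.
    df = max(n - np.count_nonzero(beta), 1)
    sigma = np.sqrt(resid @ resid / df)
    se = sigma * np.linalg.norm(z) / abs(z @ X[:, j])
    return b_j, se

# Simulated example (hypothetical data): sparse signal with p > n.
rng = np.random.default_rng(0)
n, p = 60, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_normal(n)
lam = np.sqrt(2 * np.log(p) / n)        # a conventional universal-type level
b0, se0 = debiased_lasso_coef(X, y, j=0, lam=lam, lam_j=lam)
ci = (b0 - 1.96 * se0, b0 + 1.96 * se0) # nominal 95% interval for beta_0
```

The interval in `ci` is only nominally valid: as the abstract notes, whether the correction works depends on the sample size, the sparsity of the high-dimensional component, and the design.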
Slides:
Zhang_lectures.pdf

