Theoretical Statistics and Mathematics Unit, ISI Delhi
A high-dimensional model is one where the number of parameters is larger than the number of observations. One then needs some form of complexity regularization in order to estimate these many parameters consistently. We will focus on $L_1$-regularization and, more generally, on sparsity-inducing penalties. A parameter vector is sparse if only a few of its entries are non-zero (or, more loosely, if all but a few of its entries are very small).
One of the aims of these regularized estimation methods is to arrive at so-called oracle inequalities, which say that the estimator mimics the one you would use if you knew which parameters are redundant. We will explain this for the linear regression model, using the Lasso, and then extend it to non-linear models and general structured sparsity penalties.
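As a rough illustration of the Lasso in the linear model, the sketch below minimizes $\|y - Xb\|_2^2/(2n) + \lambda \|b\|_1$ by cyclic coordinate descent with soft-thresholding. This is one standard way to compute the estimator, not necessarily the one used in the lectures. The function names `soft_threshold` and `lasso_cd`, the scaling of the objective, the toy data, and the value of `lam` are all assumptions made for this example.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """Approximately minimize ||y - X b||_2^2 / (2n) + lam * ||b||_1.

    X is a list of n rows of length p, y a list of length n.
    Uses a fixed number of cyclic coordinate-descent sweeps (an assumption;
    a real implementation would check convergence).
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that ignores coordinate j.
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


# Toy high-dimensional example: p = 50 parameters, n = 20 observations,
# and a sparse (hypothetical) truth with two non-zero coefficients.
random.seed(0)
n, p = 20, 50
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
beta_true = [3.0, -2.0] + [0.0] * (p - 2)
y = [sum(X[i][j] * beta_true[j] for j in range(p)) + 0.1 * random.gauss(0.0, 1.0)
     for i in range(n)]

beta_hat = lasso_cd(X, y, lam=0.3)
support = [j for j, bj in enumerate(beta_hat) if bj != 0.0]
print("indices of non-zero estimated coefficients:", support)
```

The soft-thresholding step is what sets coefficients exactly to zero, so the estimate is itself sparse. A larger `lam` gives a sparser estimate, and a sufficiently large `lam` gives the zero vector.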
The oracle inequalities in turn can be used to construct asymptotic confidence intervals for the parameters. We will show how this can be done for the linear and for the graphical model where one estimates the entries of a high-dimensional covariance matrix.
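One common construction in the linear model, given here only as a hedged sketch and not necessarily the one followed in the lectures, is the de-sparsified (de-biased) Lasso. The symbols $\hat\Theta$, $\hat\Sigma$, $\hat\sigma$ and $z_{1-\alpha/2}$ are notation introduced for this illustration. Let $X$ be the $n \times p$ design matrix, $Y$ the response vector, $\hat\beta$ the Lasso estimator, $\hat\Sigma = X^T X / n$, and $\hat\Theta$ an approximate inverse of $\hat\Sigma$ (obtained, for example, by nodewise regression). Set

$$\hat b = \hat\beta + \hat\Theta X^T (Y - X \hat\beta) / n .$$

Under suitable sparsity and design conditions, an asymptotic $(1-\alpha)$ confidence interval for the $j$-th coefficient then takes the form

$$\hat b_j \pm z_{1-\alpha/2} \, \hat\sigma \sqrt{(\hat\Theta \hat\Sigma \hat\Theta^T)_{jj} / n} ,$$

where $\hat\sigma$ is a consistent estimate of the noise standard deviation and $z_{1-\alpha/2}$ is the standard normal quantile.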
We pay special attention to misspecification, where the (high-dimensional) model is wrong. We discuss the interpretation of the estimators and confidence intervals from a learning point of view.
See the lecture notes at arXiv for more details.