Large-Scale Estimation and Testing Under Heteroscedasticity
Abstract: Simultaneous inference on many parameters, each based on its own set of observations, is a key research problem that has received much attention in the high-dimensional setting. Many practical situations involve heterogeneous data, most commonly unknown effect sizes observed with heteroscedastic errors. Effectively pooling information across samples while correctly accounting for heterogeneity presents a significant challenge in large-scale inference. The first part of my talk addresses the selection bias issue in large-scale estimation problems by introducing the “Nonparametric Empirical Bayes Smoothing Tweedie” (NEST) estimator, which efficiently estimates the unknown effect sizes and properly adjusts for heterogeneity via a generalized version of Tweedie’s formula. The second part of my talk focuses on a parallel issue in multiple testing. We show that basing hypothesis tests on standardized statistics rather than the full data can lead to a significant loss of information. We develop a new class of heteroscedasticity-adjusted ranking and thresholding (HART) rules that aim to improve on existing methods by simultaneously exploiting commonalities and adjusting for heterogeneities among the study units. The common message of NEST and HART is that the variance structure, which is suppressed when the data are reduced to standardized statistics, is highly informative and can be exploited to achieve better performance in both shrinkage estimation and multiple testing.
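
To make the idea of a heteroscedastic Tweedie correction concrete, here is a minimal sketch. It is not the NEST estimator from the talk. It assumes the normal model X_i | mu_i, sigma_i ~ N(mu_i, sigma_i^2) with known sigma_i, under which Tweedie's formula reads E[mu_i | x_i, sigma_i] = x_i + sigma_i^2 * d/dx log f_{sigma_i}(x_i), where f_sigma is the marginal density of the observations that share standard deviation sigma. The unknown density is replaced by a plain Gaussian kernel estimate smoothed over both x and sigma. The function name `tweedie_shrink` and the bandwidths `hx` and `hs` are hypothetical choices made for illustration only.

```python
import numpy as np


def tweedie_shrink(x, sigma, hx, hs):
    """Heteroscedastic Tweedie estimate: x_i + sigma_i^2 * d/dx log f_{sigma_i}(x_i).

    The marginal density f_sigma(x) is unknown, so it is replaced by a
    Gaussian kernel estimate smoothed over both x and sigma. The bandwidths
    hx and hs are user-chosen tuning parameters (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dx = x[None, :] - x[:, None]          # dx[i, j] = x_j - x_i
    ds = sigma[None, :] - sigma[:, None]  # ds[i, j] = sigma_j - sigma_i
    # Kernel weight of observation j when estimating the density at (x_i, sigma_i).
    w = np.exp(-0.5 * (dx / hx) ** 2 - 0.5 * (ds / hs) ** 2)
    # Ratio f'/f: the kernel normalising constants cancel, so they are omitted.
    score = (w * dx).sum(axis=1) / (hx ** 2 * w.sum(axis=1))
    return x + sigma ** 2 * score


# Simulated example (assumed setup): standard normal effects, unequal noise levels.
rng = np.random.default_rng(0)
n = 1000
mu = rng.normal(0.0, 1.0, n)
sigma = rng.uniform(0.5, 1.5, n)
x = mu + sigma * rng.normal(size=n)

est = tweedie_shrink(x, sigma, hx=0.4, hs=0.2)
print("MSE of raw observations:", np.mean((x - mu) ** 2))
print("MSE of Tweedie estimate:", np.mean((est - mu) ** 2))
```

Because the correction term is multiplied by sigma_i^2, noisier observations are shrunk more strongly. This is the kind of information that is lost once each observation is reduced to the standardized statistic x_i / sigma_i.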