Statistical inference for the population landscape via moment-adjusted stochastic gradients
Modern statistical inference tasks often require iterative optimization methods to compute the solution. Convergence analysis from an optimization viewpoint tells us only how well the solution is approximated numerically and overlooks the sampling nature of the data. Recognizing this randomness, statisticians instead seek to provide uncertainty quantification, or confidence, for solutions obtained via iterative optimization methods. This paper makes progress in that direction by introducing moment-adjusted stochastic gradient descent, a new stochastic optimization method for statistical inference. We establish non-asymptotic theory that characterizes the statistical distribution of certain iterative methods with optimization guarantees. On the statistical front, the theory allows for model misspecification, under very mild conditions on the data. On the optimization front, it covers both convex and non-convex cases. Remarkably, the moment-adjustment idea, motivated by 'error standardization' in statistics, achieves an effect similar to acceleration in the first-order optimization methods used to fit generalized linear models. We also demonstrate this acceleration effect in the non-convex setting through numerical experiments.
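To make the 'error standardization' idea concrete, the following is a minimal sketch of a moment-adjusted stochastic gradient loop: each noisy gradient is rescaled by a running estimate of its second moment before the update. This is an illustrative assumption about the general form of such a method, not the paper's exact adjustment; the function names, the toy least-squares objective, and the coordinate-wise standardization are all hypothetical choices for the example.

```python
import numpy as np

def moment_adjusted_sgd(grad_fn, theta0, n_steps=500, lr=0.1, eps=1e-8, seed=0):
    """Illustrative moment-adjusted SGD (hypothetical form, not the
    paper's exact algorithm): each stochastic gradient is standardized
    by a running estimate of its coordinate-wise second moment."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    second_moment = np.zeros_like(theta)
    for t in range(1, n_steps + 1):
        g = grad_fn(theta, rng)  # noisy gradient at the current iterate
        # running average of coordinate-wise second moments of the gradient
        second_moment += (g * g - second_moment) / t
        # 'error standardization': divide the update by the estimated scale
        theta = theta - lr * g / np.sqrt(second_moment + eps)
    return theta

# Toy usage: a least-squares-style objective with noisy gradients,
# whose population minimizer sits at theta = 3.0.
def noisy_grad(theta, rng):
    return 2.0 * (theta - 3.0) + rng.normal(scale=0.5, size=theta.shape)

theta_hat = moment_adjusted_sgd(noisy_grad, np.zeros(1))
```

In this sketch the standardization acts as a diagonal preconditioner, which is one simple way to see why moment adjustment can mimic the effect of acceleration for ill-scaled problems.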