A framework to efficiently smooth L1 penalties for linear regression

Hahn, G.; Lutz, S. M.; Laha, N.; Lange, C.

bioRxiv (bioinformatics), posted 2020-09-19. DOI: 10.1101/2020.09.17.301788
Penalized linear regression approaches that include an L1 term have become an important tool in statistical data analysis. One prominent example is the least absolute shrinkage and selection operator (Lasso), though the class of L1 penalized regression operators also includes the fused and graphical Lasso and the elastic net, among others. Although the L1 penalty keeps their objective functions convex, it makes them not differentiable everywhere, motivating the development of proximal gradient algorithms such as FISTA, the current gold standard in the literature. In this work, we take a different approach based on smoothing in a fixed parameter setting (the problem size n and the number of parameters p are fixed). The methodological contribution of our article is threefold: (1) We introduce a unified framework to compute closed-form smooth surrogates of a whole class of L1 penalized regression problems using Nesterov smoothing. The surrogates preserve the convexity of the original (unsmoothed) objective functions, are uniformly close to them, and have closed-form derivatives everywhere, allowing efficient minimization via gradient descent; (2) We prove that the estimates obtained with the smooth surrogates can be made arbitrarily close to those of the original (unsmoothed) objective functions, and we provide explicitly computable a priori error bounds on the accuracy of our estimates; (3) We propose an iterative algorithm to progressively smooth the L1 penalty, which increases accuracy and is virtually free of tuning parameters. The proposed methodology is applicable to a large class of L1 penalized regression operators, including all of those mentioned above. Although the resulting estimates are typically dense, sparseness can be enforced again via thresholding. Using simulation studies, we compare our framework to current gold standards such as FISTA, glmnet, and gLasso. Our results suggest that the proposed smoothing framework provides predictions of equal or higher accuracy than the gold standards while keeping the aforementioned theoretical guarantees and having roughly the same asymptotic runtime scaling.
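To make the smoothing idea concrete, below is a minimal sketch for the plain Lasso, assuming the entropy prox-function variant of Nesterov smoothing: |x| is replaced by f_mu(x) = mu * log((exp(x/mu) + exp(-x/mu)) / 2), which is differentiable everywhere, satisfies the explicitly computable uniform bound |x| - mu*log(2) <= f_mu(x) <= |x|, and has derivative tanh(x/mu). The function names, the mu schedule, and the threshold are illustrative choices, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize

def smooth_abs(b, mu):
    # Nesterov-smoothed absolute value (entropy prox-function):
    # uniformly within mu*log(2) of |b|, differentiable everywhere.
    return mu * (np.logaddexp(b / mu, -b / mu) - np.log(2.0))

def objective(beta, X, y, lam, mu):
    # Smooth surrogate of the Lasso objective (1/n)*||y - X beta||^2 + lam*||beta||_1.
    resid = y - X @ beta
    return resid @ resid / len(y) + lam * smooth_abs(beta, mu).sum()

def gradient(beta, X, y, lam, mu):
    # Closed-form gradient; d/db smooth_abs(b, mu) = tanh(b / mu).
    return 2.0 / len(y) * X.T @ (X @ beta - y) + lam * np.tanh(beta / mu)

def progressive_smooth_lasso(X, y, lam, mus=(1.0, 0.1, 0.01, 0.001), thresh=1e-3):
    # Minimize the surrogate for a decreasing sequence of mu, warm-starting
    # each solve at the previous solution; threshold at the end to restore sparsity.
    beta = np.zeros(X.shape[1])
    for mu in mus:
        res = minimize(objective, beta, jac=gradient,
                       args=(X, y, lam, mu), method="L-BFGS-B")
        beta = res.x
    beta[np.abs(beta) < thresh] = 0.0
    return beta

# Toy usage: recover a sparse coefficient vector from noisy observations.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta_true = np.zeros(20)
beta_true[:3] = (2.0, -1.5, 1.0)
y = X @ beta_true + 0.1 * rng.standard_normal(100)
print(progressive_smooth_lasso(X, y, lam=0.1))

Warm-starting each solve at the previous estimate keeps the progressive schedule cheap, and the final thresholding step re-imposes the sparsity that smoothing removes.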

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Rank  Journal                                                         Papers in training set  Percentile  Probability
1     Bioinformatics                                                  1061                    Top 0.5%    37.8%
2     Statistics in Medicine                                          34                      Top 0.1%    6.8%
3     BMC Bioinformatics                                              383                     Top 2%      4.9%
4     Biostatistics                                                   21                      Top 0.1%    3.7%
----- 50% of probability mass above this line -----
5     Biometrics                                                      22                      Top 0.1%    3.6%
6     Nature Communications                                           4913                    Top 39%     3.6%
7     PLOS ONE                                                        4510                    Top 39%     3.6%
8     The Annals of Applied Statistics                                15                      Top 0.1%    3.6%
9     PLOS Computational Biology                                      1633                    Top 12%     2.7%
10    Scientific Reports                                              3102                    Top 45%     2.6%
11    Briefings in Bioinformatics                                     326                     Top 4%      1.8%
12    Journal of Computational Biology                                37                      Top 0.2%    1.8%
13    NeuroImage                                                      813                     Top 4%      1.8%
14    Proceedings of the National Academy of Sciences                 2130                    Top 35%     1.5%
15    PLOS Genetics                                                   756                     Top 11%     1.2%
16    Frontiers in Genetics                                           197                     Top 7%      1.0%
17    Algorithms for Molecular Biology                                15                      Top 0.1%    0.8%
18    Communications Biology                                          886                     Top 26%     0.7%
19    BMC Genomics                                                    328                     Top 6%      0.7%
20    IEEE Transactions on Computational Biology and Bioinformatics  17                      Top 0.8%    0.6%
21    Frontiers in Computational Neuroscience                         53                      Top 2%      0.6%
22    Imaging Neuroscience                                            242                     Top 4%      0.6%
23    Interface Focus                                                 14                      Top 0.4%    0.6%