A framework to efficiently smooth L1 penalties for linear regression

Hahn, G.; Lutz, S. M.; Laha, N.; Lange, C.

bioRxiv (bioinformatics), posted 2020-09-19. DOI: 10.1101/2020.09.17.301788
Penalized linear regression approaches that include an L1 term have become an important tool in statistical data analysis. One prominent example is the least absolute shrinkage and selection operator (Lasso), though the class of L1 penalized regression operators also includes the fused and graphical Lasso and the elastic net, among others. Although the L1 penalty keeps their objective functions convex, it makes them not differentiable everywhere, motivating the development of proximal gradient algorithms such as FISTA, the current gold standard in the literature. In this work, we take a different approach based on smoothing in a fixed parameter setting (the problem size n and the number of parameters p are fixed). The methodological contribution of our article is threefold: (1) We introduce a unified framework to compute closed-form smooth surrogates of a whole class of L1 penalized regression problems using Nesterov smoothing. The surrogates preserve the convexity of the original (unsmoothed) objective functions, are uniformly close to them, and have closed-form derivatives everywhere, allowing efficient minimization via gradient descent; (2) We prove that the estimates obtained with the smooth surrogates can be made arbitrarily close to those of the original (unsmoothed) objective functions, and we provide explicitly computable a priori error bounds on the accuracy of our estimates; (3) We propose an iterative algorithm to progressively smooth the L1 penalty, which increases accuracy and is virtually free of tuning parameters. The proposed methodology is applicable to a large class of L1 penalized regression operators, including all of those mentioned above. Although the resulting estimates are typically dense, sparseness can be enforced again via thresholding. Using simulation studies, we compare our framework to current gold standards such as FISTA, glmnet, and gLasso. Our results suggest that the proposed smoothing framework provides predictions of equal or higher accuracy than the gold standards while keeping the aforementioned theoretical guarantees and having roughly the same asymptotic runtime scaling.
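To make the smoothing idea concrete, below is a minimal sketch for the plain Lasso, assuming the entropy prox-function variant of Nesterov smoothing: |x| is replaced by f_mu(x) = mu * log((exp(x/mu) + exp(-x/mu)) / 2), which is differentiable everywhere, satisfies the explicitly computable uniform bound |x| - mu*log(2) <= f_mu(x) <= |x|, and has derivative tanh(x/mu). The function names, the mu schedule, and the threshold are illustrative choices, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize

def smooth_abs(b, mu):
    # Nesterov-smoothed absolute value (entropy prox-function):
    # uniformly within mu*log(2) of |b|, differentiable everywhere.
    return mu * (np.logaddexp(b / mu, -b / mu) - np.log(2.0))

def objective(beta, X, y, lam, mu):
    # Smooth surrogate of the Lasso objective (1/n)*||y - X beta||^2 + lam*||beta||_1.
    resid = y - X @ beta
    return resid @ resid / len(y) + lam * smooth_abs(beta, mu).sum()

def gradient(beta, X, y, lam, mu):
    # Closed-form gradient; d/db smooth_abs(b, mu) = tanh(b / mu).
    return 2.0 / len(y) * X.T @ (X @ beta - y) + lam * np.tanh(beta / mu)

def progressive_smooth_lasso(X, y, lam, mus=(1.0, 0.1, 0.01, 0.001), thresh=1e-3):
    # Minimize the surrogate for a decreasing sequence of mu, warm-starting
    # each solve at the previous solution; threshold at the end to restore sparsity.
    beta = np.zeros(X.shape[1])
    for mu in mus:
        res = minimize(objective, beta, jac=gradient,
                       args=(X, y, lam, mu), method="L-BFGS-B")
        beta = res.x
    beta[np.abs(beta) < thresh] = 0.0
    return beta

# Toy usage: recover a sparse coefficient vector from noisy observations.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta_true = np.zeros(20)
beta_true[:3] = (2.0, -1.5, 1.0)
y = X @ beta_true + 0.1 * rng.standard_normal(100)
print(progressive_smooth_lasso(X, y, lam=0.1))

Warm-starting each solve at the previous estimate keeps the progressive schedule cheap, and the final thresholding step re-imposes the sparsity that smoothing removes.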

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Rank  Journal                                                         Papers in training set  Percentile  Probability
1     Bioinformatics                                                  1061                    Top 0.5%    37.8%
2     Statistics in Medicine                                          34                      Top 0.1%    6.8%
3     BMC Bioinformatics                                              383                     Top 2%      4.9%
4     Biostatistics                                                   21                      Top 0.1%    3.7%
----- 50% of probability mass above this line -----
5     Biometrics                                                      22                      Top 0.1%    3.6%
6     Nature Communications                                           4913                    Top 39%     3.6%
7     PLOS ONE                                                        4510                    Top 39%     3.6%
8     The Annals of Applied Statistics                                15                      Top 0.1%    3.6%
9     PLOS Computational Biology                                      1633                    Top 12%     2.7%
10    Scientific Reports                                              3102                    Top 45%     2.6%
11    Briefings in Bioinformatics                                     326                     Top 4%      1.8%
12    Journal of Computational Biology                                37                      Top 0.2%    1.8%
13    NeuroImage                                                      813                     Top 4%      1.8%
14    Proceedings of the National Academy of Sciences                 2130                    Top 35%     1.5%
15    PLOS Genetics                                                   756                     Top 11%     1.2%
16    Frontiers in Genetics                                           197                     Top 7%      1.0%
17    Algorithms for Molecular Biology                                15                      Top 0.1%    0.8%
18    Communications Biology                                          886                     Top 26%     0.7%
19    BMC Genomics                                                    328                     Top 6%      0.7%
20    IEEE Transactions on Computational Biology and Bioinformatics  17                      Top 0.8%    0.6%
21    Frontiers in Computational Neuroscience                         53                      Top 2%      0.6%
22    Imaging Neuroscience                                            242                     Top 4%      0.6%
23    Interface Focus                                                 14                      Top 0.4%    0.6%