Lasso regression under stochastic restrictions in linear regression: An application to genomic data


Genc M., Özkale M. R.

COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, vol. 53, no. 8, pp. 2816-2839, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 53 Issue: 8
  • Publication Date: 2024
  • DOI: 10.1080/03610926.2022.2149243
  • Journal Name: COMMUNICATIONS IN STATISTICS-THEORY AND METHODS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Business Source Elite, Business Source Premier, CAB Abstracts, Compendex, Veterinary Science Database, zbMATH, Civil Engineering Abstracts
  • Page Numbers: pp. 2816-2839
  • Keywords: Bayesian information criterion, coordinate descent algorithm, genomic data, lasso, variable selection
  • Çukurova University Affiliated: Yes

Abstract

Variable selection approaches are often employed in problems involving high dimensionality and multicollinearity. Since the lasso selects variables by shrinking the coefficients, it is widely used in many fields. On the other hand, we may sometimes have extra information on the model, and in that case the extra information should be incorporated into the estimation procedure. In this paper, we propose a stochastic restricted lasso estimator for the linear regression model that uses the extra information in the form of stochastic linear restrictions. The estimator is a generalization of the mixed estimator with an L1-type penalty. We give a coordinate descent algorithm to estimate the coefficient vector of the proposed method, together with strong rules that allow the algorithm to discard variables from the model. We also propose a method to estimate the tuning parameter. We conduct two real data analyses and simulation studies to compare the new estimator with several estimators, including the ridge, lasso, and stochastic restricted ridge. The real data analyses and simulation studies show that the new estimator enjoys the automatic variable selection property of the lasso while outperforming these standard methods, achieving lower test mean squared error.
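
For illustration only, the sketch below shows one way such an estimator could be computed: the stochastic linear restrictions r = Rβ + e are appended to (X, y) as weighted pseudo-observations, in the spirit of mixed estimation, and a standard coordinate descent lasso update is applied to the augmented data. The function names, the restriction weight omega, and the stopping rule are assumptions made for this sketch, not the paper's exact algorithm, which also incorporates strong rules for discarding variables and a proposed tuning-parameter estimator.

```python
import numpy as np

def soft_threshold(z, gamma):
    """Soft-thresholding operator used in coordinate descent for L1 penalties."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def stochastic_restricted_lasso(X, y, R, r, lam, omega=1.0, n_iter=100, tol=1e-6):
    """Illustrative sketch (not the paper's exact method): lasso with stochastic
    linear restrictions r = R @ beta + e, handled by appending the restrictions
    to the data as weighted pseudo-observations. `omega` is an assumed scalar
    weight reflecting how much the restrictions are trusted."""
    # Augment the design matrix and response with the weighted restrictions.
    Xa = np.vstack([X, np.sqrt(omega) * R])
    ya = np.concatenate([y, np.sqrt(omega) * r])
    p = Xa.shape[1]
    beta = np.zeros(p)
    col_sq = (Xa ** 2).sum(axis=0)      # per-coordinate curvature terms
    resid = ya - Xa @ beta
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # Partial residual with coordinate j removed.
            resid += Xa[:, j] * beta[j]
            zj = Xa[:, j] @ resid
            # Soft-thresholded univariate update for the L1-penalized objective.
            beta[j] = soft_threshold(zj, lam) / col_sq[j]
            resid -= Xa[:, j] * beta[j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta
```

A grid of candidate λ values could then be scored with a BIC-type criterion to choose the tuning parameter, as the abstract indicates, though the specific criterion and the strong rules are defined in the paper itself.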