A robust ensemble feature selector based on rank aggregation for developing new VO(2)max prediction models using support vector machines


Abut F., AKAY M. F., George J.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.27, no.5, pp.3648-3664, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 27 Issue: 5
  • Publication Date: 2019
  • DOI: 10.3906/elk-1808-138
  • Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, TR DİZİN (ULAKBİM)
  • Page Numbers: pp.3648-3664
  • Keywords: Ensemble feature selection, rank aggregation, support vector machine, maximal oxygen uptake, prediction
  • Affiliated with Çukurova University: Yes

Abstract

This paper proposes a new ensemble feature selector, called the majority voting feature selector (MVFS), for developing new maximal oxygen uptake (VO(2)max) prediction models using a support vector machine (SVM). The approach is based on rank aggregation, which exploits the correlation among the relevance ranks assigned to the predictor variables by three state-of-the-art feature selectors: Relief-F, minimum redundancy maximum relevance (mRMR), and maximum likelihood feature selection (MLFS). Applying the SVM combined with MVFS to a self-created dataset containing maximal and submaximal exercise data from 185 college students yielded several new hybrid VO(2)max prediction models. To benchmark the proposed ensemble approach, SVM-based models were also developed with Relief-F, mRMR, and MLFS individually, as well as with alternative ensemble feature selectors from the literature. The results show that MVFS outperforms the individual and the other ensemble feature selectors, increasing the multiple correlation coefficient (R) by up to 8.76% and decreasing the root mean square error (RMSE) by up to 11.15%. In addition to reconfirming the relevance of sex, age, and maximal heart rate in predicting VO(2)max, as previously reported in the literature, the results reveal that submaximal heart rate and exercise time over the 1.5-mile distance are two further discriminative predictors of VO(2)max. The results were also compared with those obtained by a general regression neural network and a single decision tree, each combined with MVFS, and the SVM performs much better than these methods for predicting VO(2)max.
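The rank-aggregation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how MVFS combines the three rankings, so the code below substitutes a plain Borda-style mean-rank aggregation, which may differ from the authors' procedure. The function name `aggregate_ranks`, the feature labels, and the example rankings are all hypothetical and are not results from the paper.

```python
from statistics import mean


def aggregate_ranks(rank_lists):
    """Combine several feature rankings by mean rank position (Borda-style).

    rank_lists maps a selector name to a list of feature names ordered from
    most to least relevant. Every list must contain the same features.
    Returns (order, mean_rank): the features sorted by ascending mean rank
    (ties broken alphabetically) and the mean rank of each feature.
    """
    features = sorted(next(iter(rank_lists.values())))
    positions = {f: [] for f in features}
    for name, ranking in rank_lists.items():
        if sorted(ranking) != features:
            raise ValueError(f"ranking from {name!r} covers a different feature set")
        for pos, f in enumerate(ranking, start=1):
            positions[f].append(pos)
    mean_rank = {f: mean(p) for f, p in positions.items()}
    order = sorted(features, key=lambda f: (mean_rank[f], f))
    return order, mean_rank


# Invented rankings for illustration only; NOT taken from the paper.
example_rankings = {
    "relief_f": ["sex", "age", "max_hr", "time_1_5_mile", "submax_hr"],
    "mrmr": ["sex", "time_1_5_mile", "age", "submax_hr", "max_hr"],
    "mlfs": ["age", "sex", "time_1_5_mile", "max_hr", "submax_hr"],
}

order, mean_rank = aggregate_ranks(example_rankings)
top_k = order[:3]  # keep the k best-ranked features as model inputs
```

In a pipeline of this kind, the features in `top_k` would then be passed to a regression model such as an SVM, and the subset size would be chosen by comparing R and RMSE across candidate values of k.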