Predicting COVID-19 Infection Using Machine Learning Methods Combined with Feature Selection


Çetin U. A., Abut F.

Avrupa Bilim ve Teknoloji Dergisi, vol.37, pp.52-58, 2022 (Peer-Reviewed Journal)

  • Publication Type: Article / Full Article
  • Volume: 37
  • Publication Date: 2022
  • DOI: 10.31590/ejosat.1132337
  • Journal Name: Avrupa Bilim ve Teknoloji Dergisi
  • Indexes Covering the Journal: TR DİZİN (ULAKBİM)
  • Pages: pp.52-58
  • Affiliated with Çukurova University: Yes

Abstract

COVID-19 is an infection that has affected the world since December 31, 2019, and was declared a pandemic by the WHO in March 2020. In this study, Multi-Layer Perceptron (MLP), Tree Boost (TB), Radial Basis Function Network (RBF), Support Vector Machine (SVM), and K-Means Clustering (kMC) were each combined with two feature selection algorithms, minimum redundancy maximum relevance (mRMR) and Relief-F, to construct new feature selection-based COVID-19 prediction models and to identify the variables that most influence the prediction of COVID-19 infection. The dataset contains information on 20,000 patients (10,000 positive, 10,000 negative) and includes several personal, symptomatic, and non-symptomatic variables. Accuracy, recall, and F1-score were used to assess the models' performance, and the generalization errors of the models were evaluated using 10-fold cross-validation. The results show that, on average, mRMR performs slightly better than Relief-F in predicting whether a patient is infected with COVID-19. In addition, mRMR is more successful than Relief-F in finding the relative relevance order of the COVID-19 predictors. The mRMR algorithm emphasizes symptomatic variables such as fever and cough, whereas the Relief-F algorithm highlights non-symptomatic variables such as age and race. It has also been observed that, in general, MLP outperforms all other classifiers in predicting COVID-19 infection.
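
The mRMR idea named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes discrete features, uses the common "difference" form of the criterion (relevance to the target minus mean redundancy with already selected features), and runs on a tiny hypothetical dataset. The function names `mutual_information` and `mrmr_rank`, the feature names, and all values are made up for illustration only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def mrmr_rank(features, target, k):
    """Greedy mRMR ranking (difference form) over a dict of feature columns."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]  # first pick: relevance only
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 patients, label 1 = positive, 0 = negative.
target = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "fever":      [1, 1, 1, 1, 0, 0, 0, 1],  # strongly related to the label
    "fever_copy": [1, 1, 1, 1, 0, 0, 0, 1],  # exact duplicate of "fever"
    "cough":      [1, 1, 0, 1, 0, 0, 1, 0],  # weaker, but not redundant
}

ranked = mrmr_rank(features, target, k=2)
print(ranked)
```

On this toy data, a ranking by relevance alone would place `fever_copy` second, since it is as informative as `fever`. The redundancy penalty instead pushes the duplicate down and selects `cough`, which is the behaviour that distinguishes mRMR from a plain relevance filter.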