5th International Symposium on Innovative Approaches in Smart Technologies, Ankara, Türkiye, 28-29 May 2022, p. 38
COVID-19 is an infection that has affected the world since December 31, 2019, and was declared a pandemic by
the WHO in March 2020. The COVID-19 pandemic has infected 465 million people and claimed more than 6 million lives. In this
study, Multi-Layer Perceptron (MLP), Tree Boost (TB), Radial Basis Function Network (RBF), Support Vector Machine (SVM),
and K-Means Clustering (kMC), each combined with minimum redundancy maximum relevance (mRMR) and Relief-F,
have been used to construct new feature selection-based COVID-19 prediction models and to identify the influential variables for
predicting COVID-19 infection. The dataset contains records for 20,000 patients (i.e., 10,000 positives and 10,000
negatives) and includes several variables, including age, sex, race, pregnancy, fever, breathing difficulty, cough, runny nose,
throat pain, diarrhea, headache, lung comorbidity, cardio comorbidity, renal comorbidity, diabetes comorbidity, smoking
comorbidity, and obesity comorbidity. Accuracy, recall, and F1-score were used to assess the models'
performance, and the generalization errors of the models were evaluated using 10-fold cross-validation. The results show
that the average performance of mRMR is slightly better than that of Relief-F in predicting whether a patient is infected with COVID-19. In
addition, mRMR is more successful than the Relief-F algorithm in ranking the COVID-19
predictors by relevance. The mRMR algorithm emphasizes symptomatic variables such as fever and cough, whereas the Relief-F algorithm
emphasizes non-symptomatic variables such as age and race. It has also been observed that, in general, MLP outperforms all
other ML models in predicting COVID-19 infection.
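
The evaluation protocol named above (accuracy, recall, and F1-score under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe; the function names k_fold_indices and binary_metrics are hypothetical, the folds are assumed to be contiguous and unshuffled, and labels are assumed to be coded 1 for positive and 0 for negative.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Assumption: no shuffling or stratification; a real study would
    normally shuffle (and often stratify by class) before splitting.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # spread the remainder
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def binary_metrics(y_true, y_pred):
    """Return accuracy, recall, and F1 for labels coded 1 (positive) / 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Each fold serves once as the held-out test set; the other nine are
# used for training. With 20,000 records, each fold holds 2,000.
folds = k_fold_indices(20000, k=10)
```

In a full pipeline, each held-out fold would be scored with binary_metrics and the ten results averaged to estimate generalization performance.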