On the Application of Artificial Intelligence and Feature Selection in Sports Science Education and Research: A Case Study

Creative Commons License

ÖZÇİLOĞLU M. M. , AKAY M. F. , Heil D.

5th Cyprus International Conference on Educational Research (CYICER), Kyrenia, CYPRUS, 31 March - 02 April 2016, vol.3, pp.256-262

  • Volume: 3
  • DOI: 10.18844/gjhss.v3i3.1561
  • City of publication: Kyrenia
  • Country of publication: CYPRUS
  • Pages: pp.256-262


In sports science education and research, artificial intelligence methods combined with feature selection algorithms can help develop prediction models in situations where experimental studies based on direct measurement are not feasible. In this paper, we present a case study of how sports science can benefit from artificial intelligence methods combined with a feature selection algorithm. More specifically, the purpose of our study is to develop prediction models for upper body power (UBP), one of the most important factors affecting the performance of cross-country skiers during races. The dataset, which includes 75 subjects, was obtained from the College of Education, Health and Development of Montana State University. Multilayer Perceptron (MLP) and Single Decision Tree (SDT), along with the minimum-redundancy maximum-relevance (mRMR) feature selection algorithm, were used to build models for predicting the 10-second UBP (UBP10) and the 60-second UBP (UBP60). The predictor variables in the dataset are protocol, gender, age, body mass index (BMI), maximum oxygen uptake (VO2max), maximum heart rate (HRmax), time, and heart rate at lactate threshold (HRLT), whereas UBP10 and UBP60 are the target variables. Based on the ranking scores that mRMR assigned to the predictor variables, 16 different prediction models were developed. Using 10-fold cross-validation, the performance of the prediction models was assessed by their multiple correlation coefficients (R) and standard errors of estimate (SEE). The results show that models using fewer predictor variables than the full set can predict UBP10 and UBP60 with comparable error rates.
The model consisting of the predictor variables gender, BMI, VO2max, HRLT and time yields the lowest SEE for predicting UBP10, while the model consisting of gender, age, BMI and VO2max gives the lowest SEE for predicting UBP60, regardless of the regression method used. Compared with the full set of predictor variables, these two models reduce SEE by up to 4.95% and 6.83% for the MLP-based and SDT-based UBP prediction models, respectively.
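The evaluation described in the abstract (10-fold cross-validation scored by R and SEE, then a percentage reduction in SEE between two models) can be sketched as follows. This is a minimal illustration and not the authors' implementation. The formulas R = sqrt(1 - SS_res / SS_tot) and SEE = sqrt(SS_res / N) are assumed definitions that the abstract does not state. A one-variable least-squares line stands in for the MLP and SDT regressors. The data are synthetic, and all function names are hypothetical.

```python
import math
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def multiple_r(y_true, y_pred):
    """R = sqrt(1 - SS_res / SS_tot), clipped to 0 if the model is worse than the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


def see(y_true, y_pred):
    """SEE = sqrt(SS_res / N). Assumed definition; some texts divide by N - p - 1."""
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return math.sqrt(ss_res / len(y_true))


def percent_decrease(see_full, see_reduced):
    """Relative reduction in SEE of a reduced model versus the full model, in percent."""
    return 100.0 * (see_full - see_reduced) / see_full


def fit_line(xs, ys):
    """Stand-in regressor (NOT an MLP or decision tree): one-variable least squares."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda a: slope * a + intercept


def cross_validate(xs, ys, fit, k=10, seed=0):
    """Collect out-of-fold predictions for every subject, then score them once."""
    preds = [0.0] * len(ys)
    for test in k_fold_indices(len(ys), k, seed):
        held_out = set(test)
        train = [i for i in range(len(ys)) if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = predict(xs[i])
    return multiple_r(ys, preds), see(ys, preds)


# Synthetic, noise-free data for 75 hypothetical subjects (not the study's data).
xs = [float(i) for i in range(75)]
ys = [2.0 * x + 1.0 for x in xs]
r, s = cross_validate(xs, ys, fit_line)
print(f"R = {r:.3f}, SEE = {s:.3f}")
```

To compare a reduced predictor set against the full set, one would run `cross_validate` once per predictor subset with the same folds and pass the two resulting SEE values to `percent_decrease`.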