Machine learning and statistical methods for the prediction of maximal oxygen uptake: recent advances

Abut F., AKAY M. F.

MEDICAL DEVICES-EVIDENCE AND RESEARCH, vol.8, pp.369-379, 2015 (ESCI)

  • Publication Type: Article / Review
  • Volume: 8
  • Publication Date: 2015
  • Doi Number: 10.2147/mder.s57281
  • Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus
  • Page Numbers: pp.369-379
  • Keywords: machine learning methods, maximal oxygen consumption, prediction models, feature selection, COLLEGE-AGED PARTICIPANTS, VO2MAX, REGRESSION, VO(2)MAX, DYNAMICS, MODEL, DIEL
  • Çukurova University Affiliated: Yes


Maximal oxygen uptake (VO2max) is the maximum volume of oxygen, in milliliters, that the body can consume per minute during intense exercise. VO2max plays an important role in both sport and medical sciences, for example as an indicator of the endurance capacity of athletes or as a metric for estimating a person's disease risk. In general, direct measurement of VO2max provides the most accurate assessment of aerobic power. However, despite its high accuracy, direct measurement has practical limitations, such as the need for expensive, sophisticated laboratory equipment and trained staff, and these limitations have led to the development of various regression models for predicting VO2max. Consequently, many studies have been conducted in recent years to predict the VO2max of various target populations, ranging from soccer athletes, nonexpert swimmers, and cross-country skiers to healthy, fit adults, teenagers, and children. Numerous prediction models have been developed using different sets of predictor variables and a variety of machine learning and statistical methods, including support vector machines, multilayer perceptrons, general regression neural networks, and multiple linear regression. The purpose of this study is to give a detailed overview of the data-driven modeling studies for the prediction of VO2max conducted in recent years and to compare the performance of the VO2max prediction models reported in the related literature in terms of two well-known metrics: the multiple correlation coefficient (R) and the standard error of estimate (SEE). The survey results reveal that, among the regression methods used to develop prediction models, support vector machines generally perform better than the other methods, whereas multiple linear regression exhibits the worst performance.
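
The two comparison metrics named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not taken from the surveyed studies: the predictor variables (age, body mass index, exercise heart rate), the synthetic data, and the helper names `fit_mlr`, `predict`, `multiple_r`, and `see`. R is computed here as the Pearson correlation between measured and predicted values, and SEE uses n - p - 1 degrees of freedom, which is one common convention; individual studies may define it differently. The metrics are evaluated on the training data for brevity, whereas published models are typically assessed on held-out or cross-validated data.

```python
import numpy as np


def fit_mlr(X, y):
    """Fit multiple linear regression by least squares (intercept first)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a predictor matrix."""
    return coef[0] + X @ coef[1:]


def multiple_r(y, y_hat):
    """Multiple correlation coefficient: correlation of measured vs. predicted."""
    return float(np.corrcoef(y, y_hat)[0, 1])


def see(y, y_hat, n_predictors):
    """Standard error of estimate, assuming n - p - 1 degrees of freedom."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    dof = len(y) - n_predictors - 1
    return float(np.sqrt(np.sum((y - y_hat) ** 2) / dof))


# Synthetic, hypothetical data: NOT values from any study in the survey.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(18, 30, n)    # years
bmi = rng.uniform(18, 30, n)    # kg/m^2
hr = rng.uniform(120, 190, n)   # exercise heart rate, beats/min
X = np.column_stack([age, bmi, hr])
y = 80 - 0.3 * age - 0.5 * bmi - 0.1 * hr + rng.normal(0, 3, n)

coef = fit_mlr(X, y)
y_hat = predict(coef, X)
r = multiple_r(y, y_hat)
s = see(y, y_hat, X.shape[1])
print(f"R = {r:.3f}, SEE = {s:.3f}")
```

A higher R and a lower SEE indicate a better fit. The same two functions can score predictions from any other regression method, which is how models built with different methods can be compared on a common footing.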