EXPERT SYSTEMS WITH APPLICATIONS, vol.62, pp.91-103, 2016 (SCI-Expanded)
"Dimensionality" is one of the major problems which affect the quality of learning process in most of the machine learning and data mining tasks. Having high dimensional datasets for training a classification model may lead to have "overfitting" of the learned model to the training data. Overfitting reduces generalization of the model, therefore causes poor classification accuracy for the new test instances. Another disadvantage of dimensionality of dataset is to have high CPU time requirement for learning and testing the model. Applying feature selection to the dataset before the learning process is essential to improve the performance of the classification task. In this study, a new hybrid method which combines artificial bee colony optimization technique with differential evolution algorithm is proposed for feature selection of classification tasks. The developed hybrid method is evaluated by using fifteen datasets from the UCI Repository which are commonly used in classification problems. To make a complete evaluation, the proposed hybrid feature selection method is compared with the artificial bee colony optimization, and differential evolution based feature selection methods, as well as with the three most popular feature selection techniques that are information gain, chi-square, and correlation feature selection. In addition to these, the performance of the proposed method is also compared with the studies in the literature which uses the same datasets. The experimental results of this study show that our developed hybrid method is able to select good features for classification tasks to improve run-time performance and accuracy of the classifier. The proposed hybrid method may also be applied to other search and optimization problems as its performance for feature selection is better than pure artificial bee colony optimization, and differential evolution. (C) 2016 Elsevier Ltd. All rights reserved.