QER: a new feature selection method for sentiment analysis


Creative Commons License

PARLAR T., ÖZEL S. A., Song F.

HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES, vol.8, 2018 (SCI-Expanded)

Abstract

Sentiment analysis is the classification of sentiments expressed in review documents. To improve classification accuracy, feature selection methods are often used to rank features so that non-informative and noisy features with low ranks can be removed. In this study, we propose a new feature selection method, called query expansion ranking, which is based on query expansion term weighting methods from the field of information retrieval. We compare our proposed method with other widely used feature selection methods, including chi-square, information gain, document frequency difference, and optimal orthogonal centroid, using four classifiers: naïve Bayes multinomial, support vector machines, maximum entropy modelling, and decision trees. We test them on movie reviews and multiple kinds of product reviews in both Turkish and English so that we can show their performance across different domains, languages, and classifiers. We observe that our proposed method consistently outperforms the other feature selection methods, and that query expansion ranking, chi-square, information gain, and document frequency difference tend to produce better results for both the English and Turkish reviews when tested with the naïve Bayes multinomial classifier.
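To make the rank-and-remove idea concrete, the sketch below scores terms with the chi-square statistic, one of the baseline methods named in the abstract, and keeps the top-ranked ones. It is not the query expansion ranking method itself, whose formula is not given here. The function names, the binary label encoding (1 for positive, 0 for negative), and the toy reviews are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term by the chi-square statistic of its 2x2 term/class table.

    docs   : list of token lists (hypothetical, already tokenized reviews)
    labels : list of 0/1 class labels, 1 = positive (assumed encoding)
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        target = df_pos if y == 1 else df_neg
        target.update(set(tokens))

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive docs containing the term
        b = df_neg[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A term that occurs in every document carries no class information.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy data, invented for this example only.
docs = [
    ["great", "movie", "the"],
    ["great", "acting", "the"],
    ["boring", "movie", "the"],
    ["boring", "plot", "the"],
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
top = select_top_k(scores, 2)
print(top)
```

In this toy set, terms that appear equally often in both classes ("the", "movie") score zero and fall to the bottom of the ranking, which is the kind of non-informative feature the abstract describes removing before classification.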