A New Feature Selection Method for Sentiment Analysis of Turkish Reviews


International Symposium on Innovations in Intelligent Systems and Applications (INISTA), Sinaia, Romania, 2-5 August 2016


Sentiment analysis identifies people's opinions and sentiments about a product, a service, an organization, or an event. Because review collections are huge, researchers explore feature selection methods that aim to eliminate uninformative features. However, little work has been done on feature selection for sentiment analysis of Turkish reviews. In this study, we propose a new feature selection method called Query Expansion Ranking, which is based on the query expansion term weighting methods used in Information Retrieval to determine the most valuable terms for query expansion. We compare Query Expansion Ranking with Chi Square, a well-known and successful feature selector, and with Document Frequency Difference, a feature selection method proposed for sentiment analysis of English reviews. Experiments are conducted on four Turkish product review datasets (books, DVDs, electronics, and kitchen appliances) using a supervised machine learning classifier, namely Multinomial Naive Bayes. We show that the proposed method improves sentiment analysis performance in terms of both classification accuracy and time. The experimental evaluation also shows that it improves classification accuracy more than the Chi Square and Document Frequency Difference methods.
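To make the two baseline selectors concrete, the following is a minimal sketch of how terms might be ranked by Chi Square and by Document Frequency Difference. It is not the authors' implementation. The Chi Square score is the standard statistic for a 2x2 term/class contingency table; the Document Frequency Difference score is assumed here to be the absolute difference between a term's positive and negative document frequencies divided by the total number of documents. Query Expansion Ranking is not sketched because its formula is not given in this abstract. The function names and the toy reviews are hypothetical.

```python
from collections import Counter


def doc_freq(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))
    return df


def chi_square(term, pos_df, neg_df, n_pos, n_neg):
    """Chi-square statistic of the 2x2 term/class contingency table."""
    a = pos_df[term]   # positive documents containing the term
    b = neg_df[term]   # negative documents containing the term
    c = n_pos - a      # positive documents without the term
    d = n_neg - b      # negative documents without the term
    n = n_pos + n_neg
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def dfd(term, pos_df, neg_df, n_pos, n_neg):
    """Document Frequency Difference (assumed form): |DF+ - DF-| / D."""
    return abs(pos_df[term] - neg_df[term]) / (n_pos + n_neg)


def select_top_k(pos_docs, neg_docs, score_fn, k):
    """Rank the vocabulary by score_fn and keep the k best terms."""
    pos_df, neg_df = doc_freq(pos_docs), doc_freq(neg_docs)
    vocab = set(pos_df) | set(neg_df)
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    scores = {t: score_fn(t, pos_df, neg_df, n_pos, n_neg) for t in vocab}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy reviews (already tokenized and lowercased).
pos_reviews = ["harika kitap", "harika film", "harika kitap film"]
neg_reviews = ["berbat kitap", "berbat film", "berbat kitap film"]

print(select_top_k(pos_reviews, neg_reviews, chi_square, 2))
print(select_top_k(pos_reviews, neg_reviews, dfd, 2))
```

In this toy data both selectors keep the two polarity words and discard "kitap" and "film", which occur equally often in both classes. The reduced vocabulary would then be passed to a classifier such as Multinomial Naive Bayes.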