2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Türkiye, 5-8 October 2017, pp. 366-370
The increased use of the Internet and the ease of access to online communities such as social media have opened an avenue for cybercrime. Cyberbullying, a kind of cybercrime, is defined as an aggressive, intentional act carried out against a defenseless person through the Internet, social media, or other electronic content. Researchers have found that many bullying cases have tragically ended in suicide; hence, automatic detection of cyberbullying has become important. The aim of this study is to detect cyberbullying in social media messages written in Turkish. To our knowledge, this is the first study to perform cyberbullying detection on Turkish texts. We prepare a dataset from Instagram and Twitter messages written in Turkish and then apply four machine learning classifiers to detect cyberbullying: Support Vector Machines (SVM), decision tree (C4.5), Naive Bayes Multinomial, and k-Nearest Neighbors (kNN). We also apply information gain and chi-square feature selection to improve the accuracy of the classifiers. We observe that cyberbullying detection improves when both the words and the emoticons in the messages are used as features. Among the classifiers, Naive Bayes Multinomial is the most successful in terms of both classification accuracy and running time. When feature selection is applied, classification accuracy reaches up to 84% for the dataset used.
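The pipeline described above (words plus emoticons as features, chi-square feature selection, then a multinomial Naive Bayes classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the emoticon pattern, the function names, the Laplace smoothing value, and the six toy messages are all assumptions made for illustration, and the toy messages are invented rather than taken from the study's dataset.

```python
import math
import re
from collections import Counter

# Assumed token pattern: a few ASCII emoticons such as :) ;-( :D, otherwise a
# run of word characters (Unicode-aware in Python 3, so Turkish letters are kept).
TOKEN_RE = re.compile(r"[:;=][-']?[()DPp/\\|]|\w+")


def tokenize(text, keep_emoticons=True):
    """Split a message into lowercase words and, optionally, emoticons."""
    tokens = []
    for tok in TOKEN_RE.findall(text):
        if tok[0].isalnum() or tok[0] == "_":
            # Note: str.lower() is not Turkish-aware ("I" becomes "i", not "ı").
            tokens.append(tok.lower())
        elif keep_emoticons:
            tokens.append(tok)
    return tokens


def chi_square_scores(docs, labels):
    """Chi-square statistic of each term's document presence vs. a 0/1 label."""
    n, n_pos = len(docs), sum(labels)
    df_all, df_pos = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            df_pos[term] += y
    scores = {}
    for term, n_t in df_all.items():
        a = df_pos[term]       # term present, class 1
        b = n_t - a            # term present, class 0
        c = n_pos - a          # term absent, class 1
        d = n - n_pos - b      # term absent, class 0
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Keep the k terms with the highest chi-square score."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def train_nb(docs, labels, vocab, alpha=1.0):
    """Multinomial Naive Bayes with Laplace smoothing over the selected vocab."""
    class_docs = Counter(labels)
    model = {}
    for c in class_docs:
        counts = Counter()
        for tokens, y in zip(docs, labels):
            if y == c:
                counts.update(t for t in tokens if t in vocab)
        total = sum(counts.values()) + alpha * len(vocab)
        log_prior = math.log(class_docs[c] / len(docs))
        log_lik = {t: math.log((counts[t] + alpha) / total) for t in vocab}
        model[c] = (log_prior, log_lik)
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior; unseen terms are skipped."""
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[t] for t in tokens if t in log_lik)
    return max(model, key=score)


# Invented toy messages (1 = bullying, 0 = not bullying); not from the paper.
TOY = [
    ("sen çok aptalsın :(", 1),
    ("aptal ve salak biri", 1),
    ("salak mısın :(", 1),
    ("harika bir gün :)", 0),
    ("teşekkürler çok güzel :)", 0),
    ("bugün hava güzel", 0),
]
docs = [tokenize(text) for text, _ in TOY]
labels = [label for _, label in TOY]
vocab = select_top_k(chi_square_scores(docs, labels), k=4)
model = train_nb(docs, labels, vocab)
print(sorted(vocab))
print(predict(model, tokenize("salak :(")))
```

Passing `keep_emoticons=False` to `tokenize` gives the words-only feature set, so the two settings can be compared on the same data. On a real corpus, the number of selected features `k` and the smoothing value would need to be tuned on held-out data.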