Sentiment Analysis Using Learning Approaches Over Emojis for Turkish Tweets

Published in IEEE Xplore, 2018

Abstract: With the growing use of and interest in social media platforms, emojis have become an increasingly important part of written language and one of the most important signals for micro-blog sentiment analysis. In this paper, we employ and evaluate classification models using two different representations, bag-of-words and fastText, to address the problem of sentiment analysis over emojis/emoticons for positive, negative and neutral Turkish tweets. First, the bag-of-words approach is used as a simple and efficient baseline for tweet representation, with classifiers such as Naïve Bayes, Logistic Regression, Support Vector Machines and Decision Trees applied to these tweets. Second, we use fastText to represent tweets as word n-grams for the sentiment analysis problem. The results show that there is no significant difference between the two models. For binary classification, fastText achieves an F1-score of 79% and the Logistic Regression classifier 77%; for multi-class classification, fastText achieves 62% and Logistic Regression 58%. To our knowledge, this is the first study to apply different vector representations such as bag-of-words and fastText to the sentiment classification of Turkish tweets over emojis. This study can also be used in future work on predicting emojis in social media contexts.
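
As a rough illustration of the ideas named in the abstract, the sketch below builds bag-of-words count vectors, extends them with word n-grams, and computes an F1-score. It is not the authors' pipeline: the function names, the whitespace tokenizer, the toy tweets, and the choice of macro-averaged F1 are all assumptions made for this example, and it uses only the Python standard library rather than fastText or a trained classifier.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. Python's lower() does not
    # apply Turkish casing rules (I/ı), which a real pipeline would handle.
    return text.lower().split()


def word_ngrams(tokens, n):
    # All contiguous runs of n tokens, joined by a space.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_vocabulary(docs, max_n=1):
    # Map every word n-gram (n = 1..max_n) seen in docs to a column index.
    vocab = {}
    for doc in docs:
        tokens = tokenize(doc)
        for n in range(1, max_n + 1):
            for gram in word_ngrams(tokens, n):
                vocab.setdefault(gram, len(vocab))
    return vocab


def vectorize(doc, vocab, max_n=1):
    # Count-based bag-of-words vector; n-grams outside the vocabulary are ignored.
    tokens = tokenize(doc)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(g for g in word_ngrams(tokens, n) if g in vocab)
    vector = [0] * len(vocab)
    for gram, count in counts.items():
        vector[vocab[gram]] = count
    return vector


def macro_f1(y_true, y_pred):
    # Per-class F1 = 2*TP / (2*TP + FP + FN), averaged over all classes.
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy tweets, for illustration only.
tweets = ["çok güzel bir gün", "çok kötü bir gün"]
unigram_vocab = build_vocabulary(tweets, max_n=1)
bigram_vocab = build_vocabulary(tweets, max_n=2)
example_vector = vectorize("çok güzel çok", unigram_vocab)
example_f1 = macro_f1(["pos", "neg", "neu", "pos"], ["pos", "neg", "pos", "pos"])
```

The resulting vectors could be passed to any standard classifier, such as the Logistic Regression baseline mentioned above. Setting `max_n=2` adds word bigrams to the feature space, which is similar in spirit to the word n-gram representation the abstract attributes to fastText, though without its learned embeddings.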

Download paper here

Cited as:

@inproceedings{veliouglu2018sentiment,
  title={Sentiment analysis using learning approaches over emojis for Turkish tweets},
  author={Velioglu, Riza and Yildiz, Tugba and Yildirim, Savas},
  booktitle={2018 3rd International Conference on Computer Science and Engineering (UBMK)},
  pages={303--307},
  year={2018},
  organization={IEEE}
}