Abstract
Emoji are picture characters used in social media to express the emotion of a text message. Despite the increasing use of emoji, few studies have examined the relationship between emoji and text. Because emoji are diverse and many of them have similar meanings, emoji classification is more complex than common text classification tasks. In this paper, we build a computational model by extracting several kinds of features, namely linguistic, semantic, and lexicon features, to improve emoji classification performance. We then train on 400k tweets using two different classifiers: a Stochastic Gradient Descent classifier and Logistic Regression. The experiments showed that our proposed features with Logistic Regression outperformed the baseline.
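
The feature-extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact features, so the word n-gram counts stand in for the linguistic features, the tiny `TOY_LEXICON` and the `lex_pos`/`lex_neg` feature names are hypothetical, and the semantic (word-embedding) features and the classifiers are left out.

```python
from collections import Counter

# Toy sentiment lexicon: an assumption for illustration, not the paper's lexicon.
TOY_LEXICON = {"love": 1, "happy": 1, "great": 1, "sad": -1, "hate": -1, "awful": -1}


def word_ngrams(tokens, n):
    """Return the word n-grams of a token list, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tweet, lexicon=TOY_LEXICON):
    """Map a tweet to a sparse dict of n-gram counts plus lexicon counts."""
    # Naive lowercase/whitespace tokenization; real tweets need more care.
    tokens = tweet.lower().split()
    features = Counter()
    # Unigram and bigram counts (stand-in for the linguistic features).
    for n in (1, 2):
        for gram in word_ngrams(tokens, n):
            features[f"{n}gram={gram}"] += 1
    # Lexicon features: number of positive and negative lexicon words.
    features["lex_pos"] = sum(1 for t in tokens if lexicon.get(t, 0) > 0)
    features["lex_neg"] = sum(1 for t in tokens if lexicon.get(t, 0) < 0)
    return dict(features)


if __name__ == "__main__":
    print(extract_features("I love love this"))
```

Feature dicts of this shape could then be vectorized and passed to a linear classifier such as logistic regression, with the emoji attached to each tweet serving as the label.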
Original language | English |
---|---|
Pages (from-to) | 194-201 |
Number of pages | 8 |
Journal | Procedia Computer Science |
Volume | 135 |
DOIs | |
Publication status | Published - 2018 |
Event | 3rd International Conference on Computer Science and Computational Intelligence, ICCSCI 2018 - Tangerang, Indonesia. Duration: 7 Sept 2018 → 8 Sept 2018 |
Keywords
- emoji
- lexicon
- n-gram
- word-embedding