TY - GEN
T1 - Monitoring trending topics of real-world events on Indonesian tweets using fuzzy C-means in lower dimensional space
AU - Murfi, Hendri
N1 - Funding Information:
This work was supported by Universitas Indonesia under the PDUPT 2019 grant. Any opinions, findings, and conclusions or recommendations are the authors’ and do not necessarily reflect those of the sponsor.
Publisher Copyright:
© 2019 ACM.
PY - 2019/10/26
Y1 - 2019/10/26
N2 - Topic detection is an automatic method for extracting topics from textual data, e.g., trending topics in social media. One recent topic detection method is Eigenspace-based Fuzzy C-Means, a soft clustering-based approach. In this method, the textual data are transformed into a lower-dimensional Eigenspace using truncated singular value decomposition. Fuzzy C-Means is then performed in the Eigenspace to identify the membership of each text in each cluster. Using these memberships, the topics are extracted from the textual data in the original space. In this paper, we use another approach to extract the topics: transforming the cluster centroids back into the positive subspace of the original space. Our simulations show that this new approach improves on the old one regarding topic interpretability in terms of the coherence score. Moreover, this Eigenspace-based Fuzzy C-Means outperforms both standard methods, i.e., nonnegative matrix factorization and latent Dirichlet allocation.
AB - Topic detection is an automatic method for extracting topics from textual data, e.g., trending topics in social media. One recent topic detection method is Eigenspace-based Fuzzy C-Means, a soft clustering-based approach. In this method, the textual data are transformed into a lower-dimensional Eigenspace using truncated singular value decomposition. Fuzzy C-Means is then performed in the Eigenspace to identify the membership of each text in each cluster. Using these memberships, the topics are extracted from the textual data in the original space. In this paper, we use another approach to extract the topics: transforming the cluster centroids back into the positive subspace of the original space. Our simulations show that this new approach improves on the old one regarding topic interpretability in terms of the coherence score. Moreover, this Eigenspace-based Fuzzy C-Means outperforms both standard methods, i.e., nonnegative matrix factorization and latent Dirichlet allocation.
KW - Clustering
KW - Eigenspace
KW - Fuzzy c-means
KW - Topic detection
KW - Topic monitoring
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85079085047&partnerID=8YFLogxK
U2 - 10.1145/3369114.3369127
DO - 10.1145/3369114.3369127
M3 - Conference contribution
AN - SCOPUS:85079085047
T3 - ACM International Conference Proceeding Series
SP - 82
EP - 85
BT - ICAAI 2019 - 2019 the 3rd International Conference on Advances in Artificial Intelligence
PB - Association for Computing Machinery
T2 - 3rd International Conference on Advances in Artificial Intelligence, ICAAI 2019
Y2 - 26 October 2019 through 28 October 2019
ER -