TY - GEN
T1 - Kernelized eigenspace-based fuzzy C-means for sensing trending topics on Twitter
AU - Prakoso, Yudho
AU - Murfi, Hendri
AU - Wibowo, Arie
N1 - Funding Information:
This work was supported by Universitas Indonesia under PITTA 2018 grant. Any opinions, findings, and conclusions or recommendations are the authors' and do not necessarily reflect those of the sponsor.
Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/7/20
Y1 - 2018/7/20
N2 - One of the automated methods for textual data analysis is topic detection. Eigenspace-based fuzzy c-means (EFCM) is a soft clustering-based method for topic detection. First, EFCM uses truncated singular value decomposition to transform high-dimensional textual data into low-dimensional textual data. Next, the clustering process is conducted in the lower-dimensional space. However, that transformation may eliminate some important features from the textual data, which may reduce accuracy. In this paper, we use the kernel trick to overcome that weakness, so that the clustering process is performed in a higher-dimensional space without explicitly transforming the textual data into that space. Our simulations show that this approach improves the accuracy of EFCM in terms of topic recall for the problem of sensing trending topics on Twitter.
KW - Clustering
KW - Fuzzy C-Means
KW - Kernel Trick
KW - Singular Value Decomposition
KW - Topic Detection
KW - Topic Modeling
UR - http://www.scopus.com/inward/record.url?scp=85055705919&partnerID=8YFLogxK
U2 - 10.1145/3239283.3239297
DO - 10.1145/3239283.3239297
M3 - Conference contribution
AN - SCOPUS:85055705919
T3 - ACM International Conference Proceeding Series
SP - 6
EP - 10
BT - Proceedings of the 2018 International Conference on Data Science and Information Technology, DSIT 2018
PB - Association for Computing Machinery
T2 - 2018 International Conference on Data Science and Information Technology, DSIT 2018
Y2 - 20 July 2018 through 22 July 2018
ER -