TY - GEN
T1 - Analysis of non-negative double singular value decomposition initialization method on eigenspace-based fuzzy C-Means algorithm for Indonesian online news topic detection
AU - Sutrisman, Raden Trivan
AU - Murfi, Hendri
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/11/8
Y1 - 2018/11/8
N2 - The rapid growth of online news in Indonesia creates a need for news analysis that extracts information as quickly as possible. Topics are basic components that are often used to analyze textual data, such as news articles. With topic modeling, topics can be detected automatically in large collections of news documents, a task that is difficult to perform manually. One topic modeling approach is the clustering-based method Eigenspace-based Fuzzy C-Means (EFCM). EFCM is commonly initialized randomly. However, random initialization usually produces different topics on each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method for EFCM. Besides offering the advantage of non-randomness, the NNDSVD method gives better accuracy in terms of interpretability score than the random method in our simulations.
AB - The rapid growth of online news in Indonesia creates a need for news analysis that extracts information as quickly as possible. Topics are basic components that are often used to analyze textual data, such as news articles. With topic modeling, topics can be detected automatically in large collections of news documents, a task that is difficult to perform manually. One topic modeling approach is the clustering-based method Eigenspace-based Fuzzy C-Means (EFCM). EFCM is commonly initialized randomly. However, random initialization usually produces different topics on each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method for EFCM. Besides offering the advantage of non-randomness, the NNDSVD method gives better accuracy in terms of interpretability score than the random method in our simulations.
KW - Eigenspace
KW - Fuzzy c-means
KW - Initialization
KW - Topic detection
UR - http://www.scopus.com/inward/record.url?scp=85058411818&partnerID=8YFLogxK
U2 - 10.1109/ICoICT.2018.8528791
DO - 10.1109/ICoICT.2018.8528791
M3 - Conference contribution
AN - SCOPUS:85058411818
T3 - 2018 6th International Conference on Information and Communication Technology, ICoICT 2018
SP - 55
EP - 60
BT - 2018 6th International Conference on Information and Communication Technology, ICoICT 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Information and Communication Technology, ICoICT 2018
Y2 - 3 May 2018 through 4 May 2018
ER -