TY - GEN
T1 - Finding anchor words of separable-nonnegative matrix factorization based on singular value decomposition
AU - Novitasari, Ika Dwi
AU - Murfi, Hendri
AU - Wibowo, Arie
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/10/1
Y1 - 2017/10/1
N2 - Topic detection is the process of finding topics, or subjects of discussion, in a collection of documents such as tweets on Twitter. Manual detection of topics on Twitter is difficult because the volume of tweets is too large. Therefore, it is necessary to detect topics automatically. One automatic method for topic detection is Separable Nonnegative Matrix Factorization (SNMF) with the AGM algorithm. SNMF is a matrix factorization-based model that can be solved directly under the assumption that each topic has one word, called an anchor word, that is not present in other topics. SNMF with the AGM algorithm consists of three stages: constructing the co-occurrence matrix, finding the anchor words, and recovering the topics. The common method for finding the anchor words is the convex hull-based method. In this paper, we examine the process of finding anchor words based on Singular Value Decomposition (SVD). The results show that when all words are considered as anchor word candidates, the SVD-based method gives better results than the convex hull-based method. Meanwhile, when anchor finding is done using an anchor threshold, the convex hull-based method still gives better results than the SVD-based method.
AB - Topic detection is the process of finding topics, or subjects of discussion, in a collection of documents such as tweets on Twitter. Manual detection of topics on Twitter is difficult because the volume of tweets is too large. Therefore, it is necessary to detect topics automatically. One automatic method for topic detection is Separable Nonnegative Matrix Factorization (SNMF) with the AGM algorithm. SNMF is a matrix factorization-based model that can be solved directly under the assumption that each topic has one word, called an anchor word, that is not present in other topics. SNMF with the AGM algorithm consists of three stages: constructing the co-occurrence matrix, finding the anchor words, and recovering the topics. The common method for finding the anchor words is the convex hull-based method. In this paper, we examine the process of finding anchor words based on Singular Value Decomposition (SVD). The results show that when all words are considered as anchor word candidates, the SVD-based method gives better results than the convex hull-based method. Meanwhile, when anchor finding is done using an anchor threshold, the convex hull-based method still gives better results than the SVD-based method.
KW - Finding Anchor Words
KW - Separable Nonnegative Matrix Factorization
KW - Singular Value Decomposition
KW - Topic Detection
KW - Twitter
UR - http://www.scopus.com/inward/record.url?scp=85049743296&partnerID=8YFLogxK
U2 - 10.1109/ICICOS.2017.8276366
DO - 10.1109/ICICOS.2017.8276366
M3 - Conference contribution
AN - SCOPUS:85049743296
T3 - Proceedings - 2017 1st International Conference on Informatics and Computational Sciences, ICICoS 2017
SP - 225
EP - 229
BT - Proceedings - 2017 1st International Conference on Informatics and Computational Sciences, ICICoS 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 1st International Conference on Informatics and Computational Sciences, ICICoS 2017
Y2 - 15 November 2017 through 16 November 2017
ER -