Accuracy of separable nonnegative matrix factorization for topic extraction

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Topic extraction is the automatic extraction of topics from textual data. A popular method of topic extraction is latent Dirichlet allocation (LDA), a probabilistic topic model. Because learning the model parameters has limitations (for example, it is NP-hard), several researchers have continued working to design methods with polynomial complexity. An emerging alternative is the nonnegative matrix factorization (NMF) based method. Under a separability assumption, a direct method that runs in polynomial time has been proposed. In general, this algorithm works in three steps: first, it generates a word co-occurrence matrix; second, it chooses anchor words for each topic; and third, in the recovery step, it directly reconstructs the topics given the anchor words. In this paper, we examine the accuracy of separable nonnegative matrix factorization (SNMF). First, the accuracy of SNMF is strongly influenced by the anchor words: it improves significantly when the anchor words are found in the eigenspace instead of a random space. Moreover, SNMF gives higher accuracy than LDA but lower accuracy than NMF.
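The three steps named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the anchor search uses a greedy successive-projection rule as a stand-in for whatever selection rule the paper uses, "eigenspace" is interpreted here as projection onto the leading right singular vectors, and the recovery step uses plain least squares, which is only exact for noise-free data. The demo builds a co-occurrence matrix directly from a small made-up separable topic matrix, so the numbers are synthetic.

```python
import numpy as np

def cooccurrence(counts):
    """Step 1: word-word co-occurrence from a documents x words count matrix.
    Simplified estimator (assumption): drop self-pairs, normalize to sum to 1."""
    X = np.asarray(counts, dtype=float)
    Q = X.T @ X - np.diag(X.sum(axis=0))
    return Q / Q.sum()

def row_normalize(Q):
    """Turn each row of Q into a conditional distribution p(w2 | w1)."""
    return Q / Q.sum(axis=1, keepdims=True)

def project_to_eigenspace(Q_bar, k):
    """Project rows onto the top-k right singular vectors (one reading of 'eigenspace')."""
    _, _, Vt = np.linalg.svd(Q_bar, full_matrices=False)
    return Q_bar @ Vt[:k].T

def find_anchors(points, k):
    """Step 2: greedy successive projection. Pick the row with the largest
    norm, project it out of every row, and repeat k times."""
    R = np.array(points, dtype=float)
    anchors = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=1)
        i = int(np.argmax(norms))
        anchors.append(i)
        u = R[i] / norms[i]
        R = R - np.outer(R @ u, u)
    return anchors

def recover_topics(Q, anchors):
    """Step 3: write each row of Q_bar as a combination of the anchor rows,
    then apply Bayes' rule to obtain p(word | topic)."""
    p_w = Q.sum(axis=1)                      # marginal word probabilities
    Q_bar = row_normalize(Q)
    coeffs, *_ = np.linalg.lstsq(Q_bar[anchors].T, Q_bar.T, rcond=None)
    C = np.clip(coeffs.T, 0.0, None)         # C[i, k] approximates p(topic k | word i)
    C /= C.sum(axis=1, keepdims=True)
    A = C * p_w[:, None]                     # proportional to p(word i | topic k)
    return A / A.sum(axis=0, keepdims=True)

# Synthetic demo (made-up numbers): words 0, 1, 2 are anchors for topics 0, 1, 2.
A_true = np.array([
    [0.4, 0.0, 0.0],
    [0.0, 0.3, 0.0],
    [0.0, 0.0, 0.5],
    [0.3, 0.3, 0.1],
    [0.2, 0.1, 0.3],
    [0.1, 0.3, 0.1],
])
R_true = np.array([                          # topic-topic co-occurrence, sums to 1
    [0.30, 0.02, 0.02],
    [0.02, 0.30, 0.02],
    [0.02, 0.02, 0.28],
])
Q = A_true @ R_true @ A_true.T               # noise-free word co-occurrence

points = project_to_eigenspace(row_normalize(Q), k=3)
anchors = find_anchors(points, k=3)
A_hat = recover_topics(Q, anchors)           # columns follow the order of `anchors`
```

On real corpora the co-occurrence matrix is noisy, so the least-squares recovery would typically be replaced by a constrained (nonnegative, sum-to-one) solver; the sketch omits that for brevity.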

Original language: English
Title of host publication: Proceedings of the 3rd International Conference on Communication and Information Processing, ICCIP 2017
Publisher: Association for Computing Machinery
Pages: 226-230
Number of pages: 5
ISBN (Electronic): 9781450353656
Publication status: Published - 24 Nov 2017
Event: 3rd International Conference on Communication and Information Processing, ICCIP 2017 - Tokyo, Japan
Duration: 24 Nov 2017 to 26 Nov 2017

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 3rd International Conference on Communication and Information Processing, ICCIP 2017
Country/Territory: Japan
City: Tokyo
Period: 24/11/17 to 26/11/17

Keywords

  • Eigenspace
  • Separable nonnegative matrix factorization
  • Singular value decomposition
  • Topic extraction
