The accuracy of fuzzy C-means in lower-dimensional space for topic detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Topic detection is an automatic method for discovering topics in textual data. The standard methods for topic detection are nonnegative matrix factorization (NMF) and latent Dirichlet allocation (LDA). An alternative is a clustering approach such as k-means or fuzzy c-means (FCM). FCM extends the k-means method in the sense that the textual data may have more than one topic. However, FCM works well for low-dimensional textual data and fails for high-dimensional textual data. One approach to overcoming this problem is to transform the textual data into a lower-dimensional space, i.e., an eigenspace; the resulting method is called eigenspace-based FCM (EFCM). First, the textual data are transformed into an eigenspace using truncated singular value decomposition. FCM is then performed on the eigenspace data to identify the memberships of the textual data in clusters. Using these memberships, we generate topics from the high-dimensional textual data in the original space. In this paper, we examine the accuracy of EFCM for topic detection. Our simulations show that the accuracy of EFCM lies between the accuracies of LDA and NMF with regard to both topic interpretation and topic recall.
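The three steps named in the abstract (truncated SVD, FCM on the reduced data, topic generation in the original space) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the matrix orientation, the fuzzifier value, the initialization, or the exact rule for generating topics. Here `X` is assumed to be a dense document-by-term matrix, topics are assumed to be membership-weighted centroids of the original rows, and the function names (`truncated_svd`, `fuzzy_c_means`, `efcm_topics`) and parameters (`k`, `c`, `m`, `top_n`) are hypothetical.

```python
import numpy as np


def truncated_svd(X, k):
    """Return k-dimensional document coordinates from the top-k singular triplets."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * s[:k]


def fuzzy_c_means(Z, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook FCM on the rows of Z; returns memberships of shape (n_rows, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((Z.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        # Euclidean distance from every row to every center, floored to avoid 0
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U


def efcm_topics(X, vocab, c, k, m=2.0, top_n=3):
    """Cluster in the eigenspace, then read topics off the original space."""
    Z = truncated_svd(X, k)       # step 1: reduce dimensionality
    U = fuzzy_c_means(Z, c, m=m)  # step 2: fuzzy memberships in the eigenspace
    W = U ** m
    # step 3 (assumed rule): membership-weighted centroids of the original rows
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    top = np.argsort(-centers, axis=1)[:, :top_n]
    return U, [[vocab[j] for j in row] for row in top]
```

Calling `efcm_topics(X, vocab, c=2, k=2)` on a small count matrix returns the membership matrix and, for each cluster, the `top_n` highest-weighted vocabulary terms. Computing the centroids from `X` rather than from the reduced data is what keeps the topics expressed in terms of the original vocabulary.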

Original language: English
Title of host publication: Smart Computing and Communication - 3rd International Conference, SmartCom 2018, Proceedings
Editors: Meikang Qiu
Publisher: Springer Verlag
Pages: 321-334
Number of pages: 14
ISBN (Print): 9783030057541
DOIs
Publication status: Published - 1 Jan 2018
Event: 3rd International Conference on Smart Computing and Communications, SmartCom 2018 - Tokyo, Japan
Duration: 10 Dec 2018 to 12 Dec 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11344 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Smart Computing and Communications, SmartCom 2018
Country: Japan
City: Tokyo
Period: 10/12/18 to 12/12/18

Keywords

  • Accuracy
  • Clustering
  • Eigenspace
  • Fuzzy c-means
  • Topic detection
