Randomspace-Based Fuzzy C-Means for Topic Detection on Indonesia Online News

Muhammad Rifky Yusdiansyah, Hendri Murfi, Arie Wibowo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Citation (Scopus)

Abstract

Topic detection is a process used to analyze words in a collection of textual data to determine the topics in the collection, how they relate to each other, and how they change over time. Fuzzy C-Means (FCM) and Kernel-based Fuzzy C-Means (KFCM) are clustering methods that are often used in topic detection problems. Both FCM and KFCM can group a low-dimensional dataset into multiple clusters, but they fail on high-dimensional datasets. To overcome this problem, dimension reduction is applied to the dataset before topic detection is carried out with FCM or KFCM. In this study, a dataset of tweets from national news accounts on Twitter was used for topic detection with the Randomspace-based Fuzzy C-Means (RFCM) method and the Kernelized Randomspace-based Fuzzy C-Means (KRFCM) method. The RFCM and KRFCM learning methods consist of two steps: reducing the dataset to a lower-dimensional representation using random projection, and then running the FCM learning method (for RFCM) or the KFCM learning method (for KRFCM) on the reduced data. After the topics are obtained, they are evaluated by calculating their coherence values. Coherence in this study is measured using Pointwise Mutual Information (PMI). The study compares the average PMI values of RFCM and KRFCM with those of Eigenspace-based Fuzzy C-Means (EFCM) and Kernelized Eigenspace-based Fuzzy C-Means (KEFCM). The results on the national news accounts' tweets showed that the RFCM and KRFCM methods offered faster running time for dimension reduction but had smaller average PMI values than those produced by the EFCM and KEFCM learning methods.
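
The two-step RFCM idea described in the abstract (random projection, then FCM on the reduced data) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the Gaussian projection matrix, the fuzzifier m = 2, the stopping tolerance, the toy random matrix standing in for a TF-IDF document-term matrix, and the function names `random_projection`, `fuzzy_c_means`, and `topic_top_terms` are all assumptions made for illustration. In particular, mapping clusters back to word weights through the memberships is one plausible choice, not necessarily the one used in the paper.

```python
import numpy as np

def random_projection(X, k, rng):
    """Project the rows of X (n x d) to k dimensions.

    Assumption: a dense Gaussian random matrix scaled by 1/sqrt(k).
    """
    d = X.shape[1]
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, rng=None):
    """Standard FCM: alternate center and membership updates."""
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def topic_top_terms(X_original, U, m=2.0, top_n=5):
    """Assumed step: weight the original word vectors by membership
    to get one word-weight vector per topic, then rank the words."""
    W = U ** m
    weights = (W.T @ X_original) / W.sum(axis=0)[:, None]
    return np.argsort(weights, axis=1)[:, ::-1][:, :top_n]

# Toy usage: 30 "documents" over a 200-word vocabulary.
rng = np.random.default_rng(0)
X = rng.random((30, 200))                    # stand-in for TF-IDF data
Z = random_projection(X, k=10, rng=rng)      # step 1: reduce dimension
centers, U = fuzzy_c_means(Z, c=3, rng=rng)  # step 2: FCM on reduced data
top_terms = topic_top_terms(X, U)            # word indices per topic
```

Replacing the Euclidean distance in `fuzzy_c_means` with a kernel-induced distance would give a KFCM-style variant; that is omitted here to keep the sketch short.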

Original language: English
Title of host publication: Multi-disciplinary Trends in Artificial Intelligence - 13th International Conference, MIWAI 2019, Proceedings
Editors: Rapeeporn Chamchong, Kok Wai Wong
Publisher: Springer
Pages: 133-143
Number of pages: 11
ISBN (Print): 9783030337087
DOIs
Publication status: Published - 1 Jan 2019
Event: 13th Multi-disciplinary International Conference on Artificial Intelligence, MIWAI 2019 - Kuala Lumpur, Malaysia
Duration: 17 Nov 2019 to 19 Nov 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11909 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th Multi-disciplinary International Conference on Artificial Intelligence, MIWAI 2019
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 17/11/19 to 19/11/19

Keywords

  • Fuzzy C-Means
  • Random projection
  • Topic detection
  • Twitter
