Mining Web Log Data for News Topic Modeling Using Latent Dirichlet Allocation

Isti Surjandari, Asma Rosyidah, Zulkarnain Zulkarnain, Enrico Laoh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

The growth of e-news platforms, the most popular and accessible media for sharing information, has increased the volume of digital news articles. Users' navigation across news articles on an e-news platform, captured in the form of web log data, shows which articles users read. The articles users read can indicate topics of interest and public unrest regarding a particular event, field, or aspect. Knowledge and understanding of these topics are important, especially for journalists writing subsequent news and for the government in policy-making. This study was conducted in response to the importance of extracting topics from the news articles read by users or the public. Latent Dirichlet allocation was used as the topic modeling algorithm on a list of news article titles and categories obtained from user web log data across 5 e-news publisher domains in Indonesia. The topic modeling process resulted in 12 topics of news articles. The results of this study provide e-news platforms with insight into the reading focus of their users.
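The pipeline summarized in the abstract (web log rows, then article titles and categories, then LDA topics) can be illustrated with a minimal sketch. The abstract does not say which LDA implementation or inference method the authors used, so the collapsed Gibbs sampler below is a stand-in chosen for self-containment, not the study's method. The log rows, the stopword list, the hyperparameters `alpha` and `beta`, and the choice of 2 topics are all hypothetical; the study itself reports 12 topics on Indonesian news data.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.

    alpha, beta, iters and seed are illustrative defaults, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assign.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Up to five highest-count words per topic.
    top_words = [
        [vocab[v] for v in sorted(range(V), key=lambda v: -topic_word[k][v])[:5]]
        for k in range(num_topics)
    ]
    return doc_topic, top_words


# Hypothetical web log rows: (user id, article title, category).
log_rows = [
    ("u1", "election results announced in capital", "politics"),
    ("u2", "parliament debates election law", "politics"),
    ("u3", "football team wins league match", "sports"),
    ("u1", "league match ends in draw", "sports"),
]
STOPWORDS = {"in", "the", "of", "a"}  # illustrative only

# One document per log row: title tokens plus the category label.
docs = [
    [t for t in f"{title} {category}".lower().split() if t not in STOPWORDS]
    for _, title, category in log_rows
]
doc_topic, top_words = lda_gibbs(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {' '.join(words)}")
```

On a corpus this small the topics are not meaningful; the sketch only shows how titles and categories from log rows become token lists and how per-topic word counts are read off after sampling.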

Original language: English
Title of host publication: Proceedings - 2018 5th International Conference on Information Science and Control Engineering, ICISCE 2018
Editors: Shaozi Li, Ying Dai, Yun Cheng
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 331-335
Number of pages: 5
ISBN (Electronic): 9781538655009
Publication status: Published - 2 Jul 2018
Event: 5th International Conference on Information Science and Control Engineering, ICISCE 2018 - Zhengzhou, Henan, China
Duration: 20 Jul 2018 → 22 Jul 2018

Publication series

Name: Proceedings - 2018 5th International Conference on Information Science and Control Engineering, ICISCE 2018

Conference

Conference: 5th International Conference on Information Science and Control Engineering, ICISCE 2018
Country/Territory: China
City: Zhengzhou, Henan
Period: 20/07/18 → 22/07/18

Keywords

  • e-news
  • latent Dirichlet allocation
  • topic modeling
  • web log data
