We Know You Are Living in Bali: Location Prediction of Twitter Users Using BERT Language Model

Lihardo Faisal Simanjuntak, Rahmad Mahendra, Evi Yulianti

Research output: Contribution to journal › Article › peer-review

Abstract

Twitter user location data provide essential information that can be used for various purposes. However, user location is not easy to identify because many profiles omit this information, or users enter data that do not correspond to their actual locations. Several related works have attempted to predict location from English-language tweets. In this study, we attempted to predict the location of users who post Indonesian tweets. We utilized machine learning approaches, i.e., long short-term memory (LSTM) and bidirectional encoder representations from transformers (BERT), to infer Twitter users' home locations using the display name in the profile, the user description, and the user's tweets. The model achieved its best accuracy, 0.77, when the display name, description, and aggregated tweets were concatenated. The IndoBERT model outperformed several baseline models.
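
The input construction described in the abstract (concatenating display name, description, and aggregated tweets) might look roughly like the sketch below. This is only an illustration under stated assumptions: the function name `build_user_text`, the `" [SEP] "` separator, the whitespace normalization, and the sample profile are all hypothetical and are not taken from the article, which may join the fields differently.

```python
def build_user_text(display_name, description, tweets, sep=" [SEP] "):
    """Join three profile fields into one string for a text classifier.

    Hypothetical helper: the separator and cleaning steps are assumptions,
    not the preprocessing used in the article.
    """
    # Aggregate the user's tweets into one field, collapsing runs of whitespace.
    aggregated = " ".join(" ".join(t.split()) for t in tweets if t.strip())
    fields = [display_name.strip(), description.strip(), aggregated]
    # Skip empty fields so missing profile data leaves no dangling separators.
    return sep.join(f for f in fields if f)


# Made-up profile, for illustration only.
example = build_user_text(
    "Made Surya",
    "Surfer and coffee lover",
    ["Sunset di Kuta hari ini", "Macet lagi di Denpasar"],
)
print(example)
```

The resulting string would then be tokenized and passed to a classifier such as a fine-tuned BERT model; that step is omitted here because the abstract does not specify it.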

Original language: English
Article number: 77
Journal: Big Data and Cognitive Computing
Volume: 6
Issue number: 3
Publication status: Published - Sep 2022

Keywords

  • BERT
  • Indonesian
  • location
  • prediction
  • Twitter
