A Semi-supervised Algorithm for Indonesian Named Entity Recognition

Rezka Aufar Leonandya, Bayu Distiawan, Nursidik Heru Praptono

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

Named Entity Recognition (NER) is a subfield of Information Extraction with applications in machine translation, question answering, the semantic web, and other tasks. One of the biggest challenges in NER is the difficulty of constructing manually labeled training data. In this work, we present a semi-supervised approach to Indonesian NER that is capable of creating high-quality training data automatically. The approach uses unlabeled data drawn from Wikipedia and DBPedia to form high-accuracy, non-redundant additional training data at each iteration of the semi-supervised process. We show that our system generates new training data and achieves an increasing F1 score as the iterations proceed.
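The iterative process described in the abstract resembles generic self-training, which can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy token-memorizing tagger, the mean-score confidence measure, the `threshold` value, and all function names (`train`, `predict`, `self_train`) are hypothetical stand-ins. A real system would replace the toy tagger with an actual NER model.

```python
from collections import Counter, defaultdict

def train(labeled):
    """Hypothetical toy tagger: count how often each token carries each label."""
    counts = defaultdict(Counter)
    for tokens, labels in labeled:
        for tok, lab in zip(tokens, labels):
            counts[tok][lab] += 1
    return counts

def predict(model, tokens):
    """Return (labels, confidence); confidence is the mean per-token score."""
    if not tokens:
        return [], 0.0
    labels, scores = [], []
    for tok in tokens:
        if tok in model:
            lab, n = model[tok].most_common(1)[0]
            labels.append(lab)
            scores.append(n / sum(model[tok].values()))
        else:
            labels.append("O")   # unseen token: guess "outside any entity"
            scores.append(0.0)   # and contribute no confidence
    return labels, sum(scores) / len(scores)

def self_train(seed, unlabeled, threshold=0.75, max_iters=5):
    """Grow the training set with confident, non-duplicate predictions."""
    labeled = list(seed)
    seen = {tuple(tokens) for tokens, _ in labeled}  # redundancy guard
    pool = list(unlabeled)
    for _ in range(max_iters):
        model = train(labeled)           # retrain on the current training set
        added, remaining = [], []
        for tokens in pool:
            key = tuple(tokens)
            if key in seen:
                continue                 # drop sentences already in the set
            labels, conf = predict(model, tokens)
            if conf >= threshold:
                added.append((tokens, labels))
                seen.add(key)
            else:
                remaining.append(tokens)
        if not added:
            break                        # nothing confident left to add
        labeled.extend(added)
        pool = remaining
    return labeled
```

The confidence threshold plays the role of the "high accuracy" filter and the `seen` set plays the role of the "non-redundant" filter. How the paper actually scores and filters candidates is not stated in the abstract.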

Original language: English
Title of host publication: Proceedings - 2015 3rd International Symposium on Computational and Business Intelligence, ISCBI 2015
Editors: Simon Fong, Suash Deb, Leo Willyanto Santoso
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 45-50
Number of pages: 6
ISBN (Electronic): 9781467385015
Publication status: Published - 14 Jan 2016
Event: 3rd International Symposium on Computational and Business Intelligence, ISCBI 2015 - Bali, Indonesia
Duration: 7 Dec 2015 to 9 Dec 2015

Publication series

Name: Proceedings - 2015 3rd International Symposium on Computational and Business Intelligence, ISCBI 2015

Conference

Conference: 3rd International Symposium on Computational and Business Intelligence, ISCBI 2015
Country/Territory: Indonesia
City: Bali
Period: 7/12/15 to 9/12/15

Keywords

  • dbpedia
  • named entity recognition
  • semisupervised
  • stanford ner
  • wikipedia
