Named Entity Recognition for the Indonesian language: Combining contextual, morphological and part-of-speech features into a knowledge engineering approach

Indra Budi, Stéphane Bressan, Gatot Wahyudi, Zainal Arifin Hasibuan, Bobby Achirul Awal Nazief

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

28 Citations (Scopus)

Abstract

We present a novel named entity recognition approach for the Indonesian language. We call the new method InNER, for Indonesian Named Entity Recognition. InNER is based on a set of rules capturing the contextual, morphological, and part-of-speech knowledge needed to recognize named entities in Indonesian texts. The InNER strategy is one of knowledge engineering: the domain- and language-specific rules are designed by expert knowledge engineers. Having shown in our previous work that mined association rules can effectively recognize named entities and outperform maximum entropy methods, we needed to evaluate how much the rule-based approach can improve when expert-crafted knowledge is used. The results are conclusive: the InNER method yields recall and precision of up to 63.43% and 71.84%, respectively. Thus, it significantly outperforms not only maximum entropy methods but also the association-rule-based method we had previously designed.
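The kind of rule the abstract describes can be illustrated with a minimal Python sketch. It is not the InNER rule set, which this record does not reproduce: the cue lists, the `tag_tokens` function, the label names, and the use of "NNP" as the proper-noun tag are all assumptions made for illustration. Each rule combines a part-of-speech check, a morphological check (capitalization), and a contextual check (the preceding word).

```python
from typing import List, Tuple

# Hypothetical cue lists for illustration only; not the rules from the paper.
PERSON_TITLES = {"Bapak", "Ibu", "Pak", "Bu"}      # honorifics that precede a person name
ORG_PREFIXES = {"PT", "Universitas", "Bank"}       # words that commonly start an organization name
LOCATION_PREPOSITIONS = {"di", "ke", "dari"}       # "in/at", "to", "from"


def tag_tokens(tagged: List[Tuple[str, str]]) -> List[str]:
    """Label each (token, pos) pair as PERSON, ORGANIZATION, LOCATION, or O.

    The POS tags are assumed to come from an external tagger; "NNP" is
    assumed to mark proper nouns.
    """
    labels = ["O"] * len(tagged)
    for i, (token, pos) in enumerate(tagged):
        # Part-of-speech and morphological checks: capitalized proper nouns only.
        if pos != "NNP" or not token[:1].isupper():
            continue
        prev = tagged[i - 1][0] if i > 0 else ""
        if token in ORG_PREFIXES:
            labels[i] = "ORGANIZATION"
        elif prev in PERSON_TITLES:            # contextual cue: preceding honorific
            labels[i] = "PERSON"
        elif prev in LOCATION_PREPOSITIONS:    # contextual cue: preceding preposition
            labels[i] = "LOCATION"
        elif i > 0 and labels[i - 1] != "O":   # continue a multi-token name
            labels[i] = labels[i - 1]
    return labels


sentence = [
    ("Pak", "NN"), ("Budi", "NNP"), ("bekerja", "VB"), ("di", "IN"),
    ("Jakarta", "NNP"), ("untuk", "IN"), ("PT", "NNP"), ("Telkom", "NNP"),
]
print(list(zip([t for t, _ in sentence], tag_tokens(sentence))))
```

In this toy sentence, "Budi" is labeled PERSON because of the preceding honorific, "Jakarta" is labeled LOCATION because of the preceding preposition, and "PT Telkom" is labeled ORGANIZATION from its prefix. A real rule set would need far more cues and exception handling than this sketch shows.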

Original language: English
Title of host publication: Discovery Science - 8th International Conference, DS 2005, Proceedings
Pages: 57-69
Number of pages: 13
Publication status: Published - 1 Dec 2005
Event: 8th International Conference on Discovery Science, DS 2005, Singapore
Duration: 8 Oct 2005 to 11 Oct 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3735 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Conference on Discovery Science, DS 2005
Country/Territory: Singapore
Period: 8/10/05 to 11/10/05
