Using statistical term similarity for sense disambiguation in cross-language information retrieval

Research output: Contribution to journal › Article › peer-review

42 Citations (Scopus)

Abstract

With the increasing availability of machine-readable bilingual dictionaries, dictionary-based automatic query translation has become a viable approach to Cross-Language Information Retrieval (CLIR). In this approach, resolving term ambiguity is a crucial step. We propose a sense disambiguation technique that uses a term-similarity measure to select the right translation sense of a query term. In addition, we apply a query expansion technique, also based on the term-similarity measure, to improve the effectiveness of the translated queries. The results of our Indonesian-to-English and English-to-Indonesian CLIR experiments demonstrate the effectiveness of the sense disambiguation technique. The query expansion technique is shown to be effective as long as the term ambiguity in the queries has been resolved. In the effort to solve the term ambiguity problem, we discovered that differences in word-formation patterns between the two languages make query translation from one language to the other difficult.
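The abstract does not specify the term-similarity measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a Dice coefficient over document co-occurrence as the similarity measure, a toy target-language collection, and hypothetical function names (`dice`, `disambiguate`). For each query term it keeps the dictionary translation that is most similar, in total, to the candidate translations of the other query terms.

```python
def dice(a, b, docs):
    """Dice coefficient of two terms over document co-occurrence.

    Assumption: this stands in for the unspecified term-similarity measure.
    """
    df_a = sum(1 for d in docs if a in d)
    df_b = sum(1 for d in docs if b in d)
    df_ab = sum(1 for d in docs if a in d and b in d)
    return 2 * df_ab / (df_a + df_b) if (df_a + df_b) else 0.0


def disambiguate(candidates, docs):
    """Pick one translation per query term.

    candidates: one list of dictionary translations per source query term.
    docs: target-language collection, each document a set of terms.
    """
    chosen = []
    for i, senses in enumerate(candidates):
        # Candidate translations of all the other query terms.
        others = [t for j, c in enumerate(candidates) if j != i for t in c]
        # Keep the sense with the highest total similarity to that context.
        best = max(senses, key=lambda s: sum(dice(s, o, docs) for o in others))
        chosen.append(best)
    return chosen


# Illustrative data only: Indonesian "bunga" (flower / interest) next to "bank".
docs = [
    {"interest", "bank", "rate"},
    {"bank", "interest"},
    {"flower", "garden"},
    {"bank", "loan"},
]
candidates = [["flower", "interest"], ["bank"]]
print(disambiguate(candidates, docs))
```

In this toy collection "interest" co-occurs with "bank" while "flower" never does, so the financial sense is selected. A real system would estimate co-occurrence from a large target-language corpus rather than from four hand-written documents.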

Original language: English
Pages (from-to): 69-80
Number of pages: 12
Journal: Information Retrieval
Volume: 2
Issue number: 1
Publication status: Published - 1 Dec 2000

Keywords

  • Cross-language information retrieval
  • Term disambiguation
