Analysis and implementation measurement of semantic similarity using content management information on WordNet

Tommy Wijaya Sagala, Theresia Wati, Solikin, Nur Fitriah Ayuning Budi, Achmad Nizar Hidayanto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

In natural language processing (NLP), measuring semantic similarity plays an important role. The results of these measurements often serve as the basis for NLP tasks such as question answering, document classification, and machine translation. This paper analyses test results, obtained on a recent dataset, for an implementation that uses content management information on WordNet, in the form of a taxonomy, to measure semantic similarity values. The implementation results are then compared with Gold Standard datasets to measure performance. The dataset used for testing is SimLex-999. Performance is measured with Pearson Correlation and Spearman Correlation; both are used because each has its own advantages and disadvantages. Based on the test results, the Seco Formula yielded Pearson and Spearman Correlations of 0.583 and 0.582, respectively, while the New Formula yielded 0.602 and 0.594, respectively. These results show a strong positive correlation. Therefore, the information content method in WordNet is feasible for measuring semantic similarity.
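The evaluation step described in the abstract, comparing computed similarity scores against gold-standard ratings with Pearson and Spearman correlation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`pearson`, `average_ranks`, `spearman`) and the sample scores are hypothetical and chosen only for illustration, and the numbers are not taken from SimLex-999 or from the paper's results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: human ratings and computed similarity for five word pairs.
gold = [1.0, 3.5, 6.0, 8.2, 9.1]
predicted = [0.10, 0.20, 0.55, 0.50, 0.90]

print(f"Pearson:  {pearson(gold, predicted):.3f}")
print(f"Spearman: {spearman(gold, predicted):.3f}")
```

Pearson correlation is sensitive to how linearly the two score scales relate, while Spearman correlation depends only on rank order. In the sample data one adjacent pair is swapped in rank, so the Spearman value is 0.9 even though the raw scores sit on different scales.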

Original language: English
Title of host publication: 2018 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 337-342
Number of pages: 6
ISBN (Electronic): 9781728101354
Publication status: Published - 2 Jul 2018
Event: 10th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018 - Yogyakarta, Indonesia
Duration: 27 Oct 2018 - 28 Oct 2018

Publication series

Name: 2018 International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018

Conference

Conference: 10th International Conference on Advanced Computer Science and Information Systems, ICACSIS 2018
Country/Territory: Indonesia
City: Yogyakarta
Period: 27/10/18 - 28/10/18

Keywords

  • Gold standard
  • Natural language processing
  • Pearson correlation
  • Semantic similarity
  • Spearman correlation
