Extracting disease-symptom relationships from health question and answer forum

Christian Halim, Alfan Farizki Wicaksono, Mirna Adriani

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Citations (Scopus)

Abstract

In this paper, we address the problem of automatically extracting disease-symptom relationships from health question-answer forums, motivated by their usefulness for medical question answering systems. We divide the main task into two subtasks because they pose different challenges: (1) disease-symptom extraction across sentences and (2) disease-symptom extraction within a sentence. For both subtasks, we employ a machine learning approach that leverages several hand-crafted features, such as syntactic features (i.e., information from part-of-speech tags) and pre-trained word vectors. We formulate the problem as a binary classification task in which we classify whether the 'indicating' relation holds between a pair of Symptom and Disease entities. To evaluate performance, we also collected and annotated a corpus containing 463 pairs of question-answer threads from several Indonesian health consultation websites. Our experiments show that, as expected, the first subtask is more difficult than the second. For the first subtask, disease-symptom relation extraction achieved an F1 measure of only 36%, while the second achieved 76%. To the best of our knowledge, this is the first work addressing this relation extraction task both across and within sentences, especially in Indonesia.
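The binary formulation and the F1 evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Entity` type, the `candidate_pairs` and `f1_score` helpers, and the toy mentions are hypothetical names and data introduced only for illustration, and the feature extraction and classifier are left out entirely. The sketch assumes each annotated mention carries its entity type and the index of the sentence it occurs in.

```python
from itertools import product
from typing import NamedTuple


class Entity(NamedTuple):
    # Hypothetical container for one annotated mention (not from the paper).
    text: str
    kind: str       # "Symptom" or "Disease"
    sentence: int   # index of the sentence containing the mention


def candidate_pairs(entities):
    """Pair every Symptom with every Disease and tag each pair's subtask."""
    symptoms = [e for e in entities if e.kind == "Symptom"]
    diseases = [e for e in entities if e.kind == "Disease"]
    pairs = []
    for s, d in product(symptoms, diseases):
        # Same sentence -> "within" subtask, otherwise "across" subtask.
        subtask = "within" if s.sentence == d.sentence else "across"
        pairs.append((s, d, subtask))
    return pairs


def f1_score(gold, predicted):
    """F1 for the positive ('indicating') class over parallel boolean lists."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy mentions, invented for illustration only.
    mentions = [
        Entity("fever", "Symptom", 0),
        Entity("headache", "Symptom", 1),
        Entity("dengue", "Disease", 1),
    ]
    for s, d, subtask in candidate_pairs(mentions):
        print(s.text, d.text, subtask)
```

Each candidate pair would then be turned into a feature vector and passed to a binary classifier; scoring the two subtasks separately with `f1_score` mirrors how the abstract reports one F1 figure per subtask.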

Original language: English
Title of host publication: Proceedings of the 2017 International Conference on Asian Language Processing, IALP 2017
Editors: Rong Tong, Yue Zhang, Yanfeng Lu, Minghui Dong
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 87-90
Number of pages: 4
ISBN (Electronic): 9781538619803
Publication status: Published - 2 Jul 2017
Event: 21st International Conference on Asian Language Processing, IALP 2017 - Singapore, Singapore
Duration: 5 Dec 2017 to 7 Dec 2017

Publication series

Name: Proceedings of the 2017 International Conference on Asian Language Processing, IALP 2017
Volume: 2018-January

Conference

Conference: 21st International Conference on Asian Language Processing, IALP 2017
Country/Territory: Singapore
City: Singapore
Period: 5/12/17 to 7/12/17

Keywords

  • Indonesian Language
  • Information Extraction
  • Machine Learning
  • Natural Language
  • Question Answering System
  • Relation Extraction
