Adjusted TextRank for keyword extraction in petrochemical project correspondence documents

Indri Atmoko, Evi Yulianti, Meganingrum Arista Jiwanggi

Research output: Contribution to journal › Article › peer-review

Abstract

A large petrochemical construction project is typically executed by multiple parties, all bound by a contractual agreement. During the execution phase, issues may arise because the work details are not clearly specified in that agreement. These issues are formally communicated and documented through written correspondence letters. By identifying important keywords within these formal letters, a comprehensive narrative of the project, including its associated issues, can be reconstructed and analyzed. In this research, we introduce an adjusted TextRank algorithm that integrates external features from the Indonesian FastText language model and term frequency-inverse document frequency (TF-IDF) scores to identify important keywords within a dataset of correspondence letters from petrochemical projects. The adjustment refines phrase detection, the estimation of semantic relationships between words, and part-of-speech (POS) identification for words or phrases. Our results show that the proposed adjustments improve F1 scores by 24.1% over the standard TextRank baseline and by 25% over the standard TF-IDF baseline.
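
The abstract does not state how the FastText and TF-IDF features enter the ranking, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that edges between co-occurring words are weighted by the cosine similarity of their embeddings, and that TF-IDF scores bias the random-jump term of the TextRank iteration. The tiny hand-written vectors stand in for Indonesian FastText embeddings, and the function names, window size, and damping factor are all hypothetical. Phrase detection and POS filtering are left out.

```python
import math
from collections import Counter, defaultdict


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document (smoothed IDF)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return scores


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def adjusted_textrank(tokens, vectors, prior, window=3, damping=0.85, iters=50):
    """Rank words with a similarity-weighted, TF-IDF-biased TextRank (assumed design)."""
    # Undirected co-occurrence graph; edge weight = embedding similarity (clamped at 0).
    weights = defaultdict(dict)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if other == word:
                continue
            sim = max(cosine(vectors[word], vectors[other]), 0.0)
            weights[word][other] = sim
            weights[other][word] = sim

    nodes = sorted(set(tokens))
    # Normalize the TF-IDF prior so it can act as a jump distribution.
    total = sum(prior.get(w, 0.0) for w in nodes)
    if total > 0:
        bias = {w: prior.get(w, 0.0) / total for w in nodes}
    else:
        bias = {w: 1.0 / len(nodes) for w in nodes}

    out_sum = {w: sum(weights[w].values()) for w in nodes}
    rank = {w: 1.0 / len(nodes) for w in nodes}
    for _ in range(iters):
        new_rank = {}
        for w in nodes:
            incoming = sum(
                rank[v] * weights[v][w] / out_sum[v]
                for v in weights[w]
                if out_sum[v] > 0
            )
            new_rank[w] = (1 - damping) * bias[w] + damping * incoming
        rank = new_rank
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: two tokenized "letters" and made-up 2-d vectors (placeholders for FastText).
letter = ["contract", "claim", "delay", "piping", "claim", "contract", "delay"]
other_letter = ["piping", "inspection", "schedule", "piping"]
toy_vectors = {
    "contract": [0.9, 0.2],
    "claim": [0.8, 0.3],
    "delay": [0.6, 0.5],
    "piping": [0.1, 0.9],
}

prior = tf_idf([letter, other_letter])[0]
ranking = adjusted_textrank(letter, toy_vectors, prior)
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

In a real setting the toy vectors would be replaced by lookups into a pretrained FastText model, and candidate words would first be filtered by POS tag and merged into phrases, as the abstract describes.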

Original language: English
Pages (from-to): 1171-1180
Number of pages: 10
Journal: Indonesian Journal of Electrical Engineering and Computer Science
Volume: 35
Issue number: 2
Publication status: Published - Aug 2024

Keywords

  • Bahasa Indonesia
  • Keyword extraction
  • Phrase detection
  • Project management
  • TextRank
