Determining subject headings of documents using information retrieval models

Research output: Contribution to journal > Article > peer-review

1 Citation (Scopus)

Abstract

A subject heading is a controlled-vocabulary term that describes the topic of a document and is important for finding and organizing library resources. Assigning appropriate subject headings to a document, however, is a time-consuming process. We therefore conduct a novel study on the effectiveness of two information retrieval models, the language model (LM) and the vector space model (VSM), in automatically generating a ranked list of relevant subject headings, with the aim of recommending headings so that librarians can assign them effectively and efficiently. Our results show that a large share of our queries (up to 61%) have relevant subject headings among the ten top-ranked recommendations and that, on average, the first relevant subject heading appears early in the ranking (3rd rank). This indicates that document retrieval methods can help the subject heading assignment process. LM and VSM show comparable performance, except when the search unit is the title, where VSM outperforms LM by 8-22%. Our further analysis reveals three faculty pairs with potential for research collaboration, because their students' theses often have overlapping subject headings: i) economy and business with social and political sciences, ii) nursing with public health, and iii) medicine with public health.
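To make the comparison concrete, the following is a minimal sketch of how a VSM and an LM could each rank subject headings for a query. It is an illustration under stated assumptions, not the authors' implementation: the toy collection `DOCS`, the TF-IDF weighting with cosine similarity, the Dirichlet smoothing with `mu=10.0`, and the rule that a heading inherits the rank of the best-scoring document carrying it are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hypothetical toy collection: (text, subject headings) pairs.
DOCS = [
    ("nursing care for elderly patients in public hospitals", ["Nursing", "Public health"]),
    ("stock market volatility and business investment", ["Economics", "Business"]),
    ("public health policy for infectious disease control", ["Public health", "Medicine"]),
]

def tokenize(text):
    return text.lower().split()

def vsm_scores(query, docs):
    """Cosine similarity between TF-IDF vectors of the query and each document."""
    tokenized = [tokenize(text) for text, _ in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((n + 1) / (d + 1)) + 1.0 for t, d in df.items()}  # smoothed idf

    def vector(tokens):
        tf = Counter(t for t in tokens if t in idf)  # ignore unseen terms
        return {t: c * idf[t] for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vector(tokenize(query))
    return [cosine(q, vector(toks)) for toks in tokenized]

def lm_scores(query, docs, mu=10.0):
    """Query log-likelihood under Dirichlet-smoothed unigram document models."""
    tokenized = [tokenize(text) for text, _ in docs]
    coll = Counter(t for toks in tokenized for t in toks)
    coll_len = sum(coll.values())
    q_terms = [t for t in tokenize(query) if t in coll]  # skip out-of-vocabulary terms
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        scores.append(sum(
            math.log((tf[t] + mu * coll[t] / coll_len) / (len(toks) + mu))
            for t in q_terms
        ))
    return scores

def rank_headings(scores, docs):
    """List headings in order of the best-scoring document that carries them."""
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    ranked = []
    for i in order:
        for heading in docs[i][1]:
            if heading not in ranked:
                ranked.append(heading)
    return ranked

def first_relevant_rank(ranked, relevant):
    """1-based rank of the first relevant heading, or None if none is found."""
    for pos, heading in enumerate(ranked, start=1):
        if heading in relevant:
            return pos
    return None

query = "public health nursing"
vsm_ranked = rank_headings(vsm_scores(query, DOCS), DOCS)
lm_ranked = rank_headings(lm_scores(query, DOCS), DOCS)
print("VSM:", vsm_ranked)
print("LM: ", lm_ranked)
```

The helper `first_relevant_rank` mirrors the kind of measure the abstract reports (the position of the first relevant heading). Checking whether any relevant heading falls within `ranked[:10]` would correspond to the top-ten figure.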

Original language: English
Pages (from-to): 1049-1058
Number of pages: 10
Journal: Indonesian Journal of Electrical Engineering and Computer Science
Volume: 23
Issue number: 2
Publication status: Published - Aug 2021

Keywords

  • Document retrieval
  • Information retrieval
  • Language model
  • Subject heading
  • Vector space model
