Cross-Lingual Summarization: English - Bahasa Indonesia

Achmad F. Abka, Mahardhika Pratama, Wisnu Jatmiko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Progress in abstractive summarization has accelerated since the introduction of sequence-to-sequence neural networks. Summarization is no longer limited to selecting words or sentences that exist in the source document, as in the extractive approach, but can generate completely new words or sentences that never appeared in the source document. A big push came from machine translation research with the introduction of attention mechanisms. The attention mechanism is the key to addressing the information bottleneck problem in the encoder-decoder model. Cross-Lingual Summarization (CLS) is the task of generating a summary in a target language from a source document in a different language. Traditional methods split this task into two steps: summarization and translation. This paper describes a study on CLS without explicitly using a translator, thereby removing one step from the existing method. We incorporate multilingual embeddings in sequence-to-sequence neural networks with attention mechanisms to handle this task. Multilingual embeddings are used to represent words as if the source language and the target language were the same language. Experiments on Amazon Fine Food review data show comparable performance between monolingual and cross-lingual summarization, as indicated by ROUGE scores that are only 1-2 points apart.
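The idea of a shared embedding space combined with attention can be illustrated with a minimal sketch. It is not the authors' implementation: the embedding values, the word list, the helper names (`SHARED_EMBEDDINGS`, `attend`), and the choice of plain dot-product attention are all assumptions made for illustration, and raw embeddings stand in for real encoder and decoder states.

```python
import math

# Hypothetical toy table: English and Indonesian words with similar
# meanings are placed close together in one shared vector space.
SHARED_EMBEDDINGS = {
    "good":   [0.90, 0.10, 0.00],
    "bagus":  [0.88, 0.12, 0.02],  # Indonesian for "good"
    "coffee": [0.10, 0.90, 0.10],
    "kopi":   [0.12, 0.87, 0.08],  # Indonesian for "coffee"
    "the":    [0.00, 0.10, 0.90],
}

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Dot-product attention (an assumed variant): returns the weights over
    source positions and the weighted-sum context vector."""
    weights = softmax([dot(query, h) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context

# English source tokens; their embeddings stand in for encoder outputs.
source = ["the", "coffee", "good"]
encoder_states = [SHARED_EMBEDDINGS[w] for w in source]

# Stand-in for the decoder state while it generates the Indonesian word "kopi".
query = SHARED_EMBEDDINGS["kopi"]
weights, context = attend(query, encoder_states)

for word, weight in zip(source, weights):
    print(f"{word}: {weight:.3f}")
```

Because "kopi" and "coffee" lie close together in the toy space, the largest attention weight falls on "coffee", which is the sense in which the decoder can treat source and target words as if they belonged to one language.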

Original language: English
Title of host publication: Proceedings - IWBIS 2021
Subtitle of host publication: 6th International Workshop on Big Data and Information Security
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 53-58
Number of pages: 6
ISBN (Electronic): 9781665424516
Publication status: Published - 2021
Event: 6th International Workshop on Big Data and Information Security, IWBIS 2021 - Virtual, Online, Indonesia
Duration: 23 Oct 2021 to 26 Oct 2021

Publication series

Name: Proceedings - IWBIS 2021: 6th International Workshop on Big Data and Information Security

Conference

Conference: 6th International Workshop on Big Data and Information Security, IWBIS 2021
Country/Territory: Indonesia
City: Virtual, Online
Period: 23/10/21 to 26/10/21

Keywords

  • attention mechanism
  • cross-lingual summarization
  • multilingual word embeddings
  • sequence-to-sequence neural network
