TY - JOUR
T1 - Word recognition and automated epenthesis removal for Indonesian sign system sentence gestures
AU - Rakun, Erdefi
AU - Gusti Bagus Hadi Widhinugraha, I.
AU - Setyono, Noer Fitria Putra
N1 - Funding Information:
This work is supported by the computing facilities at the Tokopedia-UI AI Center of Excellence and by Universitas Indonesia’s Research Grant PUTI Q3 NKB-1832/UN2.RST/HKP.05.00/2020. This support is gratefully received and acknowledged. The authors also wish to thank Lim Yohanes Stefanus PhD and M. I. Mas M.Kom for the final proofreading.
Publisher Copyright:
© 2022 Institute of Advanced Engineering and Science. All rights reserved.
PY - 2022/6
Y1 - 2022/6
AB - This research focuses on building a system to translate continuous Indonesian sign system (SIBI) gestures into text. In a continuous gesture, a signer adds epenthesis (transitional) gestures: hand movements that carry no meaning but are needed to connect the hand movement of one word with that of the next. Reducing the number of irrelevant inputs to the model through automated epenthesis removal can improve the system's ability to recognize the words in continuous gestures. We implemented threshold conditional random fields (TCRF) to identify epenthesis gestures. The dataset consists of 2,255 videos representing 28 common sentences in SIBI. The translation system consists of MobileNetV2 for feature extraction, removal of the epenthesis gestures found by the TCRF, and a long short-term memory (LSTM) network as the classifier. With the MobileNetV2-TCRF-bidirectional LSTM model, the best word error rate (WER) and sentence accuracy (SAcc) were 33.4% and 16.2%, respectively. Intermediate-stage processing steps, consisting of sandwiched majority voting of the TCRF and the removal of word labels spanning fewer than two frames, along with LSTM output grouping, reduced WER from 33.4% to 3.4% and increased SAcc from 16.2% to 80.2%.
KW - Epenthesis gesture
KW - Long short-term memory
KW - SIBI
KW - Sign language recognition
KW - Threshold conditional random field
UR - http://www.scopus.com/inward/record.url?scp=85131639858&partnerID=8YFLogxK
U2 - 10.11591/ijeecs.v26.i3.pp1402-1414
DO - 10.11591/ijeecs.v26.i3.pp1402-1414
M3 - Article
AN - SCOPUS:85131639858
SN - 2502-4752
VL - 26
SP - 1402
EP - 1414
JO - Indonesian Journal of Electrical Engineering and Computer Science
JF - Indonesian Journal of Electrical Engineering and Computer Science
IS - 3
ER -