TY - JOUR
T1 - Infant cry classification using CNN - RNN
AU - Nadia Maghfira, Tusty
AU - Basaruddin, T.
AU - Krisnadhi, Adila
N1 - Publisher Copyright: © Published under licence by IOP Publishing Ltd.
PY - 2020/6/9
Y1 - 2020/6/9
AB - The study of infant cry recognition aims to identify what an infant needs from her cry. Different crying sounds can give caregivers a clue about how to respond to the infant's needs. Appropriate responses to infant cries may influence the infant's emotional, behavioral, and relational development while growing up. From a pattern recognition perspective, recognizing particular needs or emotions from an infant cry is much more difficult than recognizing emotions from an adult's speech because an infant cry usually does not contain verbal information. In this paper, we study the problem of classifying five different types of emotions or needs expressed by infant cries, namely hunger, sleepiness, discomfort, stomachache, and indications that the infant wants to burp. We propose a novel approach using a combination of Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) that acts as both feature extractor and classifier. In particular, the CNN learns salient features from raw spectrogram information, and the RNN learns temporal information from the CNN-extracted features. We also apply 5-fold cross-validation on a training set of 200 and a validation set of 50. The model with the best weights is tested on a test set of 65. Evaluation on the Dunstan Baby Language dataset shows that our CNN-RNN model outperforms the previous method, with an average classification accuracy of up to 94.97%. This encouraging result demonstrates that applying CNN-RNN with 5-fold cross-validation yields accurate and robust results.
UR - http://www.scopus.com/inward/record.url?scp=85087058534&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/1528/1/012019
DO - 10.1088/1742-6596/1528/1/012019
M3 - Conference article
AN - SCOPUS:85087058534
SN - 1742-6588
VL - 1528
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
IS - 1
M1 - 012019
T2 - 4th International Seminar on Sensors, Instrumentation, Measurement and Metrology, ISSIMM 2019
Y2 - 14 November 2019
ER -