Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/1708
Full metadata record
DC Field: Value [Language]
dc.contributor.author: Zayed, Yara$AAUP$Palestinian [-]
dc.contributor.author: Hasasneh, Ahmad$AAUP$Palestinian [-]
dc.contributor.author: Tadj, Chakib$Other$Other [-]
dc.date.accessioned: 2023-10-05T07:51:42Z [-]
dc.date.available: 2023-10-05T07:51:42Z [-]
dc.date.issued: 2023-06-19 [-]
dc.identifier.citation: Zayed, Y.; Hasasneh, A.; Tadj, C. Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features. Diagnostics 2023, 13, 2107. https://doi.org/10.3390/diagnostics13122107 [en_US]
dc.identifier.uri: http://repository.aaup.edu/jspui/handle/123456789/1708 [-]
dc.description.abstract: Early diagnosis of medical conditions in infants is crucial for ensuring timely and effective treatment. However, infants are unable to verbalize their symptoms, making it difficult for healthcare professionals to accurately diagnose their conditions. Crying is often the only way for infants to communicate their needs and discomfort. In this paper, we propose a medical diagnostic system for interpreting infants’ cry audio signals (CAS) using a combination of different audio domain features and deep learning (DL) algorithms. The proposed system utilizes a dataset of labeled audio signals from infants with specific pathologies. The dataset covers two infant pathologies with high mortality rates, neonatal respiratory distress syndrome (RDS) and sepsis, alongside healthy crying. The system employed the harmonic ratio (HR) as a prosodic feature, the Gammatone frequency cepstral coefficients (GFCCs) as a cepstral feature, and image-based features extracted from the spectrogram using a pretrained convolutional neural network (CNN) model; these are fused with the other features so that multiple audio domains contribute to improving the classification rate and the accuracy of the model. The different combinations of the fused features are then fed into multiple machine learning algorithms, including random forest (RF), support vector machine (SVM), and deep neural network (DNN) models. The evaluation of the system using accuracy, precision, recall, F1-score, the confusion matrix, and the receiver operating characteristic (ROC) curve showed promising results for the early diagnosis of medical conditions in infants based on crying signals alone; the system achieved its highest accuracy of 97.50% using the combination of the spectrogram, HR, and GFCC through the deep learning process. The findings demonstrated the importance of fusing different audio features, especially the spectrogram, through the learning process rather than by simple concatenation, and of using deep learning algorithms to extract sparsely represented features for the subsequent classification stage, which improves the separation between different infant pathologies. The results outperformed the published benchmark paper by extending the classification problem to a multiclass setting (RDS, sepsis, and healthy), investigating a new type of feature, the spectrogram, and applying a new feature fusion technique, namely fusion through the learning process within the deep learning model. [en_US] (An illustrative sketch of the learned feature fusion appears after the metadata record below.)
dc.language.iso: en [en_US]
dc.publisher: Diagnostics - MDPI [en_US]
dc.subject: infant’s crying diagnosis [en_US]
dc.subject: audio domains features [en_US]
dc.subject: HR [en_US]
dc.subject: GFCC [en_US]
dc.subject: machine learning [en_US]
dc.subject: deep learning [en_US]
dc.subject: spectrogram [en_US]
dc.title: Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features [en_US]
dc.type: Article [en_US]
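
The abstract above describes fusing a prosodic feature (HR), a cepstral feature (GFCC), and spectrogram-based CNN features "through the learning process" before a three-class decision (RDS, sepsis, healthy). The following Python sketch is a minimal illustration of that fusion idea, not the authors' implementation; the feature dimensions, layer sizes, and use of Keras are assumptions made only for this example.

```python
# Hypothetical sketch of "fusion through the learning process": three feature
# branches (spectrogram embedding, GFCC, HR) are combined inside a trainable
# network rather than by simple concatenation alone.
# All dimensions and layer sizes below are assumptions, not the paper's values.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

N_CLASSES = 3        # RDS, sepsis, healthy (as in the abstract)
SPEC_EMB_DIM = 512   # assumed size of a pretrained-CNN spectrogram embedding
GFCC_DIM = 39        # assumed number of GFCC coefficients
HR_DIM = 1           # harmonic ratio summarized as a single prosodic value

# One input branch per audio-domain feature.
spec_in = layers.Input(shape=(SPEC_EMB_DIM,), name="spectrogram_embedding")
gfcc_in = layers.Input(shape=(GFCC_DIM,), name="gfcc")
hr_in = layers.Input(shape=(HR_DIM,), name="harmonic_ratio")

# Small branch-specific projections before fusion.
spec_branch = layers.Dense(128, activation="relu")(spec_in)
gfcc_branch = layers.Dense(32, activation="relu")(gfcc_in)
hr_branch = layers.Dense(8, activation="relu")(hr_in)

# Fusion happens inside the network: the concatenated branches feed further
# trainable layers, so the combination itself is learned.
fused = layers.Concatenate()([spec_branch, gfcc_branch, hr_branch])
fused = layers.Dense(64, activation="relu")(fused)
fused = layers.Dropout(0.3)(fused)
out = layers.Dense(N_CLASSES, activation="softmax")(fused)

model = Model(inputs=[spec_in, gfcc_in, hr_in], outputs=out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy data only to show the expected shapes; real inputs would come from a
# pretrained CNN over spectrogram images, GFCC extraction, and HR estimation.
n = 16
x = [np.random.rand(n, SPEC_EMB_DIM),
     np.random.rand(n, GFCC_DIM),
     np.random.rand(n, HR_DIM)]
y = np.random.randint(0, N_CLASSES, size=n)
model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

Letting dense layers re-weight each feature domain during training is the distinction the abstract draws against plain feature concatenation followed by a separate classifier.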
Appears in Collections: Faculty & Staff Scientific Research publications

Files in This Item:
File: diagnostics-13-02107 (1).pdf
Size: 6.53 MB
Format: Adobe PDF


