Comparing Neural Networks for Speech Emotion Recognition in Customer Service Interactions

Research output: Chapter in Book/Report/Conference proceeding; Conference Article in proceeding; Academic; peer-review


Automatic speech emotion recognition (SER) may assist call center service employees in deciphering and regulating customer emotions. To contribute to a successful augmentation of service employees with AI, the main goal of this study is to identify effective machine learning approaches for classifying discrete basic emotions in customer service conversations. The study compares the recognition performance of different neural network architectures on speech features extracted from service interactions in a naturalistic customer service setting. Baseline classifiers, including a zero-rule classifier, a random classifier, a frequency classifier, and non-sequential multi-class classifiers, are compared to different neural network architectures. A multi-layer perceptron (MLP), a one-dimensional convolutional neural network (CNN), and a neural machine translation (NMT) model outperform the baseline classifiers, suggesting a pattern in the data relating to emotion labels. While the neural machine translation model with attention attains the highest F1-score, no significant difference in performance among the neural networks is detected. The results therefore support the use of the multi-label multi-layer perceptron as the simplest model.
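As a rough illustration of the three simplest baselines named in the abstract, the sketch below implements a zero-rule, a random, and a frequency classifier in plain Python, together with an F1 computation. The abstract does not define these baselines or state how F1 was averaged, so the definitions here (majority class, uniform sampling, sampling by training-label frequency, macro-averaged F1) are assumptions, and the emotion labels in the usage example are hypothetical toy data, not the study's data.

```python
import random
from collections import Counter


def zero_rule_predict(train_labels, n):
    """Always predict the most frequent training label (assumed definition)."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n


def random_predict(train_labels, n, rng):
    """Predict uniformly at random over the observed label set (assumed definition)."""
    classes = sorted(set(train_labels))
    return [rng.choice(classes) for _ in range(n)]


def frequency_predict(train_labels, n, rng):
    """Sample predictions in proportion to training label frequencies (assumed definition)."""
    counts = Counter(train_labels)
    classes = sorted(counts)
    weights = [counts[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, for illustration only.
train = ["neutral"] * 6 + ["anger"] * 3 + ["joy"] * 1
test = ["neutral", "neutral", "anger", "joy"]
rng = random.Random(0)  # fixed seed so the sampled baselines are repeatable

for name, preds in [
    ("zero-rule", zero_rule_predict(train, len(test))),
    ("random", random_predict(train, len(test), rng)),
    ("frequency", frequency_predict(train, len(test), rng)),
]:
    print(f"{name}: macro F1 = {macro_f1(test, preds):.3f}")
```

Any learned model that captures a real relationship between speech features and emotion labels would be expected to score above all three of these reference points on held-out data, which is the sense in which the neural networks are said to outperform the baselines.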
Original language: English
Title of host publication: 2022 International Joint Conference on Neural Networks (IJCNN)
Number of pages: 8
ISBN (Electronic): 978-1-7281-8671-9
ISBN (Print): 978-1-6654-9526-4
Publication status: Published - 30 Sept 2022
Event: 2022 International Joint Conference on Neural Networks - Padua, Italy
Duration: 18 Jul 2022 to 23 Jul 2022


Conference: 2022 International Joint Conference on Neural Networks
Abbreviated title: IJCNN 2022

