Comparing Neural Networks for Speech Emotion Recognition in Customer Service Interactions

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Automatic speech emotion recognition (SER) may assist call center service employees in deciphering and regulating customer emotions. In order to contribute to a successful augmentation of service employees with AI, the main goal of this study is to identify effective machine learning approaches to classify discrete basic emotions in customer service conversations. A comparison is presented of the recognition performance of different neural network architectures on speech features extracted from service interactions in a naturalistic customer service setting. Baseline classifiers, including a zero-rule classifier, a random classifier, a frequency classifier, and nonsequential multi-class classifiers, are compared to different neural network architectures. A multi-layer perceptron (MLP), a one-dimensional convolutional neural network (CNN), and a neural machine translation (NMT) model outperform the baseline classifiers, suggesting a pattern in the data relating to emotion labels. While the neural machine translation model with attention attains the highest F1-score, no significant difference in performance among the neural networks is detected. Results therefore support the use of the multi-label multi-layer perceptron as the simplest model.
Original language: English
Title of host publication: 2022 International Joint Conference on Neural Networks (IJCNN)
Publisher: IEEE
Pages: 1-8
Number of pages: 8
ISBN (Electronic): 978-1-7281-8671-9
ISBN (Print): 978-1-6654-9526-4
Publication status: Published - 30 Sep 2022
Event: 2022 International Joint Conference on Neural Networks - Padua, Italy
Duration: 18 Jul 2022 to 23 Jul 2022

Conference

Conference: 2022 International Joint Conference on Neural Networks
Abbreviated title: IJCNN 2022
Country/Territory: Italy
City: Padua
Period: 18/07/22 to 23/07/22
