Abstract
Automatic speech emotion recognition (SER) may assist call center service employees in deciphering and regulating customer emotions. To contribute to the successful augmentation of service employees with AI, the main goal of this study is to identify effective machine learning approaches for classifying discrete basic emotions in customer service conversations. The study compares the recognition performance of different neural network architectures on speech features extracted from service interactions in a naturalistic customer service setting. Baseline classifiers, including a zero-rule classifier, a random classifier, a frequency classifier, and nonsequential multi-class classifiers, are compared to different neural network architectures. A multi-layer perceptron (MLP), a one-dimensional convolutional neural network (CNN), and a neural machine translation (NMT) model outperform the baseline classifiers, suggesting a pattern in the data relating to emotion labels. While the neural machine translation model with attention attains the highest F1-score, no significant difference in performance among the neural networks is detected. The results therefore support the use of the multi-label multi-layer perceptron as the simplest model.
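The three trivial baselines named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the readings used here are assumptions: zero-rule always predicts the most frequent training label, the random classifier samples labels uniformly, and the frequency classifier samples labels in proportion to their training frequency. The emotion labels, the function names, and the choice of macro-averaged F1 are likewise hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def zero_rule(train_labels, n):
    """Assumed reading of 'zero-rule': always predict the majority class."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n


def random_uniform(train_labels, n, rng):
    """Assumed reading of 'random': sample each class with equal probability."""
    classes = sorted(set(train_labels))
    return [rng.choice(classes) for _ in range(n)]


def frequency(train_labels, n, rng):
    """Assumed reading of 'frequency': sample classes by training frequency."""
    counts = Counter(train_labels)
    classes = sorted(counts)
    weights = [counts[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (the averaging scheme is an assumption)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical, hand-made labels for illustration only (not data from the study).
train = ["neutral"] * 6 + ["anger"] * 3 + ["joy"] * 1
test = ["neutral", "neutral", "anger", "joy"]
rng = random.Random(0)

baselines = {
    "zero-rule": zero_rule(train, len(test)),
    "random": random_uniform(train, len(test), rng),
    "frequency": frequency(train, len(test), rng),
}
for name, preds in baselines.items():
    print(name, round(macro_f1(test, preds), 3))
```

On these toy labels the zero-rule baseline scores a macro F1 of 2/9, because it only ever recovers the majority class. A learned model would need to clearly exceed such scores before its predictions could be taken as evidence of a pattern in the data.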
Original language | English |
---|---|
Title of host publication | 2022 International Joint Conference on Neural Networks (IJCNN) |
Publisher | IEEE |
Pages | 1-8 |
Number of pages | 8 |
ISBN (Electronic) | 978-1-7281-8671-9 |
ISBN (Print) | 978-1-6654-9526-4 |
Publication status | Published - 30 Sept 2022 |
Event | 2022 International Joint Conference on Neural Networks - Padua, Italy; Duration: 18 Jul 2022 → 23 Jul 2022 |
Conference
Conference | 2022 International Joint Conference on Neural Networks |
---|---|
Abbreviated title | IJCNN 2022 |
Country/Territory | Italy |
City | Padua |
Period | 18/07/22 → 23/07/22 |
Keywords
- Call center service interactions
- Deep neural networks
- Neural machine translation
- Speech emotion recognition