Informative Speech Features based on Emotion Classes and Gender in Explainable Speech Emotion Recognition

Research output: Chapter in Book/Report/Conference proceeding; Conference Article in proceeding; Academic; peer-review

Abstract

Emotions manifest in various aspects of human speech. While the tonality of speech is a crucial indicator of emotion, other aspects such as word selection, pronunciation, and other paralinguistic features also provide valuable insights. Some of these aspects are considered universal, whereas others are influenced by cultural and personal factors, with gender being one of the most significant factors affecting emotional expression. In this study, we investigated the effect of gender on emotional descriptors in speech. Specifically, we used intelligible paralinguistic speech features in Speech Emotion Recognition and employed Shapley values to measure the effect of gender on speech features. Furthermore, we empirically evaluated whether a reduced set of informative features could provide sufficient information for emotion recognition. Additionally, we investigated how gender influences auditory expressions of emotions. Our experiments show that, besides its physical impact on fundamental speech frequencies, gender also affects how emotional phrases are spoken and how prosody and phonology change. In addition, reducing the input according to feature informativeness has no significant effect on model accuracy, while shrinking the input size by 98% on average. Finally, our comparative experiments on genders show that some speech features are more informative for capturing particular emotions exhibited by different genders. Therefore, we report that with a multi-layer feature set that consists of obscure and interpretable paralinguistic features, a novel data fusion approach could yield an explainable speech emotion recognition model. Furthermore, it is possible to reduce the input size and computational requirements by applying feature reduction and using gender information in speech emotion recognition tasks.
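
The abstract mentions Shapley values as the measure of feature informativeness and a reduced feature set built from them. The following is a minimal sketch of that general idea, not the authors' pipeline: it computes exact Shapley values for a tiny linear scorer and keeps the top-ranked features. The feature names, weights, baseline, sample values, and the `score` function are all hypothetical and chosen only for illustration. Exact enumeration is feasible only for a handful of features; larger feature sets would need an approximation.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a coalition value function over n_features."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Shapley weight of any coalition of this size that excludes i.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy setup: names and numbers are illustrative assumptions.
FEATURES = ["pitch_mean", "energy", "jitter", "speech_rate"]
WEIGHTS = [2.0, 0.5, 0.0, -1.5]
BASELINE = [0.0, 0.0, 0.0, 0.0]
SAMPLE = [1.0, 2.0, 3.0, 1.0]


def score(present):
    # Features outside the coalition are replaced by their baseline value.
    return sum(
        w * (SAMPLE[j] if j in present else BASELINE[j])
        for j, w in enumerate(WEIGHTS)
    )


phi = shapley_values(score, len(FEATURES))

# Rank features by absolute Shapley value and keep the two most informative.
ranked = sorted(range(len(FEATURES)), key=lambda j: abs(phi[j]), reverse=True)
top_k = [FEATURES[j] for j in ranked[:2]]
```

For a linear scorer like this one, each Shapley value reduces to the weight times the difference between the sample and baseline value, which makes the result easy to check by hand.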
Original language: English
Title of host publication: 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW)
Publisher: IEEE
ISBN (Print): 979-8-3503-2745-8
Publication status: Published - 16 Jan 2024
Event: 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos: ACIIW 2023 - MIT Media Lab, Cambridge, United States
Duration: 10 Sept 2023 to 13 Sept 2023
https://acii-conf.net/2023/

Conference

Conference: 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos
Country/Territory: United States
City: Cambridge
Period: 10/09/23 to 13/09/23

Keywords

  • TRUSTWORTHY_AI
  • AI
  • Affective Computing
  • Speech Emotion Recognition
