Towards Real-time Speech Emotion Recognition for Affective E-Learning

Kiavash Bahreini, Rob Nadolski, Wim Westera

    Research output: Contribution to journal › Article › Academic › peer-review


    Abstract

    This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learners' vocal intonations and facial expressions in order to foster their learning. Whereas the facial emotion recognition part was successfully tested in a previous study, the present study describes the development and testing of FILTWAM's vocal emotion recognition software artefact. The main goal of this study was to show the valid use of computer microphone data for real-time and adequate interpretation of vocal intonations into extracted emotional states. The software was tested in a study with twelve participants. All participants individually received the same computer-based tasks, in which they were requested eighty times to mimic specific vocal expressions (960 occurrences in total). Each individual session was recorded on video. To validate the voice emotion recognition software artefact, two experts annotated and rated the participants' recorded behaviours. The expert findings were then compared with the software's recognition results, showing an overall agreement (Kappa) of 0.743. The overall accuracy of the voice emotion recognition software artefact, based on the requested emotions versus the recognized emotions, is 67%. Our FILTWAM software continually and unobtrusively observes learners' behaviours and transforms these behaviours into emotional states. This paves the way for unobtrusive, real-time capturing of learners' emotional states to enhance adaptive e-learning approaches.
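    The 0.743 figure above is a chance-corrected inter-rater agreement statistic (Kappa). As background only, here is a minimal sketch of how such a statistic is computed for two raters; the function name and the toy emotion labels are illustrative assumptions, not the study's data:

    ```python
    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        """Chance-corrected agreement between two raters over the same items."""
        assert len(rater_a) == len(rater_b)
        n = len(rater_a)
        # Observed agreement: fraction of items both raters labelled identically.
        p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        # Expected chance agreement from each rater's marginal label frequencies.
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        labels = set(rater_a) | set(rater_b)
        p_e = sum((freq_a[l] / n) * (freq_b[l] / n) for l in labels)
        return (p_o - p_e) / (1 - p_e)

    # Hypothetical emotion labels from two raters (not the study's data):
    a = ["happy", "sad", "angry", "happy", "neutral", "happy"]
    b = ["happy", "sad", "happy", "happy", "neutral", "sad"]
    print(round(cohens_kappa(a, b), 3))  # → 0.5
    ```

    Values near 1 indicate strong agreement beyond chance; the study's 0.743 therefore reflects substantial agreement between the expert annotations and the software output.
    
    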
    Original language: English
    Pages (from-to): 1367-1386
    Journal: Education and Information Technologies
    Volume: 21
    Issue number: 5
    DOIs: 10.1007/s10639-015-9388-2
    Publication status: Published - 15 Apr 2015


    Keywords

    • Speech interaction
    • Affective computing
    • Speech emotion recognition
    • Real-time software development
    • Evaluation methodology
    • Empirical study of user behaviour
    • E-learning
    • Microphone

    Cite this

    @article{744b60473f5a44e1bc53d5ca2dabff39,
    title = "Towards Real-time Speech Emotion Recognition for Affective E-Learning",
    abstract = "This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learners' vocal intonations and facial expressions in order to foster their learning. Whereas the facial emotion recognition part was successfully tested in a previous study, the present study describes the development and testing of FILTWAM's vocal emotion recognition software artefact. The main goal of this study was to show the valid use of computer microphone data for real-time and adequate interpretation of vocal intonations into extracted emotional states. The software was tested in a study with twelve participants. All participants individually received the same computer-based tasks, in which they were requested eighty times to mimic specific vocal expressions (960 occurrences in total). Each individual session was recorded on video. To validate the voice emotion recognition software artefact, two experts annotated and rated the participants' recorded behaviours. The expert findings were then compared with the software's recognition results, showing an overall agreement (Kappa) of 0.743. The overall accuracy of the voice emotion recognition software artefact, based on the requested emotions versus the recognized emotions, is 67{\%}. Our FILTWAM software continually and unobtrusively observes learners' behaviours and transforms these behaviours into emotional states. This paves the way for unobtrusive, real-time capturing of learners' emotional states to enhance adaptive e-learning approaches.",
    keywords = "Speech interaction, Affective computing, Speech emotion recognition, Real-time software development, Evaluation methodology, Empirical study of user behaviour, E-learning, Microphone",
    author = "Kiavash Bahreini and Rob Nadolski and Wim Westera",
    note = "DS_Description: The original article is available as an open access file on the Springer website at the following link: http://link.springer.com/article/10.1007/s10639-015-9388-2 DS_Citation: Bahreini, K., Nadolski, R., & Westera, W. (2016). Towards Real-Time Speech Emotion Recognition for Affective E-Learning. Education and Information Technologies, 21(5), 1367-1386. Springer US. DOI: 10.1007/s10639-015-9388-2. Print ISSN: 1360-2357",
    year = "2015",
    month = "4",
    day = "15",
    doi = "10.1007/s10639-015-9388-2",
    language = "English",
    volume = "21",
    pages = "1367--1386",
    journal = "Education and Information Technologies",
    issn = "1360-2357",
    publisher = "Springer US",
    number = "5",

    }

    Towards Real-time Speech Emotion Recognition for Affective E-Learning. / Bahreini, Kiavash; Nadolski, Rob; Westera, Wim.

    In: Education and Information Technologies, Vol. 21, No. 5, 15.04.2015, p. 1367-1386.


    TY - JOUR

    T1 - Towards Real-time Speech Emotion Recognition for Affective E-Learning

    AU - Bahreini, Kiavash

    AU - Nadolski, Rob

    AU - Westera, Wim

    N1 - DS_Description: The original article is available as an open access file on the Springer website at the following link: http://link.springer.com/article/10.1007/s10639-015-9388-2 DS_Citation: Bahreini, K., Nadolski, R., & Westera, W. (2016). Towards Real-Time Speech Emotion Recognition for Affective E-Learning. Education and Information Technologies, 21(5), 1367-1386. Springer US. DOI: 10.1007/s10639-015-9388-2. Print ISSN: 1360-2357

    PY - 2015/4/15

    Y1 - 2015/4/15

    N2 - This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learners' vocal intonations and facial expressions in order to foster their learning. Whereas the facial emotion recognition part was successfully tested in a previous study, the present study describes the development and testing of FILTWAM's vocal emotion recognition software artefact. The main goal of this study was to show the valid use of computer microphone data for real-time and adequate interpretation of vocal intonations into extracted emotional states. The software was tested in a study with twelve participants. All participants individually received the same computer-based tasks, in which they were requested eighty times to mimic specific vocal expressions (960 occurrences in total). Each individual session was recorded on video. To validate the voice emotion recognition software artefact, two experts annotated and rated the participants' recorded behaviours. The expert findings were then compared with the software's recognition results, showing an overall agreement (Kappa) of 0.743. The overall accuracy of the voice emotion recognition software artefact, based on the requested emotions versus the recognized emotions, is 67%. Our FILTWAM software continually and unobtrusively observes learners' behaviours and transforms these behaviours into emotional states. This paves the way for unobtrusive, real-time capturing of learners' emotional states to enhance adaptive e-learning approaches.

    AB - This paper presents the voice emotion recognition part of the FILTWAM framework for real-time emotion recognition in affective e-learning settings. FILTWAM (Framework for Improving Learning Through Webcams And Microphones) intends to offer timely and appropriate online feedback based upon learners' vocal intonations and facial expressions in order to foster their learning. Whereas the facial emotion recognition part was successfully tested in a previous study, the present study describes the development and testing of FILTWAM's vocal emotion recognition software artefact. The main goal of this study was to show the valid use of computer microphone data for real-time and adequate interpretation of vocal intonations into extracted emotional states. The software was tested in a study with twelve participants. All participants individually received the same computer-based tasks, in which they were requested eighty times to mimic specific vocal expressions (960 occurrences in total). Each individual session was recorded on video. To validate the voice emotion recognition software artefact, two experts annotated and rated the participants' recorded behaviours. The expert findings were then compared with the software's recognition results, showing an overall agreement (Kappa) of 0.743. The overall accuracy of the voice emotion recognition software artefact, based on the requested emotions versus the recognized emotions, is 67%. Our FILTWAM software continually and unobtrusively observes learners' behaviours and transforms these behaviours into emotional states. This paves the way for unobtrusive, real-time capturing of learners' emotional states to enhance adaptive e-learning approaches.

    KW - Speech interaction

    KW - Affective computing

    KW - Speech emotion recognition

    KW - Real-time software development

    KW - Evaluation methodology

    KW - Empirical study of user behaviour

    KW - E-learning

    KW - Microphone

    U2 - 10.1007/s10639-015-9388-2

    DO - 10.1007/s10639-015-9388-2

    M3 - Article

    VL - 21

    SP - 1367

    EP - 1386

    JO - Education and Information Technologies

    JF - Education and Information Technologies

    SN - 1360-2357

    IS - 5

    ER -