Data Fusion for Real-time Multimodal Emotion Recognition through Webcams and Microphones in E-Learning

Kiavash Bahreini, Rob Nadolski, Wim Westera

    Research output: Contribution to journal, Article, Academic, peer-review


    This paper describes the validation study of our software, which uses combined webcam and microphone data for real-time, continuous, unobtrusive emotion recognition as part of our FILTWAM framework. FILTWAM aims to deploy a real-time multimodal emotion recognition method that provides more adequate feedback to learners during online communication skills training. Such training requires timely feedback that reflects on the emotions learners intended to show and that also increases their awareness of their own behaviour. At a minimum, a reliable and valid software interpretation of performed facial and vocal emotions is needed to warrant such feedback. This validation study therefore calibrates our software. The study uses a multimodal fusion method. Twelve test persons performed computer-based tasks in which they were asked to mimic specific facial and vocal emotions. All test persons’ behaviour was recorded on video, and two raters independently scored the shown emotions, which were contrasted with the software recognition outcomes. A hybrid method for multimodal fusion in our software shows accuracy between 96.1% and 98.6% for the best-chosen WEKA classifiers over predicted emotions. The software fulfils its requirements of real-time data interpretation and reliable results.
    Original language: English
    Pages (from-to): 415-430
    Journal: International Journal of Human-Computer Interaction
    Issue number: 5
    Publication status: Published - 2 Mar 2016


    • Hybrid Data Fusion
    • Multimodal Emotion Recognition
    • Emotion Detection
    • Real-time Software Development
    • Software Development
    • Data Mining
    • WEKA Classifiers
    • Machine Learning
    • Webcam
    • Microphone

