Machine learning approach for emotion recognition in speech

Gjoreski, Martin; Gjoreski, Hristijan; Kulakov, Andrea

Journal

Informatica

Date Issued

2014

Author(s)

Gjoreski, Martin

Gjoreski, Hristijan

Abstract

This paper presents a machine learning approach to the automatic recognition of human emotions from
speech. The approach consists of three steps. First, numerical features are extracted from the sound
database using an audio feature extractor. Then, a feature selection method is used to select the most
relevant features. Finally, a machine learning model is trained to recognize seven universal emotions:
anger, fear, sadness, happiness, boredom, disgust and neutral. A thorough experimental analysis is
performed for each step. The results showed that 300 (out of 1582) features, as ranked by the gain ratio,
are sufficient for achieving 86% accuracy when evaluated with 10-fold cross-validation. SVM achieved
the highest accuracy when compared to KNN and Naive Bayes. We additionally compared the accuracy
of the standard SVM (with default parameters) and the one enhanced by Auto-WEKA (optimized
algorithm parameters) using the leave-one-speaker-out technique. The results showed that the SVM
enhanced with Auto-WEKA achieved significantly better accuracy than the standard SVM: 77% versus
73%. Finally, the results achieved with 10-fold cross-validation are comparable to those achieved by
human listeners, i.e., 86% accuracy in both cases. Moreover, low-energy emotions (boredom, sadness
and disgust) are recognized better by our machine learning approach than by humans.
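
The feature ranking and speaker-independent evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data and the assumption that features are already discretized are all hypothetical and chosen only for illustration. Numeric acoustic features would need to be discretized before a gain ratio of this form can be computed.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return (entropy(labels) - conditional) / split_info


def rank_features(columns, labels, top_k):
    """Return the names of the top_k features by decreasing gain ratio."""
    ranked = sorted(columns, key=lambda name: gain_ratio(columns[name], labels), reverse=True)
    return ranked[:top_k]


def leave_one_speaker_out(speakers):
    """Yield (held_out_speaker, train_indices, test_indices) for each speaker."""
    for held_out in sorted(set(speakers)):
        train = [i for i, s in enumerate(speakers) if s != held_out]
        test = [i for i, s in enumerate(speakers) if s == held_out]
        yield held_out, train, test


# Hypothetical toy data: four utterances, two discretized features, two speakers.
labels = ["anger", "anger", "sadness", "sadness"]
columns = {
    "pitch_bin": ["high", "high", "low", "low"],  # separates the two emotions
    "noise_bin": ["a", "b", "a", "b"],            # unrelated to the emotion
}
speakers = ["s1", "s2", "s1", "s2"]

selected = rank_features(columns, labels, top_k=1)
splits = list(leave_one_speaker_out(speakers))
```

In the paper's setting, the ranking would keep the top 300 of 1582 features, and a classifier such as an SVM would be trained on each training split and scored on the held-out speaker. Holding out a whole speaker prevents the model from being tested on a voice it has already heard, which is consistent with the lower accuracy the abstract reports for this protocol than for 10-fold cross-validation.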

Subjects

machine learning, emo...

File(s)

Name

719-722-1-PB.pdf

Size

152.54 KB

Format

Adobe PDF

Checksum

(MD5):abda7f95fc0c91a86904c229803ed17f