Language Agnostic Voice Recognition Model
Date Issued
2022
Author(s)
Janeva, Tea
Simjanoska, Monika
Abstract
Voice recognition is the ability of a machine to
identify a person based on their unique voiceprint. As this
task becomes increasingly important in people's everyday
lives, this paper tests different approaches for its
implementation. Using a multilanguage database and working
with the characteristics of different frequencies, five machine
learning models (Random Forest, XGBoost, MLP, SVM,
and Gradient Boosting), along with a CNN deep learning model,
were implemented. The models were trained on three different
tasks: gender prediction, age range prediction, and combined
gender and age range prediction. The models were evaluated
using accuracy, F1-score, and the Matthews correlation
coefficient (MCC). The results showed that Random Forest
outperforms the other models, achieving an accuracy of more
than 0.9 on all three classification tasks.
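The three evaluation metrics named in the abstract can be illustrated with a minimal sketch for the two-class case (such as gender prediction). The function name `binary_metrics` and the label lists are hypothetical and chosen only for illustration. They are not taken from the paper, and the paper's own evaluation code may differ.

```python
from math import sqrt


def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, F1, MCC) for two-class labels.

    Illustrative helper only; not the authors' implementation.
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # MCC uses all four counts, so it stays informative on imbalanced classes.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f1, mcc


# Hypothetical labels, not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
accuracy, f1, mcc = binary_metrics(y_true, y_pred)
```

Accuracy and F1 range from 0 to 1, while MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction. This is one reason MCC is often reported alongside accuracy.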
Subjects
File(s)
Name
CIIT_2022_1.pdf
Size
259.68 KB
Format
Adobe PDF
Checksum
(MD5):d5446832e532cc1f8ef16a337c818b67
