Macedonian Speech Synthesis for Assistive Technology Applications
Date Issued
2022-05-18
Author(s)
Bojan Sofronievski
Elena Velovska
Martin Velichkovski
Violeta Argirova
Tea Veljkovikj
Risto Chavdarov
Stefan Janev
Kristijan Lazarev
Toni Bachvarovski
Zoran Ivanovski
Dimitar Tashkovski
Branislav Gerazov
Abstract
Speech technology is becoming ever more ubiquitous with the advance of
speech-enabled devices and services. The use of speech synthesis in Augmentative
and Alternative Communication tools has facilitated the inclusion of individuals
with speech impediments, allowing them to communicate with their surroundings
using speech. Although there are numerous speech synthesis systems for the most
widely spoken world languages, the offering for smaller languages is still
limited. We propose and compare three models for Macedonian, built using
parametric and deep learning techniques and trained on a newly recorded corpus.
We target low-resource edge deployment for Augmentative and Alternative
Communication and assistive technologies, such as communication boards and
screen readers. The listening test results show that parametric speech synthesis
performs as well as the more advanced deep learning models. Since it also
requires fewer resources and offers full speech rate and pitch control, it is
the preferred choice for building a Macedonian text-to-speech (TTS) system for
this application scenario.
