Representation Learning for Automatic Speech Recognition: A Review of Speech-to-Text Methods
Date Issued
2023-07
Author(s)
Mitreska, Maja
Penkova, Blagica
Simjanoska, Monika
Abstract
Representation learning has emerged as a promising
approach to overcoming the limitations of discriminative representations from the raw speech signal. In this review, we cover
a range of speech-to-text methods that employ representation
learning, including deep neural networks (DNNs), convolutional
neural networks (CNNs), recurrent neural networks (RNNs),
and transformer-based models. The advantages and limitations
of each approach are described, as well as recent advances
in pretraining techniques such as contrastive predictive coding
(CPC) and masked language modelling (MLM). The reviewed
papers are divided according to their novelty, their approaches
and their type of representation learning models.
Subjects
File(s)
Name
CIIT2023_paper_27.pdf
Size
8.97 MB
Format
Adobe PDF
Checksum
(MD5):bdc73d0badd18c196562474a464c73e6
