Please use this identifier to cite or link to this item:
http://hdl.handle.net/20.500.12188/27402
Title: | Representation Learning for Automatic Speech Recognition: A Review of Speech-to-Text Methods |
Authors: | Mitreska, Maja; Penkova, Blagica; Mishev, Kostadin; Simjanoska, Monika |
Keywords: | Speech-to-text, representation learning |
Issue Date: | Jul-2023 |
Publisher: | Ss Cyril and Methodius University in Skopje, Faculty of Computer Science and Engineering, Republic of North Macedonia |
Series/Report no.: | CIIT 2023 papers;27; |
Conference: | 20th International Conference on Informatics and Information Technologies - CIIT 2023 |
Abstract: | Representation learning has emerged as a promising approach to overcoming the limitations of discriminative representations from the raw speech signal. In this review, we cover a range of speech-to-text methods that employ representation learning, including deep neural networks (DNNs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer-based models. The advantages and limitations of each approach are described, as well as recent advances in pretraining techniques such as contrastive predictive coding (CPC) and masked language modelling (MLM). The reviewed papers are divided according to their novelty, their approaches and their type of representation learning models. |
URI: | http://hdl.handle.net/20.500.12188/27402 |
Appears in Collections: | Faculty of Computer Science and Engineering: Conference papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
CIIT2023_paper_27.pdf | | 9.19 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.