End-to-end indonesian speech recognition with convolutional and gated recurrent units

Journal of physics. Conference series, 2020-06, Vol.1566 (1), p.12118 [Peer Reviewed Journal]

Published under licence by IOP Publishing Ltd; 2020. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.; ISSN: 1742-6588; EISSN: 1742-6596; DOI: 10.1088/1742-6596/1566/1/012118

Title:
End-to-end indonesian speech recognition with convolutional and gated recurrent units
Author: Adiwidjaja, Rifqi ; Ivan Fanany, M
Subjects: Automatic speech recognition ; Deep learning ; Indonesian language ; Machine learning ; Physics ; Speech recognition ; Spoken language ; Voice recognition
Is Part Of: Journal of physics. Conference series, 2020-06, Vol.1566 (1), p.12118
Description: Automatic speech recognition (ASR) has penetrated deeply into our lives. For well-resourced languages it can be considered solved, but that is not the case for under-resourced languages such as Bahasa. Although it is the 7th most spoken language in the world, research on speech recognition for Bahasa has been extremely limited, and the settings used remain impractical for real-world and industrial use. This research attempts to build a speech recognition model that is applicable to the real world and industry; specifically, one that supports sentence-level input of variable character length and end-to-end training. We built the model with a deep learning approach, using residual networks and a Bi-Directional Gated Recurrent Unit (Bi-GRU). To the best of our knowledge, this is the first Indonesian ASR model that can be trained end to end. Our model surpassed the baseline model on all metrics and is competitive with the current best result on the dataset, which used the visual modality, even though sound is a more difficult and noise-prone modality.
Publisher: Bristol: IOP Publishing
Language: English
Identifier: ISSN: 1742-6588
EISSN: 1742-6596
DOI: 10.1088/1742-6596/1566/1/012118
Source: IOP Publishing
Geneva Foundation Free Medical Journals at publisher websites
IOPscience (Open Access)
ProQuest Databases
