
End-to-End Deep Convolutional Recurrent Models for Noise Robust Waveform Speech Enhancement

Sensors (Basel, Switzerland), 2022-10, Vol.22 (20), p.7782 [Peer Reviewed Journal]

COPYRIGHT 2022 MDPI AG. © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. ISSN: 1424-8220; EISSN: 1424-8220; DOI: 10.3390/s22207782

Full text available
  • Title:
    End-to-End Deep Convolutional Recurrent Models for Noise Robust Waveform Speech Enhancement
  • Author: Ullah, Rizwan ; Wuttisittikulkij, Lunchakorn ; Chaudhary, Sushank ; Parnianifard, Amir ; Shah, Shashi ; Ibrar, Muhammad ; Wahab, Fazal-E
  • Subjects: Analysis ; Architecture ; Background noise ; Convolutional Encoder-Decoder ; Convolutional Recurrent Network ; Datasets ; Deep learning ; E2E speech processing ; Experiments ; Intelligibility ; Mapping ; Neural networks ; Recurrent neural networks ; Robustness ; Speech ; Speech processing ; Speech quality ; Waveforms
  • Is Part Of: Sensors (Basel, Switzerland), 2022-10, Vol.22 (20), p.7782
  • Description: Because of their simple design structure, end-to-end deep learning (E2E-DL) models have gained considerable attention for speech enhancement. A number of DL models have achieved excellent results in eliminating background noise and enhancing both the quality and the intelligibility of noisy speech. Designing resource-efficient, compact models for real-time processing remains a key challenge. To improve the performance of E2E models, the sequential and local characteristics of the speech signal should be taken into account efficiently during modeling. In this paper, we present resource-efficient, compact neural models for end-to-end noise-robust waveform-based speech enhancement. Combining a Convolutional Encoder-Decoder (CED) and Recurrent Neural Networks (RNNs) in the Convolutional Recurrent Network (CRN) framework, we target different speech enhancement systems. The proposed models are trained and tested with different noise types and speakers. Experiments on the LibriSpeech and DEMAND datasets show that the proposed models improve quality and intelligibility with fewer trainable parameters, notably reduced model complexity, and shorter inference time than existing recurrent and convolutional models. Quality and intelligibility are improved by 31.61% and 17.18%, respectively, over the noisy speech. We further performed a cross-corpus analysis to demonstrate the generalization of the proposed E2E SE models across different speech datasets.
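  The abstract describes a CRN that passes the waveform through a convolutional encoder, a recurrent bottleneck, and a decoder back to samples. As a rough illustration of that kind of pipeline only, here is a toy NumPy sketch; the layer sizes, random weights, scalar-feature GRU, and nearest-neighbor upsampling decoder are illustrative assumptions, not the architecture from the paper:

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  def conv1d(x, w, stride=1):
      """Valid strided 1-D cross-correlation: the convolutional encoder."""
      k = len(w)
      n = (len(x) - k) // stride + 1
      return np.array([np.dot(x[i * stride:i * stride + k], w) for i in range(n)])

  def gru(xs, hidden):
      """Single-layer GRU over a scalar feature sequence (the recurrent bottleneck)."""
      Wz, Wr, Wh = (rng.normal(scale=0.1, size=hidden) for _ in range(3))
      Uz, Ur, Uh = (rng.normal(scale=0.1, size=(hidden, hidden)) for _ in range(3))
      h, states = np.zeros(hidden), []
      for x in xs:
          z = sigmoid(Wz * x + Uz @ h)            # update gate
          r = sigmoid(Wr * x + Ur @ h)            # reset gate
          h_new = np.tanh(Wh * x + Uh @ (r * h))  # candidate state
          h = (1 - z) * h + z * h_new
          states.append(h.copy())
      return np.array(states)                     # shape (T', hidden)

  def decode(states, stride):
      """Project hidden states back to samples and upsample to the waveform rate."""
      w_out = rng.normal(scale=0.1, size=states.shape[1])
      return np.repeat(states @ w_out, stride)    # nearest-neighbor upsampling

  # Toy end-to-end pass over a 64-sample noisy waveform.
  noisy = rng.normal(size=64)
  feats = conv1d(noisy, rng.normal(scale=0.5, size=4), stride=2)  # 31 frames
  enhanced = decode(gru(feats, hidden=8), stride=2)               # 62 samples
  ```

  In a trained model the encoder/decoder would use many learned channels and the whole stack would be optimized against the clean waveform; the sketch only shows how the strided encoder shortens the sequence the RNN must model and the decoder restores the sample rate.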
  • Publisher: Basel: MDPI AG
  • Language: English
  • Identifier: ISSN: 1424-8220
    EISSN: 1424-8220
    DOI: 10.3390/s22207782
  • Source: Freely Accessible Journals
    PubMed Central
    ROAD: Directory of Open Access Scholarly Resources
    ProQuest Central
    DOAJ Directory of Open Access Journals
