
Results 1 - 20 of 4,215 for All Library Resources

Refined by: Journal Title: Arxiv
1
Differentially Private Speaker Anonymization
Material Type: Article

arXiv.org, 2023-01 [Peer Reviewed Journal]

2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2202.11823

Full text available

2
Predicting Affective Vocal Bursts with Finetuned wav2vec 2.0
Material Type: Article

arXiv.org, 2022-09

2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2209.13146

Full text available

3
A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality
Material Type: Article

arXiv.org, 2022-04

2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2204.02249

Full text available

4
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings
Material Type: Article

arXiv.org, 2021-10

2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2110.03584

Full text available

5
E2E-based Multi-task Learning Approach to Joint Speech and Accent Recognition
Material Type: Article

arXiv.org, 2021-06

2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2106.08211

Full text available

6
Pay Attention to Hard Trials
Material Type: Article

arXiv.org, 2022-09

2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2209.04687

Full text available

7
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-shot Multi-speaker Text-to-Speech
Material Type: Article

arXiv.org, 2022-02

2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2202.10712

Full text available

8
ASR-Free Pronunciation Assessment
Material Type: Article

arXiv.org, 2020-05

2020. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2005.11902

Full text available

9
Improving the quality of neural TTS using long-form content and multi-speaker multi-style modeling
Material Type: Article

arXiv.org, 2023-06

2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2212.10075

Full text available

10
Towards multi-task learning of speech and speaker recognition
Material Type: Article

arXiv.org, 2023-05

2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2302.12773

Full text available

11
Joint Speech Translation and Named Entity Recognition
Material Type: Article

arXiv.org, 2023-05

2023. This work is published under http://creativecommons.org/licenses/by-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2210.11987

Full text available

12
Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms
Material Type: Article

arXiv.org, 2023-05

2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2305.10940

Full text available

13
Multi-query multi-head attention pooling and Inter-topK penalty for speaker verification
Material Type: Article

arXiv.org, 2021-10

2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2110.05042

Full text available

14
UFO2: A unified pre-training framework for online and offline speech recognition
Material Type: Article

arXiv.org, 2023-04

2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2210.14515

Full text available

15
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems?
Material Type: Article

arXiv.org, 2023-03

2023. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2211.13868

Full text available

16
Streaming Multi-Talker ASR with Token-Level Serialized Output Training
Material Type: Article

arXiv.org, 2022-07

2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2202.00842

Full text available

17
TTS-Guided Training for Accent Conversion Without Parallel Data
Material Type: Article

arXiv.org, 2022-12

2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2212.10204

Full text available

18
The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results
Material Type: Article

arXiv.org, 2022-11

2022. This work is published under http://creativecommons.org/licenses/by-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2211.01585

Full text available

19
Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem
Material Type: Article

arXiv.org, 2022-10

2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2210.15982

Full text available

20
Expressive Text-to-Speech using Style Tag
Material Type: Article

arXiv.org, 2022-10

2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2104.00436

Full text available
