2024.09.05_FireRedTTS.md

FireRedTTS

Basic Information
  • Title: "FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications"
  • Authors:
    • 01 Hao-Han Guo
    • 02 Kun Liu
    • 03 Fei-Yu Shen
    • 04 Yi-Chen Wu
    • 05 Feng-Long Xie
    • 06 Kun Xie
    • 07 Kai-Tuo Xu
  • Links:
  • Files:
    • ArXiv
    • [Publication] #TODO

Abstract

This work proposes FireRedTTS, a foundation text-to-speech framework, to meet the growing demands for personalized and diverse generative speech applications. The framework comprises three parts: data processing, foundation system, and downstream applications. First, we comprehensively present our data processing pipeline, which transforms massive raw audio into a large-scale high-quality TTS dataset with rich annotations and a wide coverage of content, speaking style, and timbre. Then, we propose a language-model-based foundation TTS system. The speech signal is compressed into discrete semantic tokens via a semantic-aware speech tokenizer, and can be generated by a language model from the prompt text and audio. Then, a two-stage waveform generator is proposed to decode them to the high-fidelity waveform. We present two applications of this system: voice cloning for dubbing and human-like speech generation for chatbots. The experimental results demonstrate the solid in-context learning capability of FireRedTTS, which can stably synthesize high-quality speech consistent with the prompt text and audio. For dubbing, FireRedTTS can clone target voices in a zero-shot way for the UGC scenario and adapt to studio-level expressive voice characters in the PUGC scenario via few-shot fine-tuning with 1-hour recording. Moreover, FireRedTTS achieves controllable human-like speech generation in a casual style with paralinguistic behaviors and emotions via instruction tuning, to better serve spoken chatbots.
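The data flow described in the abstract (speech tokenizer, then language model, then two-stage waveform generator) can be sketched as a toy pipeline. This is only an illustration of how the three components connect, not the authors' implementation: every function name, the codebook size, and the contents of the two decoder stages are hypothetical placeholders, since the abstract does not specify them.

```python
from typing import List

VOCAB_SIZE = 16  # hypothetical codebook size, chosen only for this toy


def speech_tokenizer(audio: List[float], frame: int = 4) -> List[int]:
    """Stand-in tokenizer: quantize each frame's mean amplitude to a token id."""
    tokens = []
    for start in range(0, len(audio), frame):
        chunk = audio[start:start + frame]
        mean = sum(chunk) / len(chunk)  # assumes samples lie in [-1, 1]
        token = int((mean + 1.0) / 2.0 * VOCAB_SIZE)
        tokens.append(min(VOCAB_SIZE - 1, max(0, token)))
    return tokens


def language_model(prompt_text: str, prompt_tokens: List[int],
                   target_text: str) -> List[int]:
    """Stand-in LM: deterministically emits one token per target character.

    A real system would generate tokens autoregressively, conditioned on
    the prompt text and prompt audio tokens; this toy only mimics the
    input/output types.
    """
    seed = sum(prompt_tokens) + len(prompt_text)
    return [(seed + ord(ch)) % VOCAB_SIZE for ch in target_text]


def waveform_generator(tokens: List[int], hop: int = 4) -> List[float]:
    """Stand-in two-stage decoder: tokens -> features -> samples."""
    # Stage 1 (assumed): map each token to a coarse feature in [-1, 1).
    features = [2.0 * t / VOCAB_SIZE - 1.0 for t in tokens]
    # Stage 2 (assumed): upsample the features into a sample sequence.
    return [f for f in features for _ in range(hop)]


def synthesize(prompt_text: str, prompt_audio: List[float],
               target_text: str) -> List[float]:
    """Chain the three placeholder components in the order the abstract gives."""
    prompt_tokens = speech_tokenizer(prompt_audio)
    semantic_tokens = language_model(prompt_text, prompt_tokens, target_text)
    return waveform_generator(semantic_tokens)


if __name__ == "__main__":
    wave = synthesize("hi", [0.0] * 8, "abc")
    print(len(wave))
```

The point of the sketch is the interface between stages: the language model never sees raw audio, only discrete tokens, and the waveform generator never sees text, only the tokens the language model produced.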

1·Introduction

2·Related Works

3·Methodology

4·Experiments

5·Results

6·Conclusions