LuckyVoice

Inspired by the 2019 host competition, this repository aims to build a highly expressive speech synthesis system from Zou Yun's voice. The pinyin of 邹韵 is Zōu Yùn, a near-homophone of 走运 (zǒu yùn, "to have good luck"), hence the name LuckyVoice.

HuggingFace🤗 Demo-Baker | HuggingFace🤗 Demo-Lucky | WIP

1. Data Collection and Processing

1.1 Collect related videos of Zou Yun

1. Use the 'you-get' tool to download the videos in batches; the video addresses are listed in dataprocessing/collectvideos/main.py.
2. Use a format converter to convert the videos to WAV files.
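
The two steps above can be sketched as a pair of command builders. This is only an illustration, not the code in dataprocessing/collectvideos/main.py: the URL, the directory names, the choice of ffmpeg as the format converter, and the 22050 Hz mono output are all assumptions made for the example.

```python
import shlex
from pathlib import Path


def download_command(url, out_dir):
    """Build a you-get call that saves one video into out_dir."""
    return ["you-get", "-o", str(out_dir), url]


def to_wav_command(video_path, wav_dir, sample_rate=22050):
    """Build an ffmpeg call that extracts mono audio at sample_rate Hz."""
    wav_path = Path(wav_dir) / (Path(video_path).stem + ".wav")
    return [
        "ffmpeg", "-y", "-i", str(video_path),
        "-vn",                    # drop the video stream
        "-ac", "1",               # mono (assumed)
        "-ar", str(sample_rate),  # sample rate (assumed)
        str(wav_path),
    ]


# Hypothetical URL and directory names, for illustration only.
urls = ["https://example.com/video1"]
commands = [download_command(u, "videos") for u in urls]
commands.append(to_wav_command("videos/video1.mp4", "wavs"))

for cmd in commands:
    # Printing keeps the sketch side-effect free; to execute a command,
    # pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the commands as lists rather than shell strings avoids quoting problems when file names contain spaces or non-ASCII characters.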

1.2 Split the audio using voice activity detection (VAD)

python dataprocessing/vad/main.py --pth [path to the downloaded video] --savepth [output path for the split audio]
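
To illustrate the idea behind VAD-based splitting, here is a minimal energy-threshold sketch in pure Python. It is not the method used in dataprocessing/vad/main.py, which is not described here. The frame length, the threshold, and the minimum segment length are hypothetical values chosen for the synthetic example.

```python
import math


def frame_rms(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame."""
    rms = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return rms


def energy_vad(samples, frame_len=160, threshold=0.02, min_speech_frames=3):
    """Return (start_sample, end_sample) spans whose frame RMS reaches threshold."""
    rms = frame_rms(samples, frame_len)
    segments = []
    start = None  # index of the first frame of the current speech run
    for i, energy in enumerate(rms):
        if energy >= threshold and start is None:
            start = i
        elif energy < threshold and start is not None:
            if i - start >= min_speech_frames:
                segments.append((start * frame_len, i * frame_len))
            start = None
    # Close a run that is still open at the end of the signal.
    if start is not None and len(rms) - start >= min_speech_frames:
        segments.append((start * frame_len, len(rms) * frame_len))
    return segments


# Synthetic input at 16 kHz: 0.1 s silence, 0.2 s of a 200 Hz tone, 0.1 s silence.
sr = 16000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(3200)]
segments = energy_vad(silence + tone + silence)
print(segments)
```

Each returned span can then be written out as its own audio file. Real recordings usually need a more robust detector than a fixed energy threshold, especially with background music or noise.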

1.3 Reduce noise using a speech enhancement model

pre-trained model

sudo docker build -t se .
sudo docker run -it --rm -v /home/admin/yuanxin:/se se
python dataprocessing/se/main.py

1.4 Classify audio using a voiceprint recognition model
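
One common way to do this kind of filtering is to compare a speaker embedding of each clip against a reference embedding of the target speaker and keep the clips that are similar enough. The sketch below assumes that approach; the voiceprint model, the embedding values, the file names, and the 0.7 threshold are all hypothetical and are not taken from this repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_target_clips(clip_embeddings, reference, threshold=0.7):
    """Keep the clips whose embedding is close to the reference speaker."""
    return [
        name
        for name, emb in clip_embeddings.items()
        if cosine_similarity(emb, reference) >= threshold
    ]


# Toy 3-dimensional embeddings; a real voiceprint model would produce
# vectors with hundreds of dimensions.
reference = [1.0, 0.0, 0.0]
clips = {
    "clip_001.wav": [0.9, 0.1, 0.0],
    "clip_002.wav": [0.0, 1.0, 0.0],
    "clip_003.wav": [0.8, 0.0, 0.6],
}
kept = select_target_clips(clips, reference)
print(kept)
```

The threshold trades recall for purity: a higher value discards more clips of the target speaker but lets fewer clips of other speakers through.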

1.5 Process text with speech recognition and a speech synthesis front-end

speech synthesis front-end

2. Baseline Model

2.1 VITS model with prosodic representation

pretrained_baker.pth

python model/vits/main.py --text ['你好'] --out [path to save the output file]

2.2 DiffSpeech model with prosodic representation

3. EmpathyTTS

Repository contents (latest commit 2655b8a by xin.yuan, "Initial commit", Jan 17, 2023; 1 commit in history)

Name	Last commit message	Last commit date
dataprocessing	Initial commit	Jan 17, 2023
model/vits	Initial commit	Jan 17, 2023
README.md	Initial commit	Jan 17, 2023
