yuan1615/LuckyVoice

Latest commit: 2655b8a by xin.yuan, Jan 17, 2023

LuckyVoice

Inspired by the 2019 host competition, this repository uses Zou Yun's voice to build a highly expressive speech synthesis system. The pinyin of 邹韵 is Zōu yùn, which sounds like the Chinese expression for good luck.

HuggingFace🤗 Demo-Baker | HuggingFace🤗 Demo-Lucky | WIP

1. Data Collection and Processing

1.1 Collect related videos of Zou Yun

1. Use the 'you-get' tool to download the videos in batches. The video URLs are listed in dataprocessing/collectvideos/main.py.
2. Use a format converter to convert the videos to WAV files.
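
The two steps above can be sketched as command builders. This is a minimal sketch, not the repository's script: it assumes `ffmpeg` as the format converter (the README does not name one), a 22050 Hz mono target (also an assumption), and hypothetical helper names `download_cmd`, `to_wav_cmd`, and `run_all`.

```python
import subprocess


def download_cmd(url, out_dir):
    """Build a you-get command that saves one video into out_dir."""
    return ["you-get", "-o", out_dir, url]


def to_wav_cmd(video_path, wav_path, sample_rate=22050):
    """Build an ffmpeg command that extracts mono WAV audio from a video.

    ffmpeg and the 22050 Hz default are assumptions, not taken from the README.
    """
    return [
        "ffmpeg", "-y",           # overwrite the output file if it exists
        "-i", video_path,         # input video
        "-vn",                    # drop the video stream
        "-ac", "1",               # mix down to one channel
        "-ar", str(sample_rate),  # resample to the target rate
        wav_path,
    ]


def run_all(commands):
    """Run each command in order and stop on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Building the commands as lists keeps paths with spaces safe and makes the batch easy to inspect before anything is executed with `run_all`.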

1.2 Split the audio using voice activity detection (VAD)

python dataprocessing/vad/main.py --pth [path to the downloaded video] --savepth [directory to save the split audio]
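
The README does not say which VAD method dataprocessing/vad/main.py uses. As an illustration of the general idea only, the sketch below splits a signal with a simple energy threshold. The function names, frame length, threshold, and minimum run length are all assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Return the RMS energy of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return energies


def split_on_silence(samples, frame_len=400, threshold=0.02, min_frames=2):
    """Return (start, end) sample indices for each run of voiced frames.

    A frame counts as voiced when its RMS energy reaches the threshold.
    Runs shorter than min_frames are discarded as clicks or noise.
    """
    segments = []
    run_start = None
    energies = frame_energies(samples, frame_len)
    # The trailing 0.0 is a sentinel that closes a run reaching the end.
    for i, energy in enumerate(energies + [0.0]):
        voiced = energy >= threshold
        if voiced and run_start is None:
            run_start = i
        elif not voiced and run_start is not None:
            if i - run_start >= min_frames:
                segments.append((run_start * frame_len, i * frame_len))
            run_start = None
    return segments


# Synthetic signal: silence, a constant-amplitude burst, silence.
signal = [0.0] * 800 + [0.5] * 1200 + [0.0] * 800
print(split_on_silence(signal))  # [(800, 2000)]
```

Each returned pair can be used to slice the waveform and write one file per segment.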

1.3 Reduce noise using a speech enhancement model

pre-trained model

sudo docker build -t se .
sudo docker run -it --rm -v /home/admin/yuanxin:/se se
python dataprocessing/se/main.py

1.4 Classify audio using a voiceprint recognition model.
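
The README gives no details for this step. One common way to filter clips by speaker, shown here purely as an assumption, is to compare each clip's speaker embedding with a reference embedding of the target voice using cosine similarity. The embedding model is out of scope; `is_target_speaker` and the 0.7 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_target_speaker(embedding, reference, threshold=0.7):
    """Keep a clip when its embedding is close enough to the reference.

    The threshold is an illustrative value and would need tuning on real data.
    """
    return cosine_similarity(embedding, reference) >= threshold
```

Clips that fail the check would be dropped from the training set.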

1.5 Process text with speech recognition and a speech synthesis front-end

speech synthesis front-end

2. Baseline Model

2.1 VITS model with prosodic representation

pretrained_baker.pth

python model/vits/main.py --text ['你好'] --out [path to save the output file]

2.2 DiffSpeech model with prosodic representation

3. EmpathyTTS
