DNN-TTS-ContVoc: Fully Text-To-Speech Demo using Continuous Vocoder

This repository contains a TTS system based on a continuous vocoder developed at the Speech Technology and Smart Interactions Laboratory (SmartLab), Budapest University of Technology and Economics.

Unlike traditional statistical parametric vocoders, the continuous model focuses on extracting continuous parameters:

Fundamental Frequency (F0)
Maximum Voiced Frequency (MVF)
Mel-Generalized Cepstrum (MGC)

Continuous DNN-TTS

Besides feed-forward neural networks, this demo also supports recurrent neural networks (RNNs):

Long short-term memory (LSTM)
Bidirectional LSTM (BLSTM)
Gated recurrent units (GRU)

Installation

You need to install the following:

compiled tools: bash tools/compile_tools.sh
Python dependencies: pip install -r requirements.txt
Festival: sudo apt-get install festival
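
Before moving on, it can help to confirm that the installed commands are actually on your PATH. The following is a minimal sketch, not part of the repository: the function name missing_tools is hypothetical, and the command names passed to it (festival, python, pip) are assumptions based on the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repository).
# Prints the name of every command in "$@" that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call with assumed command names:
#   missing_tools festival python pip
```

An empty result means every listed command was found. This only checks that the commands exist, not their versions, so it does not replace the requirements check in step 1 of the demo.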

Run demo

To run this demo, use the ./egs/slt_arctic/s1/run_full_voice.sh script, which performs the following steps:

1. Check for missing packages

The first step is to check that your system meets the continuous vocoder requirements.

./01_chk_rqmts.sh

2. Setting up

The second step is to run the setup script, which creates the directories and downloads the required training data files.

./02_setup.sh slt_arctic_full

OR

./02_setup.sh bdl_arctic_full

It also creates a global config file: conf/global_settings.cfg, where default settings are stored.

Directory structure:

.
├── misc
│   └── scripts
│       └── vocoder
│           ├── continuous        
│           └── ...
├── egs                     
│   └── slt_arctic
│       └── s1
│           ├── run_full_voice.sh
│           ├── conf
│           ├── scripts
│           └── experiments
│               └── slt_arctic_full                      
│                   ├── acoustic_model                  
│                   ├── duration_model                        
│                   └── test_synthesis
├── src
└── tools
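
To confirm that setup produced the experiment layout, the three model directories can be checked from the shell. This is a rough sketch under the assumption that the tree above reflects the state after setup; the function name check_experiment_dirs and its arguments are hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repository).
# Usage: check_experiment_dirs <recipe_dir> <voice>
# Returns 0 if acoustic_model, duration_model and test_synthesis all exist
# under <recipe_dir>/experiments/<voice>; otherwise reports what is missing
# on stderr and returns 1.
check_experiment_dirs() {
    local root="$1"
    local voice="$2"
    local sub
    local status=0
    for sub in acoustic_model duration_model test_synthesis; do
        if [ ! -d "${root}/experiments/${voice}/${sub}" ]; then
            printf 'missing: %s\n' "${root}/experiments/${voice}/${sub}" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call, assuming the recipe directory shown in the tree:
#   check_experiment_dirs egs/slt_arctic/s1 slt_arctic_full
```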

3. Prepare config files

At this point, we have to prepare two config files to train the DNN models:

Acoustic Model
Duration Model

To prepare config files:

./03_prepare_conf_files.sh conf/global_settings.cfg

4. Train duration model

To train the duration model:

./04_train_duration_model.sh conf/duration_slt_arctic_full.conf

OR

./04_train_duration_model.sh conf/duration_bdl_arctic_full.conf

5. Train acoustic model

To train the acoustic model:

./05_train_acoustic_model.sh conf/acoustic_slt_arctic_full.conf

OR

./05_train_acoustic_model.sh conf/acoustic_bdl_arctic_full.conf

6. Synthesize speech

To synthesize speech with the continuous vocoder:

./06_run_merlin.sh conf/test_dur_synth_slt_arctic_full.conf conf/test_synth_slt_arctic_full.conf

OR

./06_run_merlin.sh conf/test_dur_synth_bdl_arctic_full.conf conf/test_synth_bdl_arctic_full.conf

The synthesised waveforms will be stored in: /<experiment_dir>/test_synthesis/wav
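
The six steps above differ between the two voices only in the voice name that appears in the setup argument and in the config file names. The sketch below prints the full command sequence for a given voice so it can be reviewed before anything is run. The function name demo_commands is hypothetical; the script names and config paths are copied from the steps above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this repository).
# Usage: demo_commands <voice>   e.g. slt_arctic_full or bdl_arctic_full
# Prints the six demo commands for that voice, one per line, in order.
# Nothing is executed; this is a dry run only.
demo_commands() {
    local voice="$1"
    printf '%s\n' \
        "./01_chk_rqmts.sh" \
        "./02_setup.sh ${voice}" \
        "./03_prepare_conf_files.sh conf/global_settings.cfg" \
        "./04_train_duration_model.sh conf/duration_${voice}.conf" \
        "./05_train_acoustic_model.sh conf/acoustic_${voice}.conf" \
        "./06_run_merlin.sh conf/test_dur_synth_${voice}.conf conf/test_synth_${voice}.conf"
}

# Example call:
#   demo_commands bdl_arctic_full
```

Printing rather than executing is a deliberate choice here: training can take a long time, so it is safer to inspect the commands first and then run them one at a time from the recipe directory.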

Test TTS demo with continuous vocoder

If you want to test the trained version, the ./tts_demo.sh script will:

Create the txt directory in experiments/slt_arctic_full/test_synthesis.
Ask you to enter a new sentence.
Synthesise speech with the continuous vocoder.

Contact Us

Post your questions, suggestions, and discussions to GitHub Issues.

Speech Technology and Smart Interactions Laboratory

Citation

If you publish work based on Continuous TTS, please cite:

Al-Radhi M.S., Csapó T.G., Németh G. (2020). conTTS: Text-to-Speech Application using a Continuous Vocoder. In: Accepted to ISSP 2020. Audio Samples.
Al-Radhi M.S., Csapó T.G., Németh G. (2020). Continuous Noise Masking Based Vocoder for Statistical Parametric Speech Synthesis. IEICE Transactions on Information and Systems, E103.D(5), pp. 1099-1107. Audio Samples.
Al-Radhi M.S., Csapó T.G., Németh G. (2017). Deep Recurrent Neural Networks in Speech Synthesis Using a Continuous Vocoder. In: Karpov A., Potapova R., Mporas I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham, Hatfield, UK.
