What is: Tacotron?

Source: Tacotron: Towards End-to-End Speech Synthesis
Year: 2017
Data Source: CC BY-SA - https://paperswithcode.com

Tacotron is an end-to-end generative text-to-speech model that takes a character sequence as input and outputs the corresponding spectrogram. Its backbone is a seq2seq model with attention. The figure depicts the model, which includes an encoder, an attention-based decoder, and a post-processing net. At a high level, the model takes characters as input and produces spectrogram frames, which are then converted to waveforms.
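
To make the data flow concrete, here is a minimal toy sketch of the three stages named above (encoder, attention-based decoder, post-processing net), written in plain Python. It is not the Tacotron implementation: the weights are random and untrained, each stage is reduced to a single linear map with a `tanh`, and every name and size (`VOCAB`, `HIDDEN`, `N_BINS`, `synthesize`, the fixed `n_frames` argument) is a hypothetical choice made for illustration. The waveform conversion step is omitted.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

VOCAB = "abcdefghijklmnopqrstuvwxyz "  # hypothetical character set
HIDDEN = 8   # toy hidden size (assumption)
N_BINS = 4   # toy number of spectrogram bins per frame (assumption)


def rand_matrix(rows, cols):
    """Random, untrained weights standing in for learned parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


EMBED = rand_matrix(len(VOCAB), HIDDEN)   # one embedding row per character
W_ENC = rand_matrix(HIDDEN, HIDDEN)
W_DEC = rand_matrix(HIDDEN, 2 * HIDDEN)   # consumes [state ; context]
W_OUT = rand_matrix(N_BINS, HIDDEN)
W_POST = rand_matrix(N_BINS, N_BINS)


def encode(text):
    """Encoder stand-in: one hidden vector per input character.
    Characters outside VOCAB raise ValueError in this toy version."""
    return [[math.tanh(x) for x in matvec(W_ENC, EMBED[VOCAB.index(ch)])]
            for ch in text]


def attend(query, memory):
    """Dot-product attention: weight each encoder vector by its match to the query."""
    weights = softmax([sum(q * m for q, m in zip(query, vec)) for vec in memory])
    context = [sum(w * vec[i] for w, vec in zip(weights, memory))
               for i in range(HIDDEN)]
    return context, weights


def decode(memory, n_frames):
    """Decoder stand-in: emit one spectrogram frame per step, attending each time."""
    state = [0.0] * HIDDEN
    frames, alignments = [], []
    for _ in range(n_frames):
        context, weights = attend(state, memory)
        state = [math.tanh(x) for x in matvec(W_DEC, state + context)]
        frames.append(matvec(W_OUT, state))
        alignments.append(weights)
    return frames, alignments


def postnet(frames):
    """Post-processing net stand-in: a single linear map applied to each frame."""
    return [matvec(W_POST, frame) for frame in frames]


def synthesize(text, n_frames=5):
    """Characters in, spectrogram frames out (frame count fixed for simplicity)."""
    memory = encode(text.lower())
    frames, alignments = decode(memory, n_frames)
    return postnet(frames), alignments


spectrogram, alignments = synthesize("hello world", n_frames=5)
print(len(spectrogram), "frames of", len(spectrogram[0]), "bins each")
```

Each row of `alignments` holds one attention weight per input character and sums to 1, which is how the decoder decides which part of the text to focus on while producing a given frame.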