Text to posterior/acoustic features?


Eliezer Zimble

Mar 13, 2024, 5:52:23 PM
to kaldi-help
Hi,

I'm a student using Kaldi for a course project. The ultimate goal is to build a TTS system based on LPC coefficients. I have been working from the Tedlium recipe, adapted to use LPC coefficients, and have trained an encoder-decoder network (the Tedlium configuration, forward and reversed). I am hoping to add a front end that takes text as input and feeds it into the decoder, and then attach a vocoder to turn the LPC coefficients into audio.

I'm now working on a front end that takes text as input and outputs either raw features or pdf posteriors (ideally matching the output of the Tedlium configuration). Is this possible, and do you have suggestions for how to approach it?

I wasn't sure whether there might be a way to use the alignment information. I see some scripts that convert text to a phoneme sequence, but I wasn't sure how to go from there.
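
To make the alignment idea concrete, here is a rough sketch of what I have in mind, not a working front end. It assumes a word-to-phone lexicon and per-phone frame counts dumped from the alignments as text lines of the form "utt1 HH 5 ; AH 7" (for example, something like the output of ali-to-phones with --write-lengths=true, after mapping integer phone ids to symbols). The toy lexicon, the function names, and the "SPN" out-of-vocabulary symbol are all placeholders I made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real one would be read from the recipe's lexicon file.
LEXICON = {"hello": ["HH", "AH", "L", "OW"], "world": ["W", "ER", "L", "D"]}


def text_to_phones(text, lexicon, oov="SPN"):
    """Map whitespace-separated words to a flat phone sequence."""
    phones = []
    for word in text.lower().split():
        # Unknown words fall back to a single placeholder phone.
        phones.extend(lexicon.get(word, [oov]))
    return phones


def mean_durations(length_lines):
    """Average frames per phone from lines like 'utt1 HH 5 ; AH 7 ; L 4'."""
    totals, counts = defaultdict(int), defaultdict(int)
    for line in length_lines:
        _utt, rest = line.split(maxsplit=1)
        for seg in rest.split(";"):
            if not seg.strip():
                continue  # tolerate a trailing ';'
            phone, frames = seg.split()
            totals[phone] += int(frames)
            counts[phone] += 1
    # Round to whole frames, with at least one frame per phone.
    return {p: max(1, round(totals[p] / counts[p])) for p in totals}


def expand_to_frames(phones, durations, default=5):
    """Repeat each phone for its average duration to get a frame-level sequence."""
    frames = []
    for p in phones:
        frames.extend([p] * durations.get(p, default))
    return frames
```

The frame-level phone sequence from expand_to_frames is what I would then try to map to features or posteriors, one frame at a time. Fixed average durations are obviously crude; a learned duration model would be the next step if this direction makes sense.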

Any insight or advice in general would be greatly appreciated.

Thanks so much,
Eliezer

