Building Text-to-Speech or Speech-to-Text Applications Using Common Voice Tamil Data

Shrinivasan T

Jul 14, 2021, 2:08:25 AM
to indicnlp
Hello all,

At Mozilla Common Voice, we are creating a dataset of text sentences paired with the corresponding audio recordings.

The dataset can be downloaded here:
https://commonvoice.mozilla.org/ta/datasets

It currently has 14 hours of data for Tamil.

Is this enough to train a TTS or STT model for Tamil?

Please try building models with the current data and share what you find.
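
As a starting point for exploring the data, here is a minimal sketch for checking how many hours of audio a download contains. It assumes a tab-separated file with one row per clip and a duration column in milliseconds. The column name `duration[ms]` and the sample rows are assumptions for illustration, so please adjust them to whatever the release you download actually contains.

```python
import csv
import io


def total_hours(tsv_file, duration_column="duration[ms]"):
    """Sum a per-clip duration column (in milliseconds) and return hours.

    tsv_file is any open text file or file-like object with a header row.
    The default column name is an assumption, not a confirmed schema.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    total_ms = sum(int(row[duration_column]) for row in reader)
    return total_ms / (1000 * 60 * 60)


# Tiny made-up sample standing in for a real durations file.
sample = io.StringIO(
    "clip\tduration[ms]\n"
    "clip_0001.mp3\t3600000\n"
    "clip_0002.mp3\t1800000\n"
)
print(total_hours(sample))  # 1.5
```

For a real file, something like `with open(path, encoding="utf-8", newline="") as f: print(total_hours(f))` should work, where `path` points to the durations file in the extracted dataset.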

There is an existing speech-to-text demo built with this data here:
https://huggingface.co/Rajaram1996/wav2vec2-large-xlsr-53-tamil

Can we get a text-to-speech version along the same lines?


I have created an issue here to track progress:

https://github.com/KaniyamFoundation/ProjectIdeas/issues/164



Shrini


