Namaste
I would like to point to a news item that makes the following claim:
Source: VALL-E (valle-demo.github.io) : Abstract. We introduce a language modelling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modelling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt. Experiment results show that VALL-E significantly outperforms the state-of-the-art zero-shot TTS system in terms of speech naturalness and speaker similarity. In addition, we find VALL-E could preserve the speaker's emotion and acoustic environment of the acoustic prompt in synthesis.
Request: Would Samskruth scholars please explore and advise on the impact of this advanced AI speech-reproduction technology on 'Spoken Samskruth' (Vyaavaharika Sambhashana Samskrutham) and 'Standard Samskrutham' (Shista Prayoga)? Its mass impact on language standardization should be made transparent. How will it affect the job prospects of classroom Samskruth language teachers?
In other words, will a language learner need a human teacher, or an accessible paid machine service? Is technology controlling language?
Regards
BVK Sastry