How many words are there in your language? How many of them do you know? Can you find all those words in a dictionary? You rarely worry about these things while communicating with the people around you in daily life. As speech technology researchers, we had to probe into these questions while trying to make computers recognize human speech in our native language, Malayalam. A speech recognizer is basically a computer application that converts spoken language into text. The problem before us was how many words we would have to teach computers so that they could recognize Malayalam speech with close to human performance.
A system that learns the words in a language, and their chances of being spoken in the context of some other word sequence, is called a language model. The machine also needs a list of all the words that we want it to recognize. This list is a dictionary that describes how each word is to be pronounced, as a sequence of phonemes. Technically, that dictionary is called a phonetic lexicon. The popular English phonetic lexicon, CMUDict, prepared by Carnegie Mellon University, contains fewer than 1.5 lakh words. That is far fewer than the number of unique words we found in Malayalam Wikipedia articles. So how large would the phonetic lexicon for Malayalam be? In theory, there is no limit to the inflections and agglutinations that Malayalam words can undergo. This leads to the possibility of an infinite vocabulary. But in practice, it is not possible to have an infinitely long phonetic lexicon. So what could be an alternative?
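To make these two components concrete, here is a minimal sketch of a bigram language model and a tiny phonetic lexicon. The toy English corpus, the ARPAbet-style phoneme labels, and the function names `train_bigram` and `prob` are all assumptions made for illustration; a real recognizer would use far larger data and smoothing.

```python
from collections import Counter, defaultdict

# Toy corpus (an assumption for illustration, not real training data).
corpus = [
    "i like tea",
    "i like coffee",
    "you like tea",
]

# Illustrative phonetic lexicon: word -> sequence of phoneme labels.
lexicon = {
    "i": ["AY"],
    "you": ["Y", "UW"],
    "like": ["L", "AY", "K"],
    "tea": ["T", "IY"],
    "coffee": ["K", "AO", "F", "IY"],
}

def train_bigram(sentences):
    """Count how often each word follows the previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split()  # <s> marks the sentence start
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts

def prob(counts, prev, cur):
    """Unsmoothed estimate of P(cur | prev) from the counts."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0

bigrams = train_bigram(corpus)
print(prob(bigrams, "like", "tea"))     # chance of "tea" after "like"
print(prob(bigrams, "like", "coffee"))  # chance of "coffee" after "like"
```

Every word the model can predict must also appear in `lexicon`, which is why the size of the vocabulary matters so much for a language like Malayalam.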
If we can incorporate the linguistic trick of word fusion into computers, it would equip them with a better language model. It would enable them to create a phonetic lexicon as and when required. This is what we are currently working on: a morphology-aware speech recognition system. This research direction is relevant for other morphologically rich Indian languages as well. It will eventually enable you to get automatic transcription while watching your favourite videos, type text using your voice, and even talk with your digital assistant, all in your native language. We hope to give our computers and smartphones a bit more of a human touch by making them recognize our native spoken languages.
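The idea of building lexicon entries on demand can be sketched as follows. This is only an illustration under stated assumptions, not the system described above: the romanized morphemes, the simplified phoneme labels, and the helper names `segment` and `pronounce` are hypothetical, and the sound changes that occur at morpheme boundaries are ignored.

```python
# Hypothetical morpheme-level lexicon: morpheme -> simplified phoneme labels.
morpheme_lexicon = {
    "kutti": ["k", "u", "tt", "i"],  # assumed stem
    "kal": ["k", "a", "L"],          # assumed plural suffix
    "ude": ["u", "T", "e"],          # assumed genitive suffix
}

def segment(word, units):
    """Split word into known morphemes, longest prefix first, with backtracking.

    Returns a list of morphemes, or None if no full segmentation exists.
    """
    if not word:
        return []
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in units:
            rest = segment(word[end:], units)
            if rest is not None:
                return [head] + rest
    return None

def pronounce(word, units):
    """Build a pronunciation by joining the phonemes of each morpheme."""
    parts = segment(word, units)
    if parts is None:
        return None  # the word cannot be built from known morphemes
    return [phoneme for morpheme in parts for phoneme in units[morpheme]]

print(segment("kuttikalude", morpheme_lexicon))
print(pronounce("kuttikalude", morpheme_lexicon))
```

With three stored morphemes, the sketch can already pronounce `kutti`, `kuttikal`, and `kuttikalude` without listing any of them as whole words, which is the appeal of working below the word level.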
Manohar Performing Arts of Canada Inc. (incorporated in 1993) is a Winnipeg-based registered non-profit dance theatre company that creates classical Natyasastra-based dance and drama in a Canadian context, integrating bharatanatyam and kathak into the Canadian dance landscape. Our mandate is to use the languages and motifs of classical Indian dance to create and present traditional and contemporary choreographies that speak in an accessible, authentic Canadian voice. Our mission is to support dancers and diasporic knowledge-keepers in their narrative, expressional, and movement research; to use music, spoken word, myth, and classical movement to tell stories; and to create a space for these universal stories and experiences within the Canadian cultural mosaic. Our vision is to position Manohar as a home for unique choreography, intriguing narratives, and high-quality production values, both live and online. We foster a safe creative space for dance collaboration and serve our artistic community as a resource on many aspects of South Asian dance, music, heritage, and culture.
Was Rammanohar Lohia a Hindi chauvinist after all? Sudhanva Deshpande's response to Yogendra Yadav goes in search of Lohia's writings on language and finds them supporting Hindi parochialism. Yogendra Yadav responds that opposition to English was the core of Lohia's stand on language and he asked questions which may still be relevant.