Regarding the MongoDB Related Questions

mister.s...@berkeley.edu

Oct 26, 2016, 12:55:28 AM
to Lucida Users
Thank you for taking the time to read this message amid your busy schedule. Our team is currently working on using LLC training data to improve the accuracy of the speech-to-text component. We found your research paper online and are very interested in the modifications you made to OpenEphyra. We are wondering whether you and your team have used MongoDB to store local data, and where we should put our vocal training data in Lucida to improve speech recognition performance. If so, do we need to obtain authentication and follow a particular format for our MongoDB files? In addition, we are trying to add more description to the output answers. Could you give us some suggestions on how to change the answer selection process to personalize the answer output? We appreciate your time in reading this message as well as any assistance you can offer.

Yunsheng Bai

Oct 27, 2016, 8:44:40 PM
to Lucida Users
Hi,

First of all, sorry about the delay. I have been busy with several exams and research group meetings for the past few weeks.

Second, are you referring to the Sirius paper? I have personally investigated the reasons for OpenEphyra's low accuracy, but that work is only a short report for a university course.

Third, we rely on Kaldi for ASR: we chose DNN models trained on Fisher, but we are aware that there are other options. To personalize the ASR results, you could look into how online decoding is done in the current implementation and figure out how personalized models could be introduced. I suggest this as a starting point, but if you are using an implementation other than Kaldi, please let me know, either by email or by posting below.

Finally, we are using MongoDB to store data for OpenEphyra (knowledge base), the command center (user information), and image matching. If you choose to integrate Lucida into your work, you only need to make sure your formats do not conflict with the existing ones listed above. Otherwise, the formats are up to you. Regarding MongoDB, I find it very easy to work with due to its "NoSQL" nature. However, whether it is a good fit for your problem requires further thought.
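
To make the "do not conflict with the existing ones" point concrete, here is a minimal sketch in Python. Everything named in it is an assumption for illustration: the collection name `asr_training_samples`, the document fields, and the example list of existing names are hypothetical, since the actual collection names Lucida uses are not given in this thread.

```python
from datetime import datetime, timezone


def check_collection_name(name, owned_by_others):
    """Return `name` unless it clashes (case-insensitively) with a collection
    that another component already owns; raise ValueError on a clash."""
    taken = {n.lower() for n in owned_by_others}
    if name.lower() in taken:
        raise ValueError(f"collection {name!r} is already in use")
    return name


def make_sample_doc(user, audio_path, transcript):
    """Build one training-sample document. The fields are illustrative only;
    MongoDB does not impose a schema, so the layout is your choice."""
    return {
        "user": user,
        "audio_path": audio_path,
        "transcript": transcript,
        "created_at": datetime.now(timezone.utc),
    }


def store_sample(db, doc, name, owned_by_others):
    """Insert `doc` into collection `name` of `db`, a pymongo Database object,
    after checking the name against collections owned by other components."""
    checked = check_collection_name(name, owned_by_others)
    return db[checked].insert_one(doc).inserted_id
```

With pymongo installed, `owned_by_others` could be a snapshot of `db.list_collection_names()` taken before you create any collections of your own, and `db` could come from `MongoClient(...)["your_database"]`. The connection string and database name depend on your own setup.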

Thanks!

Yunsheng Bai