Challenges with a lot of words in VocalKitTest

Irving Ruan

Aug 22, 2010, 7:50:37 PM

to VocalKit

Hello VK community,

I came across VocalKit while researching speech recognition
libraries for an iPhone app I am building. It's a neat wrapper
for PocketSphinx, and should be cool to use.

While perusing the archive of this group, I realized that the
biggest issue most people face is dealing with large dictionary
files and many words to recognize. Namely, the wsj model has 5k
words, and has a high error rate because it has so many words.
However, changing it to include only the words "YES NO GOODBYE" lets
the app recognize them >95% of the time, since it is only handling
three words. While this is really cool, I'm wondering what sort of
solutions we can implement to overcome this obvious road bump. Maybe
I'm just confused about how PocketSphinx works internally, but why
would the default 5k language model and dictionary file have very
little, if any, accuracy with speech-to-text recognition?
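
For illustration, the vocabulary restriction described above could be
sketched roughly as follows. This is only a sketch, not part of VocalKit
or PocketSphinx: the function name filter_dictionary and the sample
pronunciations are hypothetical, and it assumes a CMU-style dictionary
in which each line is a word followed by its phonemes, with alternate
pronunciations marked like WORD(2). The language model would need to be
restricted to the same words separately, which is not shown here.

```python
import re


def filter_dictionary(dict_lines, keep_words):
    """Return only the dictionary entries whose headword is in keep_words.

    Hypothetical helper: assumes one entry per line, formatted as
    "WORD PHONE PHONE ...", with alternates written as "WORD(2) ...".
    """
    keep = {w.upper() for w in keep_words}
    kept = []
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Strip an alternate-pronunciation suffix such as "(2)".
        headword = re.sub(r"\(\d+\)$", "", parts[0]).upper()
        if headword in keep:
            kept.append(line.rstrip("\n"))
    return kept


# Illustrative entries only; real pronunciations come from the
# dictionary file shipped with the acoustic model.
sample = [
    "GOODBYE G UH D B AY",
    "GOODBYE(2) G UH B AY",
    "HELLO HH AH L OW",
    "NO N OW",
    "YES Y EH S",
]

small = filter_dictionary(sample, ["YES", "NO", "GOODBYE"])
for entry in small:
    print(entry)
```

Keeping the alternate pronunciations of the retained words means each
of the three words can still be matched in more than one spoken form.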

VocalKit is a great wrapper for PocketSphinx, and I foresee it
becoming widely used if a lot of development is done on it. Thanks in
advance!

-I.

Brian King

Sep 20, 2010, 9:08:09 AM

to voca...@googlegroups.com

Hey Irving,

I'm not sure I have a good answer for why large vocabularies are not as accurate. Voice recognition is a very complex task, and limiting the scope of the problem by reducing the vocabulary works well. If you are interested in improving the accuracy of PocketSphinx, you would want to work with its developers directly. The scope of this project is largely to help iPhone developers get rolling with PocketSphinx.

I'm glad you enjoy VocalKit, but it's a drop of water compared to the work that has gone into PocketSphinx!

Brian
