Newsgroups: comp.speech.research
From: kendmacl...@gmail.com
Date: 21 Jan 2007 12:48:35 -0800
Subject: What are Best Practices for Collecting Speech for a Free GPL Speech Corpus?
Hi,

I am the admin for the VoxForge project.  We are collecting
user-submitted speech for incorporation into a GPL Acoustic Model ('AM').
Currently a Julius/HTK AM is rebuilt nightly, incorporating
newly submitted audio.

I am unsure which approach to take in creating the
VoxForge speech corpora.   Up until now, we have been asking users to
submit 'clean' speech, i.e. to record their submissions so that all
non-speech noise (echo, hiss, ...) is kept to an
absolute minimum.  One guy (very ingeniously I thought) records his
submissions in his closet or in his car!

But some people, whose opinions I respect, say that I should not be
collecting clean speech, but rather speech in its 'natural
environment', warts and all, with echo and hiss and all that (while
still avoiding other background noise such as people talking, radios or
TVs, ...).   On some submissions, the hiss is very noticeable.
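
One way to put a number on 'very noticeable' hiss is to compare the
level of the loudest frames in a recording with the level of the
quietest ones.  The following is a rough sketch only, not anything
VoxForge currently runs: the function names, the 400-sample frame
length (25 ms at an assumed 16 kHz sampling rate) and the 10% quiet
fraction are all assumptions chosen for illustration, and samples are
assumed to be floats between -1.0 and 1.0.

```python
import math


def frame_levels_db(samples, frame_len=400):
    """Return the RMS level in dB of each non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # The small constant avoids log10(0) on digitally silent frames.
        levels.append(10.0 * math.log10(power + 1e-12))
    return levels


def estimate_snr_db(samples, frame_len=400, quiet_fraction=0.1):
    """Rough SNR estimate: mean level of the loudest frames minus
    mean level of the quietest frames (both in dB).

    Assumes the recording contains some pauses, so that the quietest
    frames hold only background noise (hypothetical heuristic).
    """
    levels = sorted(frame_levels_db(samples, frame_len))
    if not levels:
        raise ValueError("recording is shorter than one frame")
    k = max(1, int(len(levels) * quiet_fraction))
    noise_db = sum(levels[:k]) / k
    speech_db = sum(levels[-k:]) / k
    return speech_db - noise_db
```

A figure like this could be used to sort submissions into 'clean' and
'natural' sets rather than rejecting either; where to draw the line
would have to be found by experiment.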

What confuses me is that some speech recognition microphones are sold
with built-in echo and noise cancellation, and the marketing says that
this improves a (commercial) speech recognizer's performance.  This
suggests to me that I should be collecting clean speech, and then using
a noise reduction and echo cancellation front-end on the speech
recognizer, because that is what commercial speech recognition engines
seem to be doing.

And further, if clean speech is required, should I be using noise
reduction software on the submitted audio (such as the submissions with
very pronounced hiss)?  My attempts at noise reduction have not been
successful, with the resulting 'musical noise' (the low-level sound
that replaces the removed noise) giving me very poor recognition
results.
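
To illustrate where 'musical noise' can come from, here is a minimal
sketch of basic magnitude spectral subtraction.  This is an assumption
about the kind of noise reduction involved, not a description of the
software actually tried, and the names (spectral_subtract,
estimate_noise_mag, the floor parameter) and the 64-sample frame are
hypothetical.  It uses a slow textbook DFT so that it needs only the
standard library; real code would use an FFT.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window; 50%-overlapped copies sum to 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def dft(frame):
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def estimate_noise_mag(noise_samples, frame_len=64):
    """Average magnitude per frequency bin over a noise-only segment."""
    hop, window = frame_len // 2, hann(frame_len)
    total, count = [0.0] * frame_len, 0
    for start in range(0, len(noise_samples) - frame_len + 1, hop):
        frame = [noise_samples[start + i] * window[i] for i in range(frame_len)]
        for k, c in enumerate(dft(frame)):
            total[k] += abs(c)
        count += 1
    if count == 0:
        raise ValueError("noise segment is shorter than one frame")
    return [t / count for t in total]


def spectral_subtract(samples, noise_mag, frame_len=64, floor=0.0):
    """Subtract noise_mag from each frame's magnitude spectrum, keep the
    original phase, and overlap-add the frames back together.

    floor is the fraction of the original magnitude kept as a minimum.
    """
    hop, window = frame_len // 2, hann(frame_len)
    out = [0.0] * len(samples)
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        cleaned = []
        for k, c in enumerate(dft(frame)):
            mag = abs(c)
            new_mag = max(mag - noise_mag[k], floor * mag)
            cleaned.append(c * (new_mag / mag) if mag > 0 else 0j)
        for i, x in enumerate(idft(cleaned)):
            out[start + i] += x
    return out
```

With floor=0.0, any bin where the noise momentarily exceeds its average
survives the subtraction as a short isolated tone while its neighbours
are zeroed, and those scattered tones are what is heard as musical
noise.  Raising floor leaves more of the original hiss in place but
masks the tones.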

I was wondering what your thoughts on this might be,

thanks for your time,

Ken MacLean

--
http://www.voxforge.org

