Training a system on several training corpora

Jul 21, 2015, 11:28:52 AM7/21/15
I have two speech corpora.
Each corpus has its own acoustic characteristics:
- DB-1: spontaneous speech, 30 hours;
- DB-2: the SPEECHDAT corpus, 30 hours.

I did some experiments:

Experiment 1:

Train-part: DB-2 (90% of the corpus)
CV-part:    DB-2 (10%)
Test-part:  DB-1 (10%)

Result: correct=35% (as expected, since training and test conditions differ).

Experiment 2:

Train-part: DB-1 (80%)
CV-part:    DB-1 (10%)
Test-part:  DB-1 (10%)

Result: correct=60%.

Experiment 3:

Train-part: DB-1 (80%) + DB-2 (100%)
CV-part:    DB-1 (10%)
Test-part:  DB-1 (10%)

Result: correct=55%.

I do not understand why adding training data (DB-2) in Experiment 3 does not improve recognition.
How should I properly train the system when several training corpora are available?
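For reference, the 80/10/10 splits used in the experiments above can be produced like this. This is a minimal illustrative sketch (the function name, seed, and file names are made up, not from any toolkit):

```python
import random

def split_corpus(file_list, train_frac=0.8, cv_frac=0.1, seed=0):
    """Shuffle a corpus file list and cut it into train/CV/test parts."""
    files = list(file_list)
    random.Random(seed).shuffle(files)  # fixed seed for reproducibility
    n_train = int(len(files) * train_frac)
    n_cv = int(len(files) * cv_frac)
    train = files[:n_train]
    cv = files[n_train:n_train + n_cv]
    test = files[n_train + n_cv:]
    return train, cv, test

# Hypothetical utterance list standing in for DB-1
train, cv, test = split_corpus([f"utt{i:03d}.wav" for i in range(100)])
print(len(train), len(cv), len(test))
```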

Petr Schwarz

Jul 28, 2015, 5:24:19 PM7/28/15
Hi Roman, there may be many reasons.

1) number of parameters in the neural networks
2) how the data is randomized (the file lists should be joined together
and randomized)
3) learning rate (it should be appropriate to your set size, but if the set
is randomized, it does not have such an impact)
4) what normalization is used
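Point 2 in practice: merge the two file lists into one and shuffle it, so that training does not see long homogeneous runs of a single corpus. A minimal sketch with made-up file names:

```python
import random

# Placeholder file lists standing in for the two corpora
db1_list = [f"db1/utt{i:04d}.fea" for i in range(8)]
db2_list = [f"db2/utt{i:04d}.fea" for i in range(8)]

# Join the lists, then shuffle once so both corpora are interleaved
joint = db1_list + db2_list
random.Random(42).shuffle(joint)  # fixed seed for reproducibility

print(joint[:4])
```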

From my experience, the utterances in SpeechDat are very short and
contain a lot of silence in comparison to conversational recordings.
If you do per-sentence mean and variance normalization, the melbank
features after normalization are shifted. So it may be good to look at this.
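The shift can be demonstrated with synthetic data. The sketch below (my own illustration, not the phnrec implementation) applies per-utterance mean/variance normalization; because a silence-heavy utterance has a mean near the silence level, its speech frames come out much more positive than the same frames in a speech-heavy utterance:

```python
import numpy as np

def sentence_mvn(feats):
    """Per-utterance mean and variance normalization.

    feats: (frames, melbank_bands) array.
    """
    mu = feats.mean(axis=0)
    sigma = feats.std(axis=0) + 1e-8  # guard against zero variance
    return (feats - mu) / sigma

rng = np.random.default_rng(0)
speech = rng.normal(10.0, 2.0, size=(80, 23))   # louder "speech" frames
silence = rng.normal(2.0, 0.5, size=(120, 23))  # quiet "silence" frames

# SpeechDat-like utterance: mostly silence, a little speech at the end.
short_utt = np.vstack([silence, speech[:20]])
# Conversational-like utterance: mostly speech.
long_utt = np.vstack([speech, silence[:20]])

# After normalization, the speech frames of the silence-heavy utterance
# are shifted to much larger values than in the speech-heavy one.
print(sentence_mvn(short_utt)[120:].mean())
print(sentence_mvn(long_utt)[:80].mean())
```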


On 21 Jul 2015 at 15:10, Roman wrote: