
MLLR adaptation in HTK


sd

Apr 20, 2005, 9:08:34 AM
Hi all
I'm currently trying to run MLLR in HTK using HEAdapt. After trawling
through the handbook I still can't answer this question:

Why does one load the stats file, generated by HERest, when generating
the MMF for use in HEAdapt?

According to the handbook, the regression tree is generated using HHEd
by specifying the number of terminal nodes and splitting the data into
these nodes using a Euclidean distance measure. So where does it take
into account the results of the stats file?

thanks
Shona

Alastair James

Apr 21, 2005, 6:54:39 AM
Hi there!

I am a bit confused by what you mean by 'stats' file. In my mind it has
two possible meanings:

a) The occupation stats generated by the -s option of HERest
b) The actual models produced (MMF) by HERest

Which one do you mean?

Cheers

Alastair

sd

Apr 21, 2005, 9:46:42 AM
Hi Alastair
Sorry, I mean the one generated by the -s option of HERest, i.e. the
occupation statistics of the model set.

thanks
Shona

Alastair James

Apr 21, 2005, 12:15:59 PM
Hmmm...

I can't actually see where it says you need to pass the -s stats file to
HEAdapt! As far as I can see there is no way of passing it. Which flag
are you using?

Alastair

sd

Apr 22, 2005, 5:11:29 AM
Sorry, I haven't been explaining myself well. When generating the
hmmdefs to use in HEAdapt, you must use HHEd, which needs to load the
'stats' file that was generated on the last iteration of HERest. But
the explanation of HHEd for this task merely states that the
regression tree is generated using Euclidean distance measures from
node means and variances. It doesn't say HOW it uses the occupancy
information from the 'stats' file.

Shona

Alastair James

Apr 22, 2005, 5:17:38 AM
Ok, right I see what you mean now.

I think the answer is this:

The regression tree created by HHEd stores the occupation information
for each node. This is taken from the 'stats' file that was created by
HERest. The regression tree *just stores* this information for each node.

The reason this is stored in the regression tree is that HEAdapt will
need the occupation information to decide how far down the regression
tree to go. See the -m option in HEAdapt.
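
One way to picture "how far down the tree to go" is as a threshold on
node occupancy. The sketch below is Python, not HTK code: the small
tree, the occupancy numbers, the threshold parameter and the function
name are all hypothetical, and HEAdapt's real selection logic may well
differ. It only illustrates the idea that a leaf with too little data
falls back to a node higher up the tree.

```python
# Hypothetical tree: N1 is the root with children S1 and N2;
# N2 has children S2 and S3. Each entry maps a node to its parent.
parent = {"S1": "N1", "N2": "N1", "S2": "N2", "S3": "N2"}

# Made-up occupancy counts stored for every node of the tree.
occupancy = {"N1": 195.5, "N2": 75.5, "S1": 120.0, "S2": 45.5, "S3": 30.0}


def pick_transform_node(leaf, parent, occupancy, threshold):
    """Walk up from a leaf to the nearest node with enough occupancy."""
    node = leaf
    while node is not None:
        if occupancy[node] >= threshold:
            return node
        node = parent.get(node)  # None once we step above the root
    return None  # not even the root has enough data


for leaf in ["S1", "S2", "S3"]:
    print(leaf, "->", pick_transform_node(leaf, parent, occupancy, 60.0))
```

With a threshold of 60, S1 has enough data of its own, while S2 and S3
both fall back to their shared parent N2.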

Does this make sense?

Alastair James

sd

Apr 22, 2005, 6:59:24 AM
Alastair,
I totally agree with you, and this is how I thought it worked. But when
you read further into the handbook, it says the regression tree is
generated by computing means and variances at each node and then
splitting them using Euclidean distances (section 9.1.2). So the new HMM
set generated by HHEd has a regression tree that contains the number of
each terminal node and its occupancy. BUT my impression from the
handbook is that these occupancies come from the process I've just
described. Or am I totally off course here?
Hope I'm not getting annoying ;-)

S

Alastair James

Apr 22, 2005, 8:47:22 AM
Umm... Sorry, I don't really see your point!

Yes, HHEd creates the regression tree using the splitting method.
However, HHEd cannot work out the occupancy information on its own (it
has no MFCCs), and therefore needs to load this information from
somewhere. This is where the HERest -s stats file is used.

Make sense?

Alastair

sd

Apr 22, 2005, 11:42:33 AM
Sorry, no.
Isn't the occupancy a result of the splitting, not something computed
previously?
My understanding (which I'm starting to doubt) is that HHEd reads in
the models and works out NEW occupancies based on the splitting, which
is in turn based on the distance between means and variances.
I'm probably being very stupid, but I can't see how the previous
occupancies are reflected in this new regression tree.

Shona

Alastair James

Apr 22, 2005, 11:51:13 AM
No!

The occupancy is the result of TRAINING. The occupancy basically says
how much data is available for that state. This MUST be worked out by
aligning all the training data (MFCCs) to the models. This is done in
HERest.

You can't work out occupancies in HHEd on its own... It does not have
any MFCC files.

Say I have 3 states in an HMM...

From HERest:

State   Occupancy
-----------------
S1      o1
S2      o2
S3      o3

Then HHEd builds the following regression tree:

N1 ----------- S1
|
|------ N2 ------ S2
|
|-------- S3

(I hope you can see that!)

I.e. S1 is directly below N1. N2 is directly below N1. S2 and S3 are
below N2.

The occupancy is worked out as:

Node    Occupancy
-----------------
N1      o1 + o2 + o3   (as all states are below it)
N2      o2 + o3        (as states 2 and 3 are below it)
S1      o1
S2      o2
S3      o3

So you can see that this process still needs the ORIGINAL occupancy info
from HERest!
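
As a minimal sketch of the summation above, in Python: the occupancy
numbers are made up and simply stand in for o1, o2 and o3, and
node_occupancy is an illustrative helper, not anything from HTK.

```python
# Made-up per-state occupancies, standing in for o1, o2 and o3.
leaf_occupancy = {"S1": 120.0, "S2": 45.5, "S3": 30.0}

# Tree from the example: N1 has children S1 and N2; N2 has S2 and S3.
children = {"N1": ["S1", "N2"], "N2": ["S2", "S3"]}


def node_occupancy(node):
    """Return a node's occupancy as the sum over all states below it."""
    if node in leaf_occupancy:
        return leaf_occupancy[node]
    return sum(node_occupancy(child) for child in children[node])


for name in ["N1", "N2", "S1", "S2", "S3"]:
    print(name, node_occupancy(name))
```

The only inputs are the per-state occupancies and the shape of the
tree. Nothing here looks at feature files, which is why the numbers
have to come from an earlier training pass.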

Regards

Alastair

Alastair James

Apr 22, 2005, 11:56:55 AM
Sorry, the diagram should be (view in monospaced font)


N1 ----------- S1
 |
 |--------- N2 ---------- S2
            |
            |-------- S3

Al

sd

Apr 22, 2005, 12:25:12 PM
OK, I get it now. Thank you very much for all your help,
and sorry for being so dense.

Shona

Alastair James

Apr 22, 2005, 12:39:33 PM
Ha ha, that's OK!

HTK is not the easiest bit of software to learn!

Alastair

sd

Apr 25, 2005, 7:27:29 AM
Hi Alastair,
Don't know if you're still there, but I just thought of another question
for you...
Why?
Why do we need the occupancies of the training data? Wouldn't the
occupancies of the adaptation data be more useful here?

Shona

Alastair James

Apr 25, 2005, 12:17:17 PM
Good question...

The HTK book does not really go into this. I think it's because you
need to know both sets of occupancies. You need to know the occupancy of
the training data to see if the occupancy of the adaptation data is
statistically significant.

Obviously, the adaptation set of occupancies can be computed at run-time
by HEAdapt, but the training ones need to be pre-computed.

Perhaps one of the Cambridge guys could confirm this?

Alastair

sd

Apr 26, 2005, 4:10:19 AM
that makes sense, as much as anything in HTK ;-)
thanks again

Shona

ymhm...@gmail.com

Apr 10, 2016, 7:35:13 AM
HHEd -A -D -T 1 -H hmm15/macros -H hmm15/hmmdefs -m hmm16 regtree.hed tiedlist

I used the above command to generate macros and hmmdefs in hmm16, as described at the following link:
http://www.voxforge.org/home/dev/acousticmodels/linux/adapt/htkjulius/adapt/step-5
However, it generates rtree.base and rtree.tree.
Could you help me?

> thanks
> Shona