sentence confidence and nbest hypothesis

Sana Khamekhem

Nov 19, 2016, 12:47:02 PM

to kaldi-help

Hi,

I'm planning to extract the n-best hypotheses for each utterance and to get a confidence score for each one (per sentence, not per word), for example:

utt_1-1 1 0.2 0.1 waAkeAdhAlaAkeA toAaaAfaAraA.... score1

utt_1-2 1 0.2 0.1 .. .... score2

utt_1-3 1 0.2 0.1 .. .... score3

.

Is there a way to do this??

Right now I'm using this script to get the n-best list without scores:

gunzip -c $dir/lat.1.gz |\
lattice-to-nbest --acoustic-scale=0.0883 --n=10 --lm-scale=1.0 ark:- ark:- | \
nbest-to-ctm --precision=4 ark:- - | utils/int2sym.pl -f 5 $lang/words.txt > $dir/NBest.10.ctm || exit 1;

Daniel Povey

Nov 19, 2016, 2:23:46 PM

to kaldi-help

It is possible to obtain posterior probabilities for each of the n-best sentences. If you use nbest-to-linear you can get the lm-cost and the acoustic cost for each nbest. If you scale down the acoustics, negate both of the costs to get logprobs, add the lm and acoustic costs and exponentiate, you'll get an unnormalized probability for each element of the n-best list. You could normalize those to sum to one. [of course, you'd compute this differently to avoid overflow in the exp.]
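The recipe above can be sketched in a few lines of Python. This is only an illustration, not Kaldi code: the function name `nbest_posteriors`, the example costs, and the acoustic scale of 0.1 are assumptions made up for the example, and the acoustic costs are assumed to be unscaled as read from the nbest-to-linear output. Subtracting the largest log-prob before exponentiating is one common way to avoid the overflow mentioned above.

```python
import math

def nbest_posteriors(lm_costs, ac_costs, acoustic_scale):
    """Turn per-hypothesis LM and acoustic costs into posteriors over an n-best list.

    Costs are negated log-values, one pair per hypothesis.
    """
    # Scale down the acoustics, add the two costs, and negate to get log-probs.
    logprobs = [-(lm + acoustic_scale * ac) for lm, ac in zip(lm_costs, ac_costs)]
    # Only differences survive normalization, so shift by the max
    # to keep exp() in a safe range.
    top = max(logprobs)
    unnorm = [math.exp(lp - top) for lp in logprobs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical costs for a 3-best list (made-up numbers).
lm_costs = [80.0, 82.0, 81.0]
ac_costs = [5000.0, 4990.0, 5020.0]
posteriors = nbest_posteriors(lm_costs, ac_costs, acoustic_scale=0.1)
print(posteriors)
```

The same function works for any list length, so utterances with only 3 or 4 hypotheses need no special handling.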

--
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sana Khamekhem

Nov 21, 2016, 3:03:41 AM

to kaldi-help, dpo...@gmail.com

I obtained the following result with the proposed method:

Utt	LM	AC	(-LM)+(-AC)

AHTD3A0002_Para2_1-1	83.26607	-5840.735	5757.46893
AHTD3A0002_Para2_1-2	88.04139	-5841.306	5753.26461
AHTD3A0002_Para2_1-3	84.77756	-5837.759	5752.98144
AHTD3A0002_Para2_1-4	87.3546	-5839.219	5751.8644

But I don't understand how to normalize this to a value between 0 and 1, since for other utterances I get only the 3 or 4 best hypotheses...

Thank you Dan for your help.


Danijel Korzinek

Nov 21, 2016, 3:28:46 AM

to kaldi-help, dpo...@gmail.com

Divide each value by the sum of all values to get "probability" or divide by max to get a value between 0 and 1.

Or subtract min, divide by max-min and multiply by 100 - that would be kinda standard in what I've seen in other software as far as confidence values.

Sana Khamekhem

Nov 21, 2016, 3:47:31 AM

to kaldi-help, dpo...@gmail.com

Thank you very much, Danijel.

Daniel Povey

Nov 21, 2016, 5:44:31 PM

to Sana Khamekhem, kaldi-help

They are logprobs so you need to exponentiate.

I explained it in my original email to you-- read it very carefully and maybe ask someone local.
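To make the difference concrete, here is a small sketch using the four summed values from the table earlier in the thread. It is illustrative only. Note that the acoustic costs in that table were not scaled down, so the posteriors it prints are more peaked than they would be with an acoustic scale applied.

```python
import math

# Summed log-probs, (-LM)+(-AC), copied from the table earlier in the thread.
logprobs = [5757.46893, 5753.26461, 5752.98144, 5751.8644]

# Dividing the raw log-probs by their sum does not give probabilities:
# every entry lands close to 1/len(logprobs), so the result says almost nothing.
linear = [lp / sum(logprobs) for lp in logprobs]

# Exponentiating first (after subtracting the max so exp() cannot overflow)
# gives values that sum to one for any list length.
top = max(logprobs)
weights = [math.exp(lp - top) for lp in logprobs]
posteriors = [w / sum(weights) for w in weights]

for lin, post in zip(linear, posteriors):
    print(f"{lin:.4f}  {post:.4f}")
```

The first column stays near 0.25 for every hypothesis, while the second column concentrates most of the mass on the first hypothesis.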

Tamir Tapuhi

May 28, 2017, 4:58:25 AM

to kaldi-help, dpo...@gmail.com

Hi,

I've used lattice-to-nbest, and then nbest-to-linear:

nbest-to-linear "ark:gunzip -c lat.1.gz|" ark,t:1.ali 'ark,t:|int2sym.pl -f 2- words.txt > 1.tra' ark,t:1.lm ark,t:1.ac

Regarding "If you use nbest-to-linear you can get the lm-cost and the acoustic cost for each nbest": I got the LM cost and the acoustic cost in 1.lm and 1.ac respectively.

I didn't understand how to combine them to get the probability of a sentence in my n-best list. I tried to follow what Dan suggested: "If you scale down the acoustics, negate both of the costs to get logprobs, add the lm and acoustic costs and exponentiate."

But I'm not sure what scaling down the acoustics means. I expected that adding LM+AC would give the lowest value for the first-best hypothesis and higher values for the second-best and so on, but this isn't what I got.

I would appreciate any help in the matter.


Daniel Povey

May 28, 2017, 2:17:14 PM

to Tamir Tapuhi, kaldi-help

Firstly, from the command line it's not obvious that you ran lattice-to-nbest first, but I'll trust that lat.1.gz is the output of lattice-to-nbest.

Also, you would probably want to run lattice-scale with e.g. --acoustic-scale=0.1 before lattice-to-nbest, for this purpose -- or supply --acoustic-scale=0.1 to lattice-to-nbest, and later scale down AC by 0.1 before adding to LM.

Anyway, the LM+AC cost should be lowest for the first in the sequence, and so on. You could treat that sum as an unnormalized negated log-prob, i.e. you could negate, exponentiate, and normalize to sum to one, to get something that you could treat as a posterior.

It doesn't make sense to scale down the acoustics in this case because you didn't pass in any acoustic scaling option to lattice-to-nbest, so the order will only be correct without scaling the acoustics.
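A small sketch of why the scale used when summing has to match the one used to produce the n-best list. The two hypotheses and their costs are made up for illustration, and `total_cost` and `ranking` are hypothetical helper names, not Kaldi functions.

```python
def total_cost(lm_cost, ac_cost, acoustic_scale):
    """Combined cost of one hypothesis; lower is better."""
    return lm_cost + acoustic_scale * ac_cost

# Two made-up hypotheses as (lm_cost, ac_cost) pairs.
hyps = {"A": (10.0, 100.0), "B": (14.0, 90.0)}

def ranking(acoustic_scale):
    """Hypothesis names sorted from lowest to highest total cost."""
    return sorted(hyps, key=lambda name: total_cost(*hyps[name], acoustic_scale))

print(ranking(1.0))  # ranking with unscaled acoustics
print(ranking(0.1))  # ranking with acoustics scaled down by 0.1
```

With these numbers the two scales disagree about which hypothesis is best, so costs summed with one scale will not follow the order of an n-best list generated with the other.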

Tamir Tapuhi

May 29, 2017, 2:38:54 AM

to kaldi-help, tmya...@gmail.com, dpo...@gmail.com

You were right, it was really only a scaling issue.

Now I get it as expected.

Many thanks Dan,

appreciate your help

Tamir Tapuhi

Jul 19, 2017, 7:49:18 AM

to kaldi-help, tmya...@gmail.com, dpo...@gmail.com

Hi again,

Going back to the same subject:

If I apply lattice-scale with --acoustic-scale=0.1 before lattice-to-nbest, then run lattice-to-nbest (without passing an acoustic scale), and then nbest-to-linear, should I get the n-best entries sorted by likelihood?


Daniel Povey

Jul 19, 2017, 12:51:25 PM

to Tamir Tapuhi, kaldi-help

Yes, that would work. Assuming that acoustic-scale=0.1 was your scale
of interest.

Mael Primet

Jul 23, 2017, 5:28:58 PM

to kaldi-help, tmya...@gmail.com, dpo...@gmail.com

Very interesting. I'm also wondering: would there be a natural way of using the n-best list and confidence scores to flag segments in a decoded sentence that are less likely and for which there are a few good alternatives? It would be a bit like what the iPhone does when we speak to it: it highlights segments of the sentence that it is not sure about and provides alternate hypotheses for them.

Would this just be a matter of noticing that the top two hypotheses differ only in one segment and highlighting it, or might there be some way to output such segments during the FST decoding?

Daniel Povey

Jul 23, 2017, 5:32:29 PM

to kaldi-help, Tamir Tapuhi

You could perhaps use the output of lattice-to-ctm-conf (or modify
code to use this internally), that has confidences. If you use a
phone-based language model for the unknown word, that will help you
catch situations where the lexicon itself doesn't seem to match, and
an unknown word seems to be the better match; see
tedlium/s5_r2/local/run_unk_model.sh.

> --
> Go to http://kaldi-asr.org/forums.html find out how to join
> ---

Mael Primet

Jul 23, 2017, 5:46:16 PM

to kaldi-help, tmya...@gmail.com, dpo...@gmail.com

Thanks. I had seen the script that builds the unk LM but was not sure what it was for. I think this clarifies it a bit. Could you tell me if my understanding of it is right?

- the unk LM uses a large existing LM to train a phone-level LM of transitions between phones

- it then "expands" the unk word in an existing LM with this small grammar, a bit like a class-based LM

- when we detect the unk word, we can then get the detected phonemes and get a phonetic variant of the unknown word and perhaps do a match with a larger offline dictionary

Daniel Povey

Jul 23, 2017, 5:49:27 PM

to Mael Primet, kaldi-help, Tamir Tapuhi

Yes that sounds right, except the unk-LM is trained only on the lexicon.

Kevin

Apr 3, 2020, 1:39:37 PM

to kaldi-help

Hello,

I'm also trying to obtain the nbest hypotheses for the purpose of lm-rescoring but I'm having some confusion on the posterior logprobs so I've been following this post. I used the following commands to generate the lm and acoustic costs (logprobs) and corresponding output:

gunzip -c $decode_dir/lat.1.gz |\
src/latbin/lattice-to-nbest --acoustic-scale=0.1 --n=10 ark:- ark:- |\
src/latbin/nbest-to-linear ark:- ark,t:$decode_dir/NBest_10/ali.1 ark,t:$decode_dir/NBest_10/tra.1 ark,t:$decode_dir/NBest_10/lmcost.1 ark,t:$decode_dir/NBest_10/accost.1

For demonstration I've created a .csv file:

utt_id,rank,accost,lmcost,(0.1*accost-lmcost)
2000_bos_7000,1,-3663.002,138.4283,-504.72850000000005
2000_bos_7000,2,-3688.515,141.0154,-509.8669
2000_bos_7000,3,-3641.403,136.6397,-500.78
2000_bos_7000,4,-3666.916,139.2268,-505.9184
2000_bos_7000,5,-3687.323,141.3045,-510.03679999999997
2000_bos_7000,6,-3665.723,139.5159,-506.0882
2000_bos_7000,7,-3639.544,137.5812,-501.53559999999993
2000_bos_7000,8,-3665.056,140.1684,-506.674
2000_bos_7000,9,-3676.495,141.3441,-508.9936
2000_bos_7000,10,-3702.008,143.9312,-514.132

Questions:

If accost and lmcost are both logprobs, is there any reason why accost is negative and lmcost is positive? I negated lmcost accordingly, assuming that the printed format was the -logprob (from another post).

More importantly, I expected the posterior logprobs (0.1*accost-lmcost) to be sorted in descending order with the nbest ranks. Why isn't this the case?

Thanks,

Kevin

Daniel Povey

Apr 4, 2020, 8:47:44 AM

to kaldi-help

> If accost and lmcost are both logprobs, is there any reason why accost is negative and lmcost is positive? I negated lmcost accordingly, assuming that the printed format was the -logprob (from another post).

They are both negated. The acoustic cost is a log-likelihood, not a log-probability, so it can have either sign.

> More importantly, I expected the posterior logprobs (0.1*accost-lmcost) to be sorted in descending order with the nbest ranks. Why isn't this the case?

Probably because you got the sign of accost wrong.
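One reading of this answer, sketched with the numbers from the question: since both columns are costs (negated log-values), the scaled acoustic cost is added to the LM cost, and the sum is negated once to get a log-prob. The snippet below is an illustrative check, not part of any Kaldi script, and the variable names are made up.

```python
# (accost, lmcost) for ranks 1..10, copied from the CSV in the question.
rows = [
    (-3663.002, 138.4283),
    (-3688.515, 141.0154),
    (-3641.403, 136.6397),
    (-3666.916, 139.2268),
    (-3687.323, 141.3045),
    (-3665.723, 139.5159),
    (-3639.544, 137.5812),
    (-3665.056, 140.1684),
    (-3676.495, 141.3441),
    (-3702.008, 143.9312),
]
ACOUSTIC_SCALE = 0.1  # the value passed to lattice-to-nbest in the question

# Both columns are costs, so they are added rather than subtracted.
total_costs = [lm + ACOUSTIC_SCALE * ac for ac, lm in rows]
logprobs = [-cost for cost in total_costs]

# Check whether cost rises (and log-prob falls) with rank.
print(total_costs == sorted(total_costs))
for rank, lp in enumerate(logprobs, start=1):
    print(rank, round(lp, 4))
```

Computed this way, the total cost increases monotonically from rank 1 to rank 10, which matches the order lattice-to-nbest produced.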

To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/d590a9db-7676-4b03-b2f5-98a02e2b8a95%40googlegroups.com.

Naveen Gabriel

May 6, 2020, 10:55:43 AM

to kaldi-help

Hi Dan

I was wondering why scaling down the acoustic model is required.

Daniel Povey

May 7, 2020, 1:14:36 AM

to kaldi-help

Read the HTK Book, that might help. It's about the modeling assumptions not being right (HMM assumes no correlations given HMM state, but actually they are correlated).


Naveen Gabriel

May 7, 2020, 7:36:34 AM

to kaldi...@googlegroups.com

I had read in Jurafsky's book that the language model is scaled up.

So should I assume that the effect is the same when the acoustic model is scaled down?


Daniel Povey

May 7, 2020, 7:55:18 AM

to kaldi-help

Yes


Naveen Gabriel

May 8, 2020, 2:25:09 AM

to kaldi-help

Hi Dan,

I executed the following command to get a confidence score for each of the 10-best hypotheses:

gunzip -c $decode_dir/lat.1.gz | $KALDI_ROOT/src/latbin/lattice-scale --acoustic-scale=0.1 ark:- ark:-| $KALDI_ROOT/src/latbin/lattice-to-nbest --n=10 ark:- ark:- |$KALDI_ROOT/src/latbin/lattice-confidence ark:- ark,t:$decode_dir/NBest_10/lat

I got this output:

som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-1 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-2 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-3 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-4 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-5 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-6 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-7 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-8 1e+10
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-9 1e+10

So I ran the command as above to get the LM and acoustic costs. My LM cost and acoustic cost come out as:

Utterance	LM Cost	Accost
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-1	24.03416	1884.988
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-2	34.41694	1874.884
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-3	28.41609	1881.671
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-4	38.07364	1872.384
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-5	30.7676	1879.962
som01_015a2dcf-2f74-48f1-9e75-827599719854_AudioAttachment-6	34.76073	1876.355

I assume these are in the log domain. I cannot exponentiate them because AM+LM will be huge. How can I then get the confidence score?


Daniel Povey

May 8, 2020, 2:34:36 AM

to kaldi-help

Look at the usage message of `lattice-confidence`:

Compute sentence-level lattice confidence measures for each lattice.
The output is simly the difference between the total costs of the best and
second-best paths in the lattice (or a very large value if the lattice
had only one path). Caution: this is not necessarily a very good confidence
measure. You almost certainly want to specify the acoustic scale.
If the input is a state-level lattice, you need to specify
--read-compact-lattice=false, or the confidences will be very small
(and wrong). You can get word-level confidence info from lattice-mbr-decode.

Those n-best list entries contain just a single path so they would be bound to give you an infinite value.
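A rough sketch of the measure that usage message describes, applied to the six LM and acoustic costs listed in the question. It is not the Kaldi implementation: `sentence_confidence` is a hypothetical helper, the placeholder value 1e10 is taken from the output shown in the question, and the acoustic costs are assumed to have been scaled already by lattice-scale.

```python
def sentence_confidence(total_costs, single_path_value=1e10):
    """Gap between the second-best and best total path costs.

    Follows the description in the usage message; it is not the Kaldi code.
    """
    if len(total_costs) < 2:
        # Only one path: there is no runner-up to compare against.
        return single_path_value
    best, second = sorted(total_costs)[:2]
    return second - best

# LM and acoustic costs for the six hypotheses listed in the question.
lm_costs = [24.03416, 34.41694, 28.41609, 38.07364, 30.7676, 34.76073]
ac_costs = [1884.988, 1874.884, 1881.671, 1872.384, 1879.962, 1876.355]
totals = [lm_c + ac_c for lm_c, ac_c in zip(lm_costs, ac_costs)]

print(sentence_confidence(totals))      # gap between the two best hypotheses
print(sentence_confidence(totals[:1]))  # single path: the large placeholder value
```

The second call mirrors what happens when lattice-confidence is run on n-best entries one at a time: each entry has a single path, so there is nothing to take a difference against.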


Naveen Gabriel

Aug 1, 2020, 7:10:45 AM

to kaldi-help

Hi Daniel

I was looking for the equation for the acoustic model scaling factor.

I could not find it anywhere, so I assume the equation should be as below:

Could you please tell me if this is right?


Daniel Povey

Aug 2, 2020, 2:23:45 AM

to kaldi-help

Yes except the acoustic scale would actually be what you wrote as 1/N, e.g. we'd describe the acoustic scale as 0.1.

10 would be described as the LM-scale (it's the ratio that matters for decoding purposes).
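The point about the ratio can be checked numerically. The sketch below uses made-up costs for four hypotheses and a hypothetical helper `rank_by`, and it compares an acoustic scale of 0.1 against an LM scale of 10.

```python
# Made-up (lm_cost, ac_cost) pairs for four hypotheses.
hyps = [(12.0, 310.0), (15.0, 295.0), (11.0, 330.0), (18.0, 280.0)]

def rank_by(lm_scale, acoustic_scale):
    """Indices of the hypotheses sorted from lowest to highest scaled cost."""
    costs = [lm_scale * lm + acoustic_scale * ac for lm, ac in hyps]
    return sorted(range(len(hyps)), key=costs.__getitem__)

# Acoustic scale 0.1 with LM scale 1 ...
print(rank_by(lm_scale=1.0, acoustic_scale=0.1))
# ... compared with LM scale 10 and acoustic scale 1.
print(rank_by(lm_scale=10.0, acoustic_scale=1.0))
```

Both calls give the same ranking because the second set of costs is just ten times the first. That factor of ten does matter once the costs are exponentiated into posteriors, so the equivalence holds for picking the best path, not for the confidence values themselves.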


bofeng huang

Sep 28, 2020, 9:22:48 AM

to kaldi-help

Thanks for this explanation.

I'm wondering, though: when I have very large log-values, how can I normalize them while avoiding overflow in the exponentials?

For example, I have the following "log-probabilities" for the 5 best answers:

ac cost	lm cost	(-lm)+(-ac)
-2577.885	50.39511	2527.4898900000003
-2574.321	50.39512	2523.92588
-2573.802	50.39512	2523.40688
-2573.416	50.39515	2523.0208500000003
-2573.365	50.39512	2522.9698799999996

(I don't really want to modify the acoustic scale.)

Thanks in advance.

Daniel Povey

Sep 28, 2020, 11:44:27 AM

to kaldi-help

We'd normally use some kind of LogAdd function that only looks at differences.

You'd never compute exp of that.
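A minimal sketch of that idea in Python, using the five summed values from the question rounded to five decimals. The `log_add` function here is written for illustration and is not Kaldi's own LogAdd. Calling `math.exp` directly on a value like 2527 would raise an OverflowError, which is why everything is expressed in terms of differences.

```python
import math

def log_add(x, y):
    """Return log(exp(x) + exp(y)) using only the difference between x and y."""
    hi, lo = max(x, y), min(x, y)
    return hi + math.log1p(math.exp(lo - hi))

# Summed log-values, (-lm)+(-ac), copied from the question.
logprobs = [2527.48989, 2523.92588, 2523.40688, 2523.02085, 2522.96988]

# Accumulate the log of the normalizer without calling exp() on a large number.
log_total = logprobs[0]
for lp in logprobs[1:]:
    log_total = log_add(log_total, lp)

# exp() is only applied to differences, which are all <= 0 here.
posteriors = [math.exp(lp - log_total) for lp in logprobs]
print(posteriors)
```

Because the acoustic costs in the question are unscaled, the first hypothesis takes most of the probability mass. A smaller acoustic scale would shrink the differences and flatten the distribution.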


bofeng huang

Sep 28, 2020, 12:21:10 PM

to kaldi-help

I will then concentrate on word-level confidence via MBR.

Thanks, Dan, for your response.
