Word probabilities to FST

Phil

Apr 4, 2019, 5:31:21 PM

to kaldi-help

Hi, I'm working on an experimental word-CTC acoustic model. For each frame, the model generates top-k probs:

For example:

t | index_0 | ... | index_k
0 | 75500 | ... | 3550
1 | 0 | ... | 243
2 | 0 | ... | 300
...
T | 755 | ... | 40

Each index has an associated log probability, and the indexes refer to words in an 85k-word dictionary.

I want to rescore the output of the model with a grammar FST (G.fst). Is there a straightforward way to convert this output to an FST, so that it can be rescored and the shortest path extracted?

Daniel Povey

Apr 4, 2019, 5:38:23 PM

to kaldi-help

Yeah, the FST (actually it would be an acceptor, hence an FSA) would have one start state and one state per frame, and you would have an arc for each word that was output, plus one for epsilon. I assume those arcs would have costs on them, reflecting the probabilities from your model. Rescoring that by composing with G.fst would be possible. However, you first need to understand the basics of FSTs. Do the tutorial at openfst.org.
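A minimal Python sketch of the acceptor described above, written out in OpenFst text format. It assumes each frame is a list of (word_id, log_prob) pairs, that the CTC blank has index 0 and can be mapped to epsilon (label 0), and that arc costs are negative log probabilities, so that the shortest path in the tropical semiring is the most probable one. The function name `frames_to_fsa_text` and the toy frames are hypothetical, not part of the original post.

```python
import math


def frames_to_fsa_text(frames, blank_id=0):
    """Build an acceptor in OpenFst text format from per-frame top-k outputs.

    frames: list with one entry per frame; each entry is a list of
            (word_id, log_prob) pairs (assumed input layout).
    State t is the state before frame t; state len(frames) is final.
    """
    lines = []
    for t, candidates in enumerate(frames):
        for word_id, log_prob in candidates:
            # Assumption: the blank is emitted as epsilon (label 0).
            label = 0 if word_id == blank_id else word_id
            cost = -log_prob  # cost = negative log probability
            lines.append(f"{t} {t + 1} {label} {cost:.6f}")
    lines.append(str(len(frames)))  # single final state
    return "\n".join(lines) + "\n"


# Toy input: two frames with two candidates each (made-up numbers).
frames = [
    [(0, math.log(0.9)), (243, math.log(0.1))],
    [(0, math.log(0.2)), (300, math.log(0.8))],
]
fsa_text = frames_to_fsa_text(frames)
print(fsa_text)
```

The resulting text could then be compiled with `fstcompile --acceptor` before composing it with G.fst.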

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/9f513f96-32cf-4038-a6d1-dd5cb78b1dd8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Phil

Apr 4, 2019, 5:50:39 PM

to kaldi-help

Oh that makes perfect sense. Thank you.

Phil

Apr 4, 2019, 8:44:53 PM

to kaldi-help

Hi Dan,

My composition is producing no output, and I'm wondering what's wrong. Here's an example sequence from the word-CTC FST I generated:

0 1 0 0 0.992731512
1 2 0 0 0.987134516
2 3 13403 13403 0.0216468126
2 3 13801 13801 0.957984328
3 4 0 0 0.990761697
4 5 0 0 0.985004902
5 6 0 0 0.979174674
6 7 0 0 0.985455453
7 8 0 0 0.899928987
7 8 1879 1879 0.0119872307
7 8 121582 121582 0.00832993444
7 8 125717 125717 0.00510402676
7 8 148843 148843 0.0194917303
7 8 327499 327499 0.00882347766
8 9 0 0 0.996489584
9 10 0 0 0.98880744
10 11 0 0 0.99289459
11 12 0 0 0.4248586
11 12 1879 1879 0.393045187
11 12 41127 41127 0.0264167953
11 12 214354 214354 0.00775688235
11 12 267149 267149 0.0181218777
11 12 274056 274056 0.0107227825
12 13 0 0 0.998292804
13

My LM FST:

fstrandgen --select=log_prob G.fst | fstprint
0 1 234775 234775
1 2 188644 188644
2 3 375228 0
3 4 329316 329316
4 5 375228 0
5 6 375228 0
6 7 371495 371495
7 8 360801 360801
8 9 375228 0
9

However, running the following:

fstcompose test.fst G.fst

gives me empty output. Can you see anything incorrect in what I'm doing?

Daniel Povey

Apr 4, 2019, 8:46:25 PM

to kaldi-help

I suspect that 375228 is some kind of disambiguation symbol in your G.fst. You should remove that first, e.g. by projecting on the output.
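To illustrate the suggestion, here is a small sketch that projects an FST onto its output labels by rewriting `fstprint` text. In the sample above, 375228 appears only as an input label paired with output 0 (epsilon), so after projection those arcs become epsilon arcs and no longer require a symbol that the word FSA never produces. The helper name `project_output` is hypothetical, and the weight on the second sample arc is made up for the demo; OpenFst's own `fstproject` tool performs the same operation on binary FSTs.

```python
def project_output(fst_text):
    """Copy each arc's output label onto its input label.

    Works on OpenFst transducer text format, where arc lines are
    'src dst ilabel olabel [weight]' and final-state lines are
    'state [weight]'.
    """
    out = []
    for line in fst_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) >= 4:  # arc line
            fields[2] = fields[3]
        out.append(" ".join(fields))
    return "\n".join(out) + "\n"


# Two arcs in the style of the G.fst sample from the thread, plus a final state.
sample = "2 3 375228 0\n3 4 329316 329316 1.5\n9\n"
projected = project_output(sample)
print(projected)
```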

Phil

Apr 4, 2019, 9:23:54 PM

to kaldi-help

Removing the disambiguation symbol fixed it.

Since CTC can produce repeated outputs, would it make sense to merge them by composing the word-model output with a "CTC FST", e.g.:

0 0 blank blank
0 1 a a
1 1 a <eps>
1 0 blank blank
0

(I don't know if this compiles, but the idea is that repeated outputs map back to the same word state.)
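One way to flesh out this idea is sketched below: a generator for a collapse transducer in OpenFst text format over a small vocabulary. It departs from the example above in three ways, all of which are assumptions: blanks map to `<eps>` on the output side, each word state gets arcs to every other word so that a word can follow another without an intervening blank, and word states are final. The symbol names `<blk>` and `<eps>` and the function name are hypothetical, and compiling the text would need matching symbol tables (`fstcompile --isymbols=... --osymbols=...`).

```python
def ctc_collapse_fst_text(words, blank="<blk>", eps="<eps>"):
    """Text-format transducer that drops blanks and merges consecutive repeats.

    State 0: start, or the last frame was a blank.
    State i (i >= 1): the last frame was words[i - 1].
    """
    state_of = {w: i + 1 for i, w in enumerate(words)}
    lines = [f"0 0 {blank} {eps}"]  # blank frames emit nothing
    for w, s in state_of.items():
        lines.append(f"0 {s} {w} {w}")        # first frame of a word emits it
        lines.append(f"{s} {s} {w} {eps}")    # repeated frames emit nothing
        lines.append(f"{s} 0 {blank} {eps}")  # a blank ends the run
        for v, r in state_of.items():
            if v != w:
                lines.append(f"{s} {r} {v} {v}")  # switch directly to another word
    lines.append("0")  # final: ended on a blank (or empty input)
    lines.extend(str(s) for s in state_of.values())  # final: ended on a word
    return "\n".join(lines) + "\n"


collapse_text = ctc_collapse_fst_text(["a", "b"])
print(collapse_text)
```

The word-to-word arcs make the arc count grow quadratically with vocabulary size, so for an 85k-word dictionary this explicit form is likely impractical and is shown only to make the structure concrete.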

Daniel Povey

Apr 4, 2019, 9:31:41 PM

to kaldi-help

Hm, you could try that I guess. I'm not sure how common those repeated outputs are in practice.

It might make it hard to recognize repetitions of a word.
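The concern about repeated words can be seen with a plain CTC-style collapse on label sequences. This is a sketch of the usual rule (merge consecutive duplicates, then drop blanks); the `<blk>` symbol and the example words are made up for illustration.

```python
def ctc_collapse(labels, blank="<blk>"):
    """Merge consecutive duplicate labels, then drop blanks."""
    out = []
    prev = None
    for label in labels:
        if label != blank and label != prev:
            out.append(label)
        prev = label
    return out


# A word spread over two frames is merged into one occurrence.
merged = ctc_collapse(["the", "the", "<blk>", "cat"])
# A genuinely repeated word with no blank in between is also merged.
repeat_lost = ctc_collapse(["very", "very", "good"])
# The repetition survives only if the model emits a blank between the copies.
repeat_kept = ctc_collapse(["very", "<blk>", "very", "good"])
print(merged, repeat_lost, repeat_kept)
```

With merging in place, a spoken "very very good" is recovered only when the model puts a blank between the two copies of the word.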

Phil

Apr 5, 2019, 8:12:32 PM

to kaldi-help

Couple of follow-up questions on this:

- is there a way to rescale the LM FST outside of Kaldi?

- are there any optimizations to be made with OpenFST for this specific problem?

Right now, I am running:

fstcompose --compose_filter=sequence in.fst G_projected.fst | fstshortestpath > out.fst

but this doesn't give great performance.

Are the operations in lattice-lmrescore significantly faster? I was also thinking of converting the FST to a Kaldi lattice to take advantage of the optimizations in that script.

Daniel Povey

Apr 5, 2019, 10:00:19 PM

to kaldi-help

Couple of follow-up questions on this:
- is there a way to rescale the LM FST outside of Kaldi?

Not sure what you mean by this, but probably not. If you want to scale the LM costs you can probably do it with something like

fstprint foo.fst | awk '{ if(NF>4) { $5 *= 0.5} ;print;}' | fstcompile > bar.fst
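The same idea can be sketched in Python, operating on `fstprint` text. Unlike the awk one-liner, this version also scales weights on final-state lines, on the assumption that final costs in G.fst carry LM probability as well; whether that is wanted depends on the setup. The function name `scale_fst_weights` and the sample weights are hypothetical.

```python
def scale_fst_weights(fst_text, scale, acceptor=False):
    """Multiply every explicit weight in OpenFst text format by `scale`.

    Arc lines:   'src dst ilabel olabel [weight]' (one label if acceptor).
    Final lines: 'state [weight]'.
    Lines without an explicit weight are left unchanged.
    """
    arc_fields_with_weight = 4 if acceptor else 5
    out = []
    for line in fst_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) in (arc_fields_with_weight, 2):
            # Last field is a weight: on an arc, or on a final state.
            fields[-1] = f"{float(fields[-1]) * scale:.8g}"
        out.append(" ".join(fields))
    return "\n".join(out) + "\n"


# Made-up transducer text: a weighted arc, an unweighted arc, a weighted final state.
lm_text = "0 1 5 5 2.5\n1 2 7 7\n2 0.7\n"
scaled = scale_fst_weights(lm_text, 0.5)
print(scaled)
```

The scaled text would then go back through `fstcompile`, as in the pipeline above.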

- are there any optimizations to be made with OpenFST for this specific problem?
Right now, I am running:
fstcompose --compose_filter=sequence in.fst G_projected.fst | fstshortestpath > out.fst, but this doesn't give super great performance.

Not sure if you refer to speed or WER.

Philip Cerles

Apr 5, 2019, 10:39:20 PM

to kaldi...@googlegroups.com

Thanks, that makes sense.

Ah, sorry for being unclear. I was referring to speed.

--

Philip A. Cerles

B.A. Applied Mathematics, 2016

University of California, Berkeley
