Can Kaldi decode in the JSGFGrammar way like sphinx?


xinq...@gmail.com

Sep 23, 2016, 1:44:33 AM
to kaldi-help
Hi all,

    I would like to know whether Kaldi can decode with a JSGF grammar, the way Sphinx does. For example:

#JSGF V1.0;
/**
 * JSGF Grammar for Hello World example
 */
grammar hello;
public <greet> = (good morning | hello) ( bhiksha | evandro | paul | philip | rita | will );


Many thanks,
Xin.q.

Danijel Korzinek

Sep 23, 2016, 1:53:24 AM
to kaldi-help
Yes, it can, although I'm not sure if there are any public ABNF/GRXML/JSGF-to-FST compilers out there.

Check out this page and simply replace the G part with your grammar instead of an N-gram model: http://vpanayotov.blogspot.com/2012/06/kaldi-decoding-graph-construction.html

xinq...@gmail.com

Sep 23, 2016, 1:58:10 AM
to kaldi-help
So you mean that I need to replace G.fst with the JSGF grammar (converted to FST format), and then generate the lattice with that grammar FST?



On Friday, September 23, 2016 at 1:53:24 PM UTC+8, Danijel Korzinek wrote:

Danijel Korzinek

Sep 23, 2016, 2:50:34 AM
to kaldi-help
You can try and make such a grammar by hand. The FST format is pretty simple. Each line has 4 columns:
1. source state of the arc
2. destination state of the arc
3. word (input label)
4. same as 3 (output label)
(5. optionally, a weight in the fifth column)

Columns 3 and 4 are the same because G is supposed to be an "acceptor" (an FSA) rather than a "transducer", which can have different input and output labels.

The grammar you wrote above would look something like this:

1 2 good good
2 3 morning morning
1 3 hello hello
3 4 bhiksha bhiksha
3 4 evandro evandro
3 4 paul paul
3 4 philip philip
3 4 rita rita
3 4 will will
4

The last line simply marks the final state. You also need to replace the words with word ids and create a separate file that maps each word to its id (check out words.txt). You can read more about the FSTs used in Kaldi on this website: http://www.openfst.org/ (I recommend checking out their examples and tutorials).
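The word-to-id step can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names `build_symbol_table` and `to_integer_fst` are made up for this example, ids are assigned in order of first appearance, and id 0 is reserved for `<eps>` as in Kaldi's words.txt. In a real setup the ids should come from the words.txt of your existing lang directory so that G matches the lexicon.

```python
# Sketch: build a words.txt-style table from the text grammar and
# rewrite the arcs with integer ids. Names here are hypothetical.

ARCS = """\
1 2 good good
2 3 morning morning
1 3 hello hello
3 4 bhiksha bhiksha
3 4 evandro evandro
3 4 paul paul
3 4 philip philip
3 4 rita rita
3 4 will will
4
"""

def build_symbol_table(fst_text):
    """Give every word an integer id, in order of first appearance."""
    table = {"<eps>": 0}  # assumption: 0 is reserved for epsilon
    for line in fst_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:  # arc lines; final-state lines are shorter
            for word in fields[2:4]:
                table.setdefault(word, len(table))
    return table

def to_integer_fst(fst_text, table):
    """Replace the word labels in columns 3 and 4 with their ids."""
    out = []
    for line in fst_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            fields[2] = str(table[fields[2]])
            fields[3] = str(table[fields[3]])
        out.append(" ".join(fields))
    return "\n".join(out) + "\n"

table = build_symbol_table(ARCS)
words_txt = "".join(f"{word} {idx}\n" for word, idx in table.items())
g_int = to_integer_fst(ARCS, table)

print(words_txt)
print(g_int)
```

The two strings can then be written to files (for example words.txt and a text version of G) before compiling.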

Danijel Korzinek

Sep 23, 2016, 2:51:52 AM
to kaldi-help
Oh, and I forgot: when you're done making that file as I described above, you need to use "fstcompile" to convert it from text to the binary G.fst format that can be incorporated into HCLG.fst and used for decoding.
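A minimal sketch of that compile step, driven from Python. The file names G.txt, words.txt and G.fst are placeholders, and the helper names are made up for this example. It assumes OpenFst's `fstcompile` is on your PATH. When `--isymbols` and `--osymbols` are given, `fstcompile` reads the word labels from the text file directly and looks up their ids in the symbol table, so the text grammar can keep the words as written.

```python
import shutil
import subprocess

def fstcompile_command(text_fst, words_txt, out_fst):
    """Build the fstcompile call that turns a text FST into a binary one."""
    return [
        "fstcompile",
        f"--isymbols={words_txt}",   # symbol table for column 3
        f"--osymbols={words_txt}",   # same table for column 4 (acceptor)
        "--keep_isymbols=false",     # do not embed the tables in G.fst
        "--keep_osymbols=false",
        text_fst,
        out_fst,
    ]

def run_if_available(cmd):
    """Run the command only if the tool is installed; return True on success."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True

# Placeholder file names, not taken from any particular recipe.
cmd = fstcompile_command("G.txt", "words.txt", "G.fst")
print(" ".join(cmd))
```

Calling `run_if_available(cmd)` performs the compilation when the input files exist. Kaldi recipes commonly sort the result with `fstarcsort --sort_type=ilabel` afterwards, which is left out here.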

xinq...@gmail.com

Sep 23, 2016, 3:22:09 AM
to kaldi-help
I am still confused about the states.

If my grammar is the single sentence "This is my computer.", is it:
1 2 this this
2 3 is is
3 4 my my
4 5 computer computer
5

On Friday, September 23, 2016 at 2:51:52 PM UTC+8, Danijel Korzinek wrote:

Sunit Sivasankaran

Sep 23, 2016, 3:53:31 AM
to kaldi...@googlegroups.com

> Yes, it can, although I'm not sure if there are any public
> ABNF/GRXML/JSGF to FST compilers out there.

There is a Sphinx tool called "sphinx_jsgf2fsg" which can be used to
convert JSGF to FSM. You can then use the FST tools to create a G.fst.

Regards,
Sunit

xinq...@gmail.com

Sep 23, 2016, 4:00:10 AM
to kaldi-help, sunit.siv...@inria.fr
Many thanks. I will give it a try.

On Friday, September 23, 2016 at 3:53:31 PM UTC+8, Sunit Sivasankaran wrote:

Danijel Korzinek

Sep 23, 2016, 6:13:51 AM
to kaldi-help
Yes, that is correct.
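For a single fixed sentence, the pattern is just a chain of states with one arc per word and the last state marked as final. A small Python sketch of that (the function name `sentence_to_fst` is made up for this example, and words are lowercased on the assumption that your words.txt is lowercase):

```python
def sentence_to_fst(sentence, start_state=1):
    """Return text-FST lines for a linear acceptor over one sentence."""
    words = sentence.lower().split()
    lines = []
    state = start_state
    for word in words:
        # One arc per word: source, destination, input label, output label.
        lines.append(f"{state} {state + 1} {word} {word}")
        state += 1
    lines.append(str(state))  # the last state is the final state
    return "\n".join(lines)

print(sentence_to_fst("this is my computer"))
# 1 2 this this
# 2 3 is is
# 3 4 my my
# 4 5 computer computer
# 5
```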

Daniel Povey

Sep 23, 2016, 3:51:16 PM
to kaldi-help, Axel Horndasch
Cc'ing Axel Horndasch as he has worked with related things.

It would be useful to add to Kaldi the ability to easily work with
these types of grammars, and integrate them easily into statistical
language models.

For everyone's info: the JSGF grammar format
(http://www.w3.org/TR/jsgf/#20071) seems to look a bit like BNF format
for specifying programming languages, except with optional weights.

If someone wants to come up with a proposal as to the best way to do
this, next steps, etc., that would be great, since I don't have time
to think about this much right now.

Something to bear in mind is that *eventually* we will want to make it
possible to swap things into a pre-built graph "on the fly" (contact
lists, etc.). The fact that we're now using left-biphone models a lot
will make this easier (e.g. on-the-fly graph construction or extension
will be easier). This doesn't have to be part of the first steps, but
we'll eventually want to do this.


Dan