visualisation

Laci

Mar 18, 2010, 8:27:54 AM
to Joshua technical support
Dear Everyone,

I would like to use the JUNG visualisation.

I ran the decoder with these parameters:

#nbest config
use_unique_nbest=true
use_tree_nbest=true
include_align_index=true
add_combined_cost=true
top_n=300

and called the visualisation tool like this:

java -Dfile.encoding=utf8 \
  -cp $JOSHUA_HOME/bin:$JOSHUA_HOME/lib/collections-generic-4.01.jar:$JOSHUA_HOME/lib/jung-algorithms-2.0.jar:$JOSHUA_HOME/lib/jung-api-2.0.jar:$JOSHUA_HOME/lib/jung-graph-impl-2.0.jar:$JOSHUA_HOME/lib/jung-visualization-2.0.jar \
  joshua.ui.tree_visualizer.browser.Browser \
  $directory/test/test_translate.en \
  $directory/test/test_translate.hu \
  $directory/forditas/forditas_joshua.1best &

where
$directory/test/test_translate.en is the source side of my test set,
$directory/test/test_translate.hu is the reference translation of my test set, and
$directory/forditas/forditas_joshua.1best is the output of the Joshua decoder

but I got the following error message:

Exception in thread "main" java.lang.NumberFormatException: For input string: "(ROOT{0-24} ([S]{0-23} ([S]{0-19} ([S]{0-17} ([S]{0-12} ([S]{0-11} ([S]{0-2} ([X]{0-2} a ([X]{1-2} fifty-seven-when_OOV))) ([X]{2-11} ([X]{2-4} volt) összefüggő ([X]{5-11} ([X]{5-7} szerette) , hogy ([X]{10-11} henrietta-was_OOV)))) ([X]{11-12} közben)) ([X]{12-17} ([X]{12-13} a) ([X]{14-16} magas ([X]{15-16} redshifts_OOV)) és)) ([X]{17-19} arnold)) ([X]{19-23} ([X]{19-20} csapodárságaival) ([X]{21-22} doris) .)))"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:470)
    at java.lang.Integer.parseInt(Integer.java:514)
    at joshua.ui.tree_visualizer.browser.TranslationInfoList.addNBestFile(TranslationInfoList.java:109)
    at joshua.ui.tree_visualizer.browser.Browser.main(Browser.java:100)

Can someone tell me more about the JUNG tools and what I am doing
wrong?

Thanks
Laci

zhifei li

Mar 18, 2010, 9:21:39 AM
to joshua_t...@googlegroups.com
Can you try to set include_align_index=false?

Cheers
Zhifei


Laci

Mar 18, 2010, 11:26:55 AM
to Joshua technical support
Hi

With include_align_index=false I got the same error.

Exception in thread "main" java.lang.NumberFormatException: For input string: "(ROOT ([S] ([S] ([S] ([S] ([S] ([S] ([X] a ([X] fifty-seven-when_OOV))) ([X] ([X] volt) összefüggő ([X] ([X] szerette) , hogy ([X] henrietta-was_OOV)))) ([X] közben)) ([X] ([X] a) ([X] magas ([X] redshifts_OOV)) és)) ([X] arnold)) ([X] ([X] csapodárságaival) ([X] doris) .)))"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:470)
    at java.lang.Integer.parseInt(Integer.java:514)
    at joshua.ui.tree_visualizer.browser.TranslationInfoList.addNBestFile(TranslationInfoList.java:109)
    at joshua.ui.tree_visualizer.browser.Browser.main(Browser.java:100)


Laci

On Mar 18, 14:21, zhifei li <zhifei.w...@gmail.com> wrote:
> Can you try to set include_align_index=false?
>
> Cheers
> Zhifei

Jonny

Mar 18, 2010, 11:50:40 AM
to Joshua technical support
Hi,

Are you post-processing the Joshua output file in any way? The
visualizer expects the normal Joshua nbest output which looks like
this:

## ||| sentence ||| feat. functions ||| score

where ## is the number of the input sentence (i.e. 0 for the first
sentence of the input file, 1 for the next sentence and so on).

It is throwing the number format exception because, when it tries to
split the line on the ||| separator, your sentence output is the first
thing on the line, instead of a sentence number.
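
The split on ||| described above can be illustrated with a minimal sketch. This is not the actual TranslationInfoList code; the class name, the helper method, and the sample feature values are made up for illustration. It only assumes the four-field layout shown earlier in this message.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Joshua: checks the first field of an n-best line.
class NBestLineCheck {
    // Fields in an n-best line are separated by "|||", usually with spaces around it.
    private static final Pattern SEPARATOR = Pattern.compile("\\s*\\|\\|\\|\\s*");

    /** Returns the sentence number of an n-best line, or -1 if the first field is not an integer. */
    public static int sentenceNumber(String line) {
        String[] fields = SEPARATOR.split(line.trim());
        try {
            return Integer.parseInt(fields[0]);
        } catch (NumberFormatException e) {
            // This is the situation from the stack trace: the first field is the tree itself.
            return -1;
        }
    }

    public static void main(String[] args) {
        // Sample lines with invented feature values and scores.
        String good = "0 ||| (ROOT ([S] ([X] a))) ||| -1.0 -2.0 ||| -3.5";
        String bad = "(ROOT ([S] ([X] a))) ||| -1.0 -2.0 ||| -3.5";
        System.out.println(sentenceNumber(good)); // prints 0
        System.out.println(sentenceNumber(bad));  // prints -1
    }
}
```

Running a check like this over the file passed to the visualizer would point at any line where the sentence number has been stripped.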

Jonny.


László Laki

Mar 18, 2010, 1:02:07 PM
to joshua_t...@googlegroups.com
Hi,

Thanks everyone, it works!

Laci

Chris Callison-Burch

Mar 18, 2010, 2:37:05 PM
to joshua_t...@googlegroups.com
Can you say what you changed to get it to work?

--C

Laci

Mar 23, 2010, 5:52:12 AM
to Joshua technical support
Hi

I just used what Jonny described, the standard format of the nbest output:

## ||| sentence ||| feat. functions ||| score

That solved my problem.

Laci

Nguyen Le Minh

May 9, 2010, 10:35:10 PM
to joshua_t...@googlegroups.com
Dear All,
I found that Joshua includes discriminative re-ranking with CRFs and other methods.
I looked at HGDiscriminativeLearner and found the following code.

if(args.length<11){
    System.out.println("wrong command, correct command should be: java Perceptron_HG is_crf lf_train_items lf_train_rules lf_orc_items lf_orc_rules f_l_num_sents f_data_sel f_model_out_prefix use_tm_feat use_lm_feat use_edge_bigram_feat_only f_feature_set use_joint_tm_lm_feature");
    System.out.println("num of args is "+ args.length);
    for(int i=0; i <args.length; i++)System.out.println("arg is: " + args[i]);
    System.exit(0);
}
long start_time = System.currentTimeMillis();
SymbolTable symbolTbl = new BuildinSymbol(null);
boolean is_using_crf =  new Boolean(args[0].trim());
HGDiscriminativeLearner.usingCRF=is_using_crf;
String f_l_train_items=args[1].trim();
String f_l_train_rules=args[2].trim();
String f_l_orc_items=args[3].trim();
String f_l_orc_rules=args[4].trim();
String f_l_num_sents=args[5].trim();

....

There are 11 parameters. My question is how to obtain them. Do you have an example that explains this code?

Best regards,
Nguyen

zhifei li

May 10, 2010, 10:44:59 AM
to joshua_t...@googlegroups.com
Hi Nguyen,

The CRF is not fully functional yet.
Sorry for the confusion.
I will remove the code until it is actually functional.

But, Joshua does have the hypergraph-based minimum risk training, which
is described in the following EMNLP paper.

Zhifei Li and Jason Eisner.
First- and Second-order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests.
In Proceedings of EMNLP 2009.


We will provide some instructions very soon.

Cheers
Zhifei

Nguyen Le Minh

May 10, 2010, 10:58:28 PM
to joshua_t...@googlegroups.com
Dear Zhifei,
Thank you for the useful information.
I am looking forward to your instructions.
Best regards,
Nguyen

Nguyen Le Minh

May 12, 2010, 1:15:32 AM
to joshua_t...@googlegroups.com
Dear Zhifei Li,
Thank you very much for your help.
Since I am in a hurry to use it, could you please send me some short instructions for using hypergraph-based minimum risk training?
Best regards,
Nguyen




Zhang Jiajun

May 12, 2010, 5:04:21 AM
to joshua_t...@googlegroups.com
Hi, Zhifei,
Is there any option in Joshua that makes the decoder output all the rules that produced the best translation?

zhifei li

May 12, 2010, 2:47:47 PM
to joshua_t...@googlegroups.com
Hi Nguyen,

I just started a new job and so I may not have time to address this request.
But Ziyuan Wang (wzy.p...@gmail.com), a student at JHU, will send out some instructions soon.
Thanks.

Cheers
Zhifei

zhifei li

May 12, 2010, 2:50:17 PM
to joshua_t...@googlegroups.com
Hi Jiajun,

Yes, you can do this by setting use_tree_nbest=true in your config file.
This will generate an nbest list of trees instead of strings. Since a Hiero grammar is an SCFG, the tree shows exactly how the string was derived
from the individual rules.
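
As a rough illustration of reading rules off such a tree, here is a minimal sketch. It is not Joshua code: the class and method names are made up, and it assumes the bracketed format seen in the error messages earlier in this thread, "(LABEL child child ...)", with an optional "{i-j}" suffix on the label that is simply dropped. It recovers only the target side of each rule application, and it would not cope with translations that themselves contain parentheses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Joshua: lists the rule applications in a tree string.
class TreeRules {
    private final String[] tokens;
    private int pos = 0;
    private final List<String> rules = new ArrayList<>();

    private TreeRules(String tree) {
        // Put spaces around parentheses so that they become separate tokens.
        this.tokens = tree.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
    }

    /** Returns one "LABEL -> children" entry per tree node, parents before children. */
    public static List<String> extract(String tree) {
        TreeRules parser = new TreeRules(tree);
        parser.parseNode();
        return parser.rules;
    }

    // Parses "( LABEL child ... )" and returns the label without any "{i-j}" suffix.
    private String parseNode() {
        pos++; // skip "("
        String label = tokens[pos++].replaceAll("\\{\\d+-\\d+\\}$", "");
        int slot = rules.size();
        rules.add(null); // reserve a slot so that the parent precedes its children
        StringBuilder rhs = new StringBuilder();
        while (!tokens[pos].equals(")")) {
            // A child is either a nested node (shown by its label) or a terminal word.
            String child = tokens[pos].equals("(") ? parseNode() : tokens[pos++];
            if (rhs.length() > 0) rhs.append(' ');
            rhs.append(child);
        }
        pos++; // skip ")"
        rules.set(slot, label + " -> " + rhs);
        return label;
    }

    public static void main(String[] args) {
        // A shortened, invented tree in the same style as the decoder output above.
        String tree = "(ROOT{0-3} ([S]{0-3} ([X]{0-3} magas ([X]{1-2} redshifts_OOV))))";
        for (String rule : extract(tree)) {
            System.out.println(rule);
        }
        // prints:
        // ROOT -> [S]
        // [S] -> [X]
        // [X] -> magas [X]
        // [X] -> redshifts_OOV
    }
}
```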


Cheers
Zhifei

Zhang Jiajun

May 12, 2010, 9:00:36 PM
to joshua_t...@googlegroups.com
Hi Zhifei,

Thanks for your valuable time. I will try that. Best wishes for your new job.

Best regards,
Jiajun Zhang

Nguyen Le Minh

May 12, 2010, 9:30:31 PM
to joshua_t...@googlegroups.com
Dear Zhifei,
Thank you for your help.
Best regards,
Nguyen