> (1) The ALL unary rule handler: when we encounter a unary rule/chain
> in a parse, we can extract the label for the corresponding span in a
> number of ways. One of them is named ALL and, if I'm not mistaken,
> concatenates the unary chain into a combined nonterminal like so:
> PP:NP:NN. That seems iffy, and it also makes a part of my work hard to
> do. Did anyone ever use the ALL handler, and is it any BLEU-good? Can
> I drop it from Thrax?
I've never used this, and standard practice seems to be to collapse unary chains, since they're kind of bogus anyway. Can you retain the option that lets you choose the label from either the top or the bottom of the chain?
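
To make the options concrete, here is a minimal sketch of the three ways of labeling a unary chain, written in Python for brevity. The function name, the handler names "top" and "bottom", and the list representation of the chain are assumptions for illustration, not Thrax's actual API; only the ALL behavior (joining with ":") comes from the description above.

```python
def label_for_unary_chain(chain, handler="top"):
    """Pick a span label from a unary chain.

    `chain` lists the nonterminals from the top of the chain to the
    bottom, e.g. ["PP", "NP", "NN"]. The handler names are hypothetical.
    """
    if not chain:
        raise ValueError("unary chain must contain at least one label")
    if handler == "top":
        # Collapse the chain and keep only the highest nonterminal.
        return chain[0]
    if handler == "bottom":
        # Collapse the chain and keep only the lowest nonterminal.
        return chain[-1]
    if handler == "all":
        # Concatenate the whole chain into one combined nonterminal.
        return ":".join(chain)
    raise ValueError("unknown handler: %r" % handler)


chain = ["PP", "NP", "NN"]
for name in ("top", "bottom", "all"):
    print(name, label_for_unary_chain(chain, name))
```

Dropping ALL would amount to removing the last branch, while "top" and "bottom" keep the label set the same size as the original nonterminal inventory.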
> (2) Numbers: doubles versus floats. Joshua decodes in floats, Thrax
> extracts in (mostly) doubles. Do we actually care to do that? We might
> be looking at quite a bit of added space savings if we consistently
> change to floats. Unless any of you have experience, I propose that
> once I get Thrax operational again we do a contrastive run of
> doubles-versus-floats to gauge the impact/loss to be expected in
> practice.
I'm happy to stick to floats. I think we even switched a while back to outputting "%.5f" (instead of the 10 or 20 digits you get when printing a double).
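
As a rough illustration of why the switch is probably harmless once values are printed with "%.5f", here is a small Python sketch that rounds a double to 32-bit precision and compares the formatted output. The sample value is made up, and the helper `to_float32` is a hypothetical name; this is not a substitute for the contrastive run proposed above, since values sitting near a rounding boundary can still differ in the last printed digit.

```python
import struct


def to_float32(x):
    """Round a Python float (a 64-bit double) to 32-bit precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 0.123456789012345  # made-up feature value
as_float = to_float32(value)

# Storage cost per number: 8 bytes for a double, 4 bytes for a float.
print(struct.calcsize("<d"), struct.calcsize("<f"))

# With five decimal places, both precisions print the same string here.
print("%.5f" % value, "%.5f" % as_float)
```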
> Thoughts?
>
> -- Juri