final reminder: today at 4pm, Christer Samuelsson


Zuidema, Jelle

Feb 1, 2012, 9:39:04 AM
to complin...@googlegroups.com
CLS, 1 February 2012, 4pm, room A1.04.

Speaker: Christer Samuelsson
Title: Parametric Distributions for Statistical Machine Translation: What do Alignment Probabilities Actually Look Like?

Abstract:

The alignment and sentence length probabilities in statistical machine translation have numerical domains and could potentially be modeled by parametric distributions. It turns out that sentence length probabilities are well described by automatically fitted Gaussian distributions, and alignment probabilities by reflected (crammed into a finite sentence) Cauchy distributions.

This indicates that there is less signal in these probabilities than commonly thought. From a practical perspective, parametric distributions are much more compact and robust. The extracted distribution parameters lend themselves very well to linear regression, compressing the entire set of distributions to just two regression parameters each. This is a much more effective form of data pooling than smoothing over neighboring contexts.
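
As a rough illustration of the pooling idea in the abstract (not the speaker's actual method or data), the sketch below fits a Gaussian to toy target-length samples for each source length, then regresses the fitted means and standard deviations on source length. That leaves two regression parameters (slope and intercept) per distribution parameter. The function names and the toy numbers are invented for illustration only.

```python
from statistics import fmean, pstdev


def fit_gaussian(samples):
    """Maximum-likelihood Gaussian fit: returns (mean, standard deviation)."""
    return fmean(samples), pstdev(samples)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy data, invented for illustration: target sentence lengths
# observed for each source sentence length.
lengths_by_source = {
    5: [5, 6, 6, 7],
    10: [10, 11, 12, 13],
    20: [21, 22, 24, 25],
}

source_lengths = sorted(lengths_by_source)

# One Gaussian (mean, std) per source length.
fits = [fit_gaussian(lengths_by_source[n]) for n in source_lengths]

# Pool across source lengths: each Gaussian parameter becomes a line,
# i.e. two regression parameters (slope, intercept).
mean_line = fit_line(source_lengths, [m for m, _ in fits])
std_line = fit_line(source_lengths, [s for _, s in fits])

print("mean  ~ %.3f * n + %.3f" % mean_line)
print("stdev ~ %.3f * n + %.3f" % std_line)
```

With real data, the per-length Gaussians would be replaced by the two fitted lines, so an unseen source length still gets a mean and a standard deviation without any smoothing over neighboring contexts.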

