Reminder:
The next Computational Linguistics Seminar will be tomorrow, February 15th, 4-6pm in B0.207.
Our very own Barend Beekhuizen will present work in progress under the heading "Unsupervised Parsing and Model Merging: The Benefits of Starting Big".
Abstract attached/below
-------- Original message --------
Subject: Re: [CLS] 15/2, 4pm: Barend Beekhuizen
From: Barend Beekhuizen <barendbe...@gmail.com>
To: "Zuidema, Jelle" <W.H.Z...@uva.nl>
CC:
What does it mean for a language learner to learn by assuming all
subtrees, as is done in U-DOP? In this talk, I present
arguments for the lack of cognitive realism of starting with all
conceivable binary subtrees and propose an alternative model, BMM-DOP,
that makes more restricted and conservative generalizations. The model
first induces the optimal minimal subtrees from flat data by means of
an adaptation of Bayesian Model Merging (BMM), and then applies an
all-subtrees approach, yielding results superior to either U-DOP (while using fewer rules) or
BMM alone on an unlabeled parsing task on the WSJ10 data. More
importantly, the model lets the learner start with big chunks, and
acquire more productive, abstract patterns only gradually, as I will
show using child-directed language data (Brown corpus, Eve), thus approaching more
closely the usage-based predictions of how grammatical acquisition
works.