Test predictions


Christos Christodoulopoulos

Apr 13, 2012, 5:33:16 AM
to wils-ch...@googlegroups.com
I'm submitting my test predictions for both PoS and Dependency induction.
(Note: due to memory limitations of the parser, I couldn't run on the full Arabic corpus; instead I'm submitting predictions for the 10-sentence version.)

My system description follows:
In this work we investigate how dependency information can be incorporated into an unsupervised PoS induction system by inducing both the PoS tags and the dependencies using an iterated learning method. We use BMMM (Christodoulopoulos et al., 2011), a PoS induction system that allows us to easily incorporate the dependency features as multinomial distributions, and the original DMV model (Klein and Manning, 2004) for inducing the dependency structures with unsupervised PoS tags. The iterated learning method works as follows: we run the original BMMM PoS inducer (without dependency features), using the output as input to the DMV parser. We then re-train the PoS inducer using the induced dependencies as additional features, use the new PoS tags to retrain the parser, and so forth for 5 iterations.
Both systems are fully unsupervised; only raw text is used as input. The BMMM system also uses morphological segmentation features obtained from Morfessor (Creutz and Lagus, 2005).
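The iteration described above can be sketched as follows (a minimal sketch; `induce_pos` and `parse_deps` are hypothetical stand-ins for the real BMMM and DMV systems):

```python
def iterated_learning(corpus, induce_pos, parse_deps, iterations=5):
    """Alternate PoS induction and dependency parsing.

    induce_pos(corpus, deps) returns tags; deps is None on the first round.
    parse_deps(corpus, tags) returns dependency structures.
    """
    tags = induce_pos(corpus, None)      # plain BMMM: no dependency features
    deps = parse_deps(corpus, tags)      # DMV on the induced tags
    for _ in range(iterations - 1):
        tags = induce_pos(corpus, deps)  # retrain with dependency features
        deps = parse_deps(corpus, tags)  # retrain the parser on new tags
    return tags, deps
```

Feeding the induced dependencies back into the PoS inducer as an extra multinomial feature is what couples the two models.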

[Christodoulopoulos et al., 2011] C. Christodoulopoulos, S. Goldwater and M. Steedman. A Bayesian mixture model for PoS induction using multiple features. In Proc. EMNLP, 2011.
[Creutz and Lagus, 2005] M. Creutz and K. Lagus. Inducing the morphological lexicon of a natural language from unannotated text. In Proc. AKRR, 2005.
[Klein and Manning, 2004] D. Klein and C. Manning. Corpus-based induction of syntactic structure: Models of dependency and constituency. In Proc. ACL, 2004.
christodoulopoulos_predictions.tar.gz

Douwe H Gelling

Apr 13, 2012, 7:31:22 AM
to wils-ch...@googlegroups.com
Hi Christos,

Thanks for your submission. I've taken a quick look at your data, and it seems mostly fine, but there are a lot of attachments to punctuation.
During evaluation we ignore punctuation, so we re-attach dependents of punctuation to the nearest ancestor that is not punctuation.
(That is, if a word is attached to a punctuation token, it is re-attached to the parent of that token if the parent is not itself punctuation, and so on.)
If you're fine with this, we can just leave it as is, but I thought I should inform you.
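A minimal sketch of this re-attachment, assuming CoNLL-style 1-based head indices with 0 for the root (the actual evaluation code may differ):

```python
def reattach_punct(heads, is_punct):
    """Re-attach each word whose head is punctuation to its nearest
    non-punctuation ancestor. heads uses 1-based indices, 0 = root."""
    fixed = []
    for h in heads:
        # climb through punctuation heads until we reach the root
        # or a non-punctuation word
        while h != 0 and is_punct[h - 1]:
            h = heads[h - 1]
        fixed.append(h)
    return fixed
```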

Douwe

Christos Christodoulopoulos

Apr 13, 2012, 7:47:26 AM
to wils-ch...@googlegroups.com
Hi Douwe,

Thanks for noticing that. I wasn't sure whether to address this in my short description (we have a paper on this system under review for EMNLP where we discuss this issue).
In the paper we argue that the use of punctuation is vital for the PoS part of the system. We too noted the DMV model's tendency to use the final punctuation mark as the root of the sentence, and we proposed a systematic change to the gold-standard annotation in which we swapped the root position between the (usually) main verb and the final punctuation.
I think what you propose (if I understand correctly) will have a similar effect, at least for sentence-final punctuation. So yes, I'm fine with that.

Cheers,
Chris

Douwe H Gelling

Apr 13, 2012, 8:14:18 AM
to wils-ch...@googlegroups.com
Ah, OK, good to know. I think the removal of punctuation during evaluation does what you need it to:
words attached to punctuation will instead be attached to whatever the punctuation is attached to,
and this is done recursively. So if words are attached to sentence-final punctuation, and that punctuation
is attached to the main verb, those words will be attached to the main verb instead.

Thanks for confirming this,

Douwe

Grzegorz Chrupala

Apr 13, 2012, 1:14:07 PM
to wils-ch...@googlegroups.com
Hi,

Please find attached my predictions for the POS induction track. The files contain tags in both the coarse and fine-grained POS columns.

System description:

The unsupervised POS tagging method used in this submission consists
of two components. The first one is the word class induction approach
using Latent Dirichlet Allocation proposed in [1]. As one of the
outputs of this first stage we obtain for each word type a probability
distribution over classes. In the second stage we create a
hierarchical clustering of the word types: we use an agglomerative
clustering algorithm where the distance between clusters is defined as
the Jensen-Shannon divergence between the probability distributions
over classes associated with each cluster. When assigning POS tags, we
find the tree leaf most similar to the current word and use the prefix
of the path leading to this leaf as the tag. We tune the number of
classes and prefix length based on the development data, for both
coarse-grained and fine-grained POS inventories. Otherwise the system
is entirely unsupervised and uses no resources other than raw word
forms in the provided data files.
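The cluster distance and prefix tagging described above can be sketched as follows (a minimal illustration only; the system's actual implementation may differ):

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions
    (base-2 logs, so the result lies in [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def tag_from_path(path, prefix_len):
    """Use a prefix of the root-to-leaf path in the hierarchy as the tag;
    shorter prefixes give coarser tags."""
    return "".join(str(branch) for branch in path[:prefix_len])
```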

[1] Grzegorz Chrupała. Efficient induction of probabilistic word
classes with LDA. In Proc. IJCNLP, 2011.

chrupala-pos-test-predictions.tgz

David Mareček

Apr 14, 2012, 3:21:43 AM
to wils-ch...@googlegroups.com
Hi,

I've submitted my system for the dependency induction task.
Unfortunately, my data exceeds Google Groups' 4 MB limit, because I've
made predictions for all types of POS tags (CPOS, POS, UPOS), so
I've sent my predictions only to Douwe.

Here is the system description:

Our approach is based on a dependency model that consists of three
submodels: (i) an edge model (similar to P_CHOOSE in DMV), (ii) a
fertility model (modeling the number of children of a given head), and
(iii) a reducibility model. The fertility model exploits the observation
that the fertility of function words (typically the most frequent words
in the corpus) is more predictable than the fertility of content (less
frequent) words. Reducibility is a feature of individual part-of-speech
tags; we compute it from the reducibility of words in a large
unannotated corpus (we used Wikipedia articles). A word is reducible
if the sentence remains grammatically correct after the word is removed.
The grammaticality of such newly created sentences is tested by
searching for them in the corpus.
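A toy sketch of this reducibility test (scored per word form rather than per POS tag, purely for illustration):

```python
from collections import Counter

def reducibility_scores(sentences, corpus_sentences):
    """For each word form, estimate how often deleting it leaves a
    sentence that occurs verbatim in the corpus -- a proxy for the
    reduced sentence being grammatical."""
    reducible, total = Counter(), Counter()
    corpus_set = set(map(tuple, corpus_sentences))
    for sent in sentences:
        for i, word in enumerate(sent):
            total[word] += 1
            reduced = tuple(sent[:i] + sent[i + 1:])
            if reduced in corpus_set:
                reducible[word] += 1
    return {w: reducible[w] / total[w] for w in total}
```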

Inference itself was done on the test corpus using Gibbs sampling.
Three hyperparameters were tuned on the English Penn Treebank with
fine-grained POS tags (the 5th column in the given CoNLL format). For
parsing the other languages, and for all types of tags (CPOS, POS, UPOS),
we used the same parameter settings.

The only additional data (not provided by the organizers) are the
unannotated monolingual Wikipedia texts. (They were automatically
POS-tagged with the TnT tagger trained on the provided corpora.)

Best,

David Marecek

Anders Søgaard

Apr 14, 2012, 6:28:46 AM
to wils-ch...@googlegroups.com
Dear David and others,

I had the same problem and submitted to Trevor last night. I submitted two runs (parsing only). The run 'soegaard-norules' is the parser presented in [1] that does not make use of universal rules, with no parameters changed. The parser does no training; instead it uses random walks to rank words and a simple parsing algorithm to turn ranked words into trees. The second run, 'soegaard-rulesonly', is really a strong baseline based on the universal rules in [2], as reformulated in [1] in terms of the Google tags. Again, no training and no parameter setting.
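A hypothetical sketch of the decoding step, assuming the parser attaches each word to the nearest higher-ranked word once the random walk has produced rank scores (the details in [1] may differ):

```python
def rank_to_tree(ranks):
    """Turn per-word rank scores into a dependency tree: the top-ranked
    word becomes the root, and every other word attaches to the closest
    higher-ranked word (1-based heads, 0 = root). Assumes distinct ranks."""
    n = len(ranks)
    top = max(range(n), key=lambda i: ranks[i])
    heads = []
    for i in range(n):
        if i == top:
            heads.append(0)  # root
            continue
        candidates = [j for j in range(n) if ranks[j] > ranks[i]]
        nearest = min(candidates, key=lambda j: (abs(j - i), j))
        heads.append(nearest + 1)
    return heads
```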

Best wishes,

Anders

[1] A. Søgaard. Unsupervised dependency parsing without training. Natural Language Engineering 18: 187-203, 2011.
[2] T. Naseem et al. Using universal linguistic knowledge to guide grammar induction. In Proc. EMNLP, 2010.


****************************
Anders Søgaard, Assoc.Prof.,
Center for Language Technology
University of Copenhagen
Njalsgade 140
DK-2300 Copenhagen S
****************************

Douwe H Gelling

Apr 20, 2012, 9:01:17 AM
to wils-ch...@googlegroups.com
Christos,

I just noticed that your Arabic predictions are actually thresholded at a maximum length of 20 on the test set. Was this intentional?
We're also doing evaluation with a cutoff of 15 words per sentence, so I could include your Arabic results in that.

Douwe

Christos Christodoulopoulos

Apr 20, 2012, 9:16:05 AM
to wils-ch...@googlegroups.com
Hi Douwe,
I was playing with different cutoff thresholds during the experiments, so I might have used the wrong files; this is a happy coincidence. The predictions are still perfectly valid, and yes, I'd be glad if you could include me in the 15-words-per-sentence evaluation!

Thanks,
Chris