Finding word lemmas
From: Morten Minde Neergaard <x...@8d.no>
Date: Sun, 29 Jan 2012 20:35:41 +0100
Subject: Finding word lemmas
Hi,

As part of a project, I'm looking up single words in a dictionary. To
accept input in non-canonical (inflected) form, I'm using the method
nltk.corpus.wordnet.morphy. This works very well for English.
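
For readers following along, the lookup described above might be sketched roughly like this. The function name `lookup`, the `dictionary` mapping, and the pluggable `lemmatize` callable are all hypothetical names chosen for illustration; they are not taken from the original project.

```python
from typing import Callable, Mapping, Optional


def lookup(
    word: str,
    dictionary: Mapping[str, str],
    lemmatize: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Look up `word` as typed, then fall back to its base form.

    `lemmatize` is any callable that maps a word form to a base form
    and returns None when it cannot find one.
    """
    key = word.lower()
    if key in dictionary:
        return dictionary[key]
    lemma = lemmatize(key)
    if lemma is not None and lemma in dictionary:
        return dictionary[lemma]
    return None
```

With NLTK and its WordNet data installed, `nltk.corpus.wordnet.morphy` should fit the `lemmatize` slot, since it takes a word form and returns None when it finds no base form. Keeping the lemmatizer pluggable means a different tool can be swapped in per language.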

For other languages, the stemmers I've found in NLTK return forms
unsuitable for dictionary lookup, e.g. in Spanish:

 >>> from nltk.stem.snowball import SpanishStemmer
 >>> SpanishStemmer().stem('logicamente')
 u'logic'

Does anyone have a good idea for how to handle this? The worst case
would be using lemmatizers not written in Python, but I'd like to
avoid that.

Cheers,
--
Morten Minde Neergaard


 
From: Alex Rudnick <alex.rudn...@gmail.com>
Date: Sun, 29 Jan 2012 14:47:29 -0500
Subject: Re: [nltk-users] Finding word lemmas
Hey Morten,

Morphological analysis is kind of a hard problem for many languages!
You may have to find a language-specific tool in a lot of cases, and
many of them may not be in Python.

But if you want to do Spanish, Mike Gasser (my advisor) has some
Python 3 software that works pretty well for Spanish verbs. In many
cases (don't know the precision/recall), it will find the infinitive
form of a verb, given the conjugated form. There are also morphological
analyzers for a few other languages here:

http://www.cs.indiana.edu/~gasser/Research/software.html

Hope this helps!

--
-- alexr


 
From: Morten Minde Neergaard <x...@8d.no>
Date: Wed, 8 Feb 2012 20:09:42 +0100
Subject: Re: [nltk-users] Finding word lemmas
At 14:47, Sun 2012-01-29, Alex Rudnick wrote:

> Hey Morten,

Hi! Sorry that I didn't remember to say “thank you” right away =)

> Morphological analysis is kind of a hard problem for many languages!
> You may have to find a language-specific tool in a lot of cases, and
> many of them may not be in Python.

Indeed, and lots of the tools are closed source and/or have rotten code
bases.

> But if you want to do Spanish, Mike Gasser (my advisor) has some
> Python 3 software that works pretty well for Spanish verbs. In many
> cases (don't know the precision/recall), it will find the infinitive
> form of a verb, given the conjugated form. There are also morphological
> analyzers for a few other languages here:

In case anyone cares, I ended up writing a small wrapper around
TreeTagger. It's giving me good results.
http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionT...
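
The wrapper itself isn't posted in this thread, so the following is only a minimal sketch of what such a wrapper could look like, not the actual code. It assumes the `tree-tagger` binary is run with its `-token`, `-lemma` and `-quiet` options, reads one token per line on stdin, and prints tab-separated token, tag and lemma columns. The install paths, the UTF-8 encoding, and the names `parse_output` and `lemmatize` are hypothetical.

```python
import subprocess
from typing import List, Sequence, Tuple

# Hypothetical install locations; adjust to the local TreeTagger setup.
TREETAGGER_BIN = "/opt/treetagger/bin/tree-tagger"
SPANISH_PARAMS = "/opt/treetagger/lib/spanish.par"


def parse_output(output: str) -> List[Tuple[str, str, str]]:
    """Parse 'token<TAB>tag<TAB>lemma' lines into (token, tag, lemma)."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        token, tag, lemma = parts
        if lemma == "<unknown>":
            lemma = token  # fall back to the surface form
        rows.append((token, tag, lemma))
    return rows


def lemmatize(
    tokens: Sequence[str],
    binary: str = TREETAGGER_BIN,
    params: str = SPANISH_PARAMS,
) -> List[Tuple[str, str, str]]:
    """Run TreeTagger on pre-tokenized input, one token per line."""
    result = subprocess.run(
        [binary, "-token", "-lemma", "-quiet", params],
        input="\n".join(tokens) + "\n",
        capture_output=True,
        text=True,
        encoding="utf-8",  # assumption: a UTF-8 parameter file
        check=True,
    )
    return parse_output(result.stdout)
```

Splitting the parsing from the subprocess call keeps the parser testable on machines without TreeTagger installed.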

--
Morten Minde Neergaard


 