Problem with Pattern Singularize

Kevin Glover

Dec 6, 2014, 11:24:38 AM
to pattern-f...@googlegroups.com
I am working on an application that sends a list of nouns, structured within phrase searches, to Wikipedia and returns the number of hits for that phrase. The nouns must be singular. I have three input options:

1. Pattern parses a text file and extracts the nouns.

2. The nouns are entered by the user.

3. The program reads a list of nouns from a file.

All three options have been tested with the same set of words, including: happiness, sadness, tennis.

In option 1, the parser tags the text file, and only nouns tagged NNS are singularized, so all three test words are correctly left unchanged. This is my code:

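from pattern.en import singularize

# Only plural nouns (tagged NNS) are singularized; NN words pass through untouched.
if tag == "NNS":
    noun = singularize(noun, pos='NN', custom={})
    tag = "NN"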

That approach does not work for the other two input options, where the nouns are untagged, so there the program simply applies the code:

noun = singularize(noun)

which works absolutely fine with common nouns such as table, chair and room, and indeed with happiness, but deletes the final 's' of the other two words, so:

sadness = sadnes

tennis = tenni

All three words are in the Pattern en_lexicon file and are correctly tagged NN. So why do these two words 'not work' when the others do?
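For illustration, here is a minimal sketch of one possible guard for the untagged case, based on the observation that the words are already listed as NN in the lexicon. Everything in it is an assumption made for the example: `singularize_untagged`, `toy_lexicon` and `strip_final_s` are hypothetical names, the dictionary is only a stand-in for a real lookup in en_lexicon, and `strip_final_s` is a deliberately naive placeholder so the sketch runs without Pattern installed. In a real program the last argument would presumably be Pattern's own `singularize`.

```python
def singularize_untagged(noun, lexicon, singularize_fn):
    """Singularize a noun that arrives without a part-of-speech tag.

    lexicon         -- dict mapping word -> tag (hypothetical stand-in
                       for a lookup in the en_lexicon file)
    singularize_fn  -- any callable taking a word and returning its
                       singular form, e.g. pattern.en.singularize
    """
    # If the lexicon already lists the word as a singular noun,
    # return it unchanged instead of applying suffix rules.
    if lexicon.get(noun.lower()) == "NN":
        return noun
    return singularize_fn(noun)


# Toy data so the sketch is self-contained (not the real lexicon).
toy_lexicon = {
    "happiness": "NN",
    "sadness": "NN",
    "tennis": "NN",
    "tables": "NNS",
}


def strip_final_s(word):
    # Deliberately naive placeholder: drops a single trailing 's'.
    return word[:-1] if word.endswith("s") else word


for w in ["happiness", "sadness", "tennis", "tables"]:
    print(w, "->", singularize_untagged(w, toy_lexicon, strip_final_s))
```

The guard only helps for words that actually appear in the lexicon; anything missing from it still goes through the rule-based function as before.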

Apologies if I am missing something blindingly obvious, but I have been struggling with this all week! Any words of wisdom would be very gratefully received.

Thank you.

Kevin