Hi,
I am trying to write a function that processes some text and returns a list of the words that match a given POS tag. I would also like the list to be sorted, but all I'm getting is strange results.
The output I'm aiming for, as defined in my doctest, is: ['[', ']', 'affection', 'austen', 'between', 'blessings', 'caresses', 'clever', 'consequence', 'daughters']
But the output I'm getting at the moment is: ['[', 'Emma', 'Jane', 'Austen', ']', 'VOLUME', 'Emma', 'Woodhouse', 'handsome', 'clever']
Where am I going wrong?
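CODE:
import nltk

def distinct_words_of_pos(text, pos):
    # Tokenize the text and tag each token with the universal tagset
    tokens = nltk.word_tokenize(text)
    all_POS = nltk.pos_tag(tokens, tagset="universal")
    # Keep the word from each (word, tag) pair whose tag contains pos
    sorted_list = [ i[0] for i in all_POS if pos in i[1]]
    return sorted_list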
Thanks.
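
Comparing the two outputs above, the expected list is lowercased, has no duplicates, and is in sorted order, while the function returns the matching tokens in their original order and case. A minimal sketch of those three steps follows. It makes several assumptions: `distinct_words_from_tagged` and `sample` are hypothetical names, the input is a list of (word, tag) pairs of the kind `nltk.pos_tag` returns, and `pos` is a single tag string compared with `==` (the original `pos in i[1]` is a substring test when both sides are strings). The doctest may also slice the result to ten items, which is not shown here.

```python
def distinct_words_from_tagged(tagged, pos):
    """Return the sorted, distinct, lowercased words whose tag equals pos.

    tagged is assumed to be a list of (word, tag) pairs.
    """
    # The set comprehension removes duplicates; sorted() returns a sorted list
    return sorted({word.lower() for word, tag in tagged if tag == pos})


# Hand-written sample pairs for illustration only, not real tagger output
sample = [
    ("Emma", "NOUN"),
    ("Woodhouse", "NOUN"),
    ("handsome", "ADJ"),
    ("clever", "ADJ"),
    ("Emma", "NOUN"),
]

print(distinct_words_from_tagged(sample, "NOUN"))
print(distinct_words_from_tagged(sample, "ADJ"))
```

If this matches what the doctest intends, the same `sorted({...})` expression could replace the list comprehension in `distinct_words_of_pos`.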