failure of wn.all_synsets('n') in wordnet

typetoken

Jul 31, 2012, 10:38:36 PM
to nltk-...@googlegroups.com
For exercise 13 in the NLTK book: ◑ What percentage of noun synsets have no hyponyms? You can get all noun synsets using wn.all_synsets('n').

First, I wanted to take a look at the contents of wn.all_synsets('n'). It turned out that there is no way to look at them directly.

>>> from nltk.corpus import wordnet as wn
>>> nouns = wn.all_synsets('n')
>>> nouns
<generator object all_synsets at 0x06287FD0>
>>> nouns[:3]

Traceback (most recent call last):
  File "<pyshell#37>", line 1, in <module>
    nouns[:3]
TypeError: 'generator' object has no attribute '__getitem__'

Second, for this exercise, I need to know how many noun synsets there are in WordNet. However, len() does not work on the result.

>>> len(nouns)

Traceback (most recent call last):
  File "<pyshell#34>", line 1, in <module>
    len(nouns)
TypeError: object of type 'generator' has no len()

Third, I need to know how many noun synsets have hyponyms. However, the following tells me zero:

>>> len([synset for synset in nouns if synset.hyponyms() == True])
0

Could anyone help with this exercise?

Thank you very much for any hints.

Sincerely
Typetoken

typetoken

Jul 31, 2012, 10:44:36 PM
to nltk-...@googlegroups.com
For the number of synsets that have no hyponyms, the result is also 0. Quite odd, isn't it?

>>> len([synset for synset in nouns if synset.hyponyms() != True])
0

Jordan Boyd-Graber

Jul 31, 2012, 11:20:57 PM
to nltk-...@googlegroups.com
It seems that you want to use a generator as a list. If that's what
you want to do, you can do something like:

nouns = list(wn.all_synsets('n'))

Note that the hyponyms function returns a list of synsets that are
another synset's hyponyms.

>>> entity = wn.synset('entity.n.01')
>>> entity.hyponyms()
[Synset('thing.n.08'), Synset('physical_entity.n.01'),
Synset('abstraction.n.06')]

So you'll need to adjust your condition.
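
To make both points concrete, here is a minimal sketch that uses a hypothetical stand-in for the WordNet data: the HYPONYMS dictionary and the all_names() generator are invented for illustration and are not part of NLTK. It shows that a generator can be consumed only once, which is likely why the second count in the earlier message also came out as 0, and that testing the returned list for emptiness works where comparing it with True does not.

```python
# Hypothetical toy data standing in for WordNet: name -> list of hyponym names.
HYPONYMS = {
    "entity": ["thing", "abstraction"],
    "thing": [],
    "abstraction": ["measure"],
    "measure": [],
}

def all_names():
    """Yield names one at a time, much as wn.all_synsets('n') yields synsets."""
    for name in HYPONYMS:
        yield name

gen = all_names()
first_pass = [n for n in gen]    # consumes the generator
second_pass = [n for n in gen]   # nothing left: the generator is exhausted

names = list(all_names())        # a list supports len(), slicing and reuse

# A list never compares equal to True, so this filter selects nothing.
equal_true = [n for n in names if HYPONYMS[n] == True]
# Testing for an empty list finds the entries without hyponyms.
no_hyponyms = [n for n in names if len(HYPONYMS[n]) == 0]

print(len(first_pass), len(second_pass))  # 4 0
print(equal_true)                          # []
print(no_hyponyms)                         # ['thing', 'measure']
```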

Best,

Jordan




John H. Li

Aug 1, 2012, 3:51:00 AM
to nltk-...@googlegroups.com
Dear Jordan,
Thank you very much for your kind and prompt help. I modified my code as follows:

>>> nouns = list(wn.all_synsets('n'))
>>> len(nouns)
82115
>>> nouns[:3]
[Synset('entity.n.01'), Synset('physical_entity.n.01'), Synset('abstraction.n.06')]
>>> len([synset for synset in nouns if synset.hyponyms()!= True])
82115

>>> len([synset for synset in nouns if len(synset.hyponyms())== 0])
65422
>>> 65422/82115
0
>>> float(65422/82115)
0.0
>>> from __future__ import division
>>> 65422/82115
0.7967119283931072
>>> len([synset for synset in nouns if synset.hyponyms()== True])
0
>>> len([synset for synset in nouns if len(wn.synset.hyponyms())== 0])

Traceback (most recent call last):
  File "<pyshell#106>", line 1, in <module>
    len([synset for synset in nouns if len(wn.synset.hyponyms())== 0])
AttributeError: 'function' object has no attribute 'hyponyms'
>>> len([synset for synset in nouns if len(synset.hyponyms())== 0])
65422

It seems to work now. However, along the way I came across some questions that I would really like answered. Could you or anyone else help?


1) Why doesn't the following code work? Why can't the condition synset.hyponyms() != True be used to check whether a synset has hyponyms?

>>> len([synset for synset in nouns if synset.hyponyms()!= True])
82115

Why did the above code give the same number as len(nouns)?

Moreover,

>>> len([synset for synset in nouns if synset.hyponyms()== True])
0

Can I use this code to select the synsets that have hyponyms? The result is 0. Why?

2) Why did float() give only 0.0 in the following example?
>>> float(65422/82115)
0.0
>>> from __future__ import division
>>> 65422/82115
0.7967119283931072


Thank you very much for your kind help and instructions.

Sincerely
Typetoken

daoud mohd

Aug 1, 2012, 4:04:23 AM
to nltk-...@googlegroups.com
Dear sir, I have installed Python on my computer. How do I start using NLTK? Please send me some guidance.
--
With Best Regards
Mohd Daoud
IIT DELHI

John H. Li

Aug 1, 2012, 5:43:21 AM
to nltk-...@googlegroups.com
Dear Sir, thanks for your kindness. If you are following Natural Language Processing with Python as your course book and are a beginner, page 4 of chapter 1 answers your enquiry. SIR! I've eaten each page so far, have just finished chapter 2, and am ploughing alone through the exercises at the end of chapter 2. SIR! Hence the many questions.

Jordan Boyd-Graber

Aug 1, 2012, 7:44:36 AM
to nltk-...@googlegroups.com
These are all Python questions, not NLTK questions. I would encourage
you to go through a Python tutorial that will answer these questions.

>>> 1) Why doesn't the following code work? Why can't the condition
>>> synset.hyponyms() != True be used to check whether a synset has hyponyms?

This is because the hyponyms function returns a *list*. A list is
never equal to True, which is a *boolean*, so "== True" matches nothing
and "!= True" matches everything. Python is perhaps a little confusing
on this point, because if you write:

if synset.hyponyms():

the condition counts as false when the list is empty and as true
otherwise. This makes it easier to write LISP-style recursive calls.
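
A short sketch of that distinction, using plain lists of placeholder strings rather than real Synset objects (the has_hyponyms helper is hypothetical and only there for illustration):

```python
empty = []
full = ["thing", "abstraction"]  # placeholder strings, not real Synset objects

# Comparing a list with True is always False, whether or not the list is empty.
print(empty == True, full == True)   # False False
# So "!= True" is always True and filters nothing out.
print(empty != True, full != True)   # True True

# Truth testing is what distinguishes the two cases.
print(bool(empty), bool(full))       # False True

def has_hyponyms(hyponym_list):
    """Hypothetical helper: True when the list is non-empty."""
    return bool(hyponym_list)

print(has_hyponyms(empty), has_hyponyms(full))  # False True
```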

>>> 2) Why did float() give only 0.0 in the following
>>> example?

In Python 2, dividing one integer by another gives an integer, so
65422/82115 is already 0 before float() is applied. If you divide a
float by an integer, you get a float:

float(7) / 3

This is confusing, which is why the behaviour changes in Python 3, as
you noted with the __future__ import.
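
As an illustration, here is a small sketch written for Python 3, where / is already true division. The // operator is used to reproduce what / did for two integers in Python 2; the variable names are made up for this example, and the two counts are the ones reported earlier in the thread.

```python
leaves, total = 65422, 82115  # counts reported earlier in the thread

floor_result = leaves // total        # integer (floor) division, as Python 2's / did for ints
late_float = float(leaves // total)   # converting after the division is too late
true_result = leaves / total          # true division
early_float = float(leaves) / total   # converting one operand first also works

print(floor_result, late_float)        # 0 0.0
print(true_result)
print("%.1f%%" % (100 * true_result))  # 79.7%
```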

John H. Li

Aug 1, 2012, 9:19:50 AM
to nltk-...@googlegroups.com
Thank you very much for your kind clarification, encouragement and instructions. Thanks to your help, I've fixed the problematic code for exercises 13 and 14 in Chapter 2.

daoud mohd

Aug 3, 2012, 1:13:55 AM
to nltk-...@googlegroups.com
Dear sir,

Can we use Java or another API in place of Python?

daoud mohd

Aug 3, 2012, 1:43:26 AM
to nltk-...@googlegroups.com

winf...@gmail.com

Aug 3, 2012, 2:57:47 AM
to nltk-...@googlegroups.com
Maybe you can use Jython to convert it into a jar file. NLTK is a library written in Python, though.

daoud mohd

Aug 3, 2012, 5:54:36 AM
to nltk-...@googlegroups.com
Hello,
How do we find the similarity between words using WordNet?

daoud mohd

Aug 3, 2012, 5:56:15 AM
to nltk-...@googlegroups.com
Hi,
I have installed NLTK. I want to use WordNet through NLTK;
in particular, I am interested in finding synonyms of a word in
WordNet.

Tim McNamara

Aug 3, 2012, 4:54:10 PM
to nltk-...@googlegroups.com

Creating a list is quite a memory-intensive method of computing those results.

As you become more familiar with Python, generators can lead to very space efficient programmes.

Generally, if you are building a list to produce summary statistics, you are wasting lots of memory. Consider these two code examples:

>>> nouns = list(wn.all_synsets('n'))

>>> len([synset for synset in nouns if synset.hyponyms()!= True])
82115

>>> nouns = wn.all_synsets('n')
>>> len(n for n in nouns if not n.hyponyms())

You are also getting an AttributeError. Try something like this.

>>> len(n for n in nouns if hasattr('hyponyms', n) and not n.hyponyms())

Understanding generators is extremely useful. Search for "beazley generators" without quotes to find one of the best slide decks on them. I would do that myself, but I'm on my phone.

John H. Li

Aug 3, 2012, 8:05:36 PM
to nltk-...@googlegroups.com

1)
>>> nouns = wn.all_synsets('n')
>>> len(n for n in nouns if hasattr('hyponyms', n) and not n.hyponyms())

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    len(n for n in nouns if hasattr('hyponyms', n) and not n.hyponyms())
TypeError: object of type 'generator' has no len()
>>> 


2)
>>> nouns = list(wn.all_synsets('n'))
>>> len(n for n in nouns if hasattr('hyponyms', n) and not n.hypohyms())

Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    len(n for n in nouns if hasattr('hyponyms', n) and not n.hypohyms())
TypeError: object of type 'generator' has no len()


Tim McNamara

Aug 3, 2012, 9:02:48 PM
to nltk-...@googlegroups.com

Oh that's right. Do this instead:

sum(1 for n in nouns ...)

That will give an equivalent value to len().
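
As a rough sketch of that idea, here is the counting pattern with a hypothetical generator, hyponym_lists(), standing in for the hyponym lists of the synsets that wn.all_synsets('n') would yield. The data is invented; only the sum(1 for ...) pattern is the point.

```python
def hyponym_lists():
    """Hypothetical stand-in: yields one hyponym list per synset, one at a time."""
    yield ["thing", "abstraction"]
    yield []
    yield ["measure"]
    yield []

# Count the empty lists without building an intermediate list.
leaf_count = sum(1 for hyponyms in hyponym_lists() if not hyponyms)
print(leaf_count)  # 2

# A generator has no len(), so the total needs its own pass over a fresh generator.
total = sum(1 for _ in hyponym_lists())
print(leaf_count / total)  # 0.5 on Python 3
```

With the real data, the same pattern would presumably read sum(1 for s in wn.all_synsets('n') if not s.hyponyms()), calling wn.all_synsets('n') again for each pass.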

John H. Li

Aug 4, 2012, 5:32:53 AM
to nltk-...@googlegroups.com
>>> sum(1 for n in nouns if hasattr('hyponyms',n)and not n.hyponyms())

Traceback (most recent call last):
  File "<pyshell#6>", line 1, in <module>
    sum(1 for n in nouns if hasattr('hyponyms',n)and not n.hyponyms())
  File "<pyshell#6>", line 1, in <genexpr>
    sum(1 for n in nouns if hasattr('hyponyms',n)and not n.hyponyms())
TypeError: hasattr(): attribute name must be string
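
The TypeError above points at the argument order: hasattr() takes the object first and the attribute name, as a string, second. A minimal sketch, using a hypothetical FakeSynset class in place of real WordNet synsets:

```python
class FakeSynset(object):
    """Hypothetical stand-in for a WordNet synset."""
    def __init__(self, hyponym_names):
        self._hyponym_names = hyponym_names

    def hyponyms(self):
        return self._hyponym_names

nouns = [FakeSynset(["thing"]), FakeSynset([]), FakeSynset([])]

# hasattr(object, name): the object comes first, the attribute name second.
count = sum(1 for n in nouns if hasattr(n, 'hyponyms') and not n.hyponyms())
print(count)  # 2

# Swapping the arguments reproduces the TypeError from the traceback.
try:
    hasattr('hyponyms', nouns[0])
    swapped_ok = True
except TypeError:
    swapped_ok = False
print(swapped_ok)  # False
```

Since every synset returned by wn.all_synsets('n') has a hyponyms() method, the hasattr() check is probably unnecessary in the real code.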

daoud mohd

Aug 8, 2012, 7:34:28 AM
to nltk-...@googlegroups.com
Hello,
Can we access the Hirst and St-Onge API in NLTK?
Can we access WordNet online from NLTK via http://localhost:port/?