Developer help needed on resolving: TypeError: 'LazyCorpusLoader' object is not iterable [was Re: [nltk-users] Re: Code Questions on Chapter 6]

Richard Careaga

Apr 8, 2010, 10:00:42 PM

to nltk-...@googlegroups.com

Yup, that's what I get, too:

>>> featuresets = [(gender_features2(n), g) for (n,g) in names]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'LazyCorpusLoader' object is not iterable

In looking at the error message:

TypeError: 'LazyCorpusLoader' object is not iterable

I look down at the back of my hand, where "Everything in Python is an object" is tattooed.

So where does the LazyCorpusLoader object come from? I don't remember seeing it.

All objects currently in your namespace can be inspected:

>>> dir()
['_[1]', '__builtins__', '__doc__', '__name__', '__package__', 'gender_features2', 'names', 'nltk', 'pprint', 're']

Nope, not here.

But, and this one is only a yellow sticky on the monitor, "objects can contain other objects." So which one of these is it?

By its name, it comes from nltk, probably. If we fire up a new Python session, we get

>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
>>>

so we can eliminate those.

We didn't put it in gender_features2, so that leaves _[1], names, nltk, pprint and re.

We know re is a standard library module, so provisionally we can discard it, along with pprint.

_[1] is sort of mysterious and we know nltk is big, so let's check names first:

>>> type(names)
<class 'nltk.corpus.util.LazyCorpusLoader'>

Bingo.
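
The hunt above can also be done mechanically instead of by elimination. Here is a minimal sketch, assuming all you know is the class name from the error message. The helper find_by_type_name and the stand-in LazyCorpusLoader class are made up for illustration so the example runs without NLTK; they are not part of the library.

```python
import re


def find_by_type_name(namespace, type_name):
    """Return the sorted names in `namespace` whose value's class is called `type_name`."""
    return sorted(
        name for name, value in namespace.items()
        if type(value).__name__ == type_name
    )


# Stand-in class for illustration only; this is NOT NLTK's real loader.
class LazyCorpusLoader:
    pass


# A pretend session namespace, shaped like what dir() showed above.
session = {"names": LazyCorpusLoader(), "re": re, "count": 3}

print(find_by_type_name(session, "LazyCorpusLoader"))  # ['names']
```

In a live interpreter you would pass globals() instead of the pretend session dictionary.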

Now let's see what help we can get in finding out what this class can do (such as whether you can iterate over it):

>>> help(nltk.corpus.util.LazyCorpusLoader)

where we see:

| __getattr__(self, attr)
|
| __init__(self, name, reader_cls, *args, **kwargs)
|
| __repr__(self)

Hmm. Which attributes of this class might include something iterable?

>>> names.__getattr__
<bound method LazyCorpusLoader.__getattr__ of <WordListCorpusReader in '.../corpora/names' (not loaded yet)>>

Oh, well. Sure. Thanks a lot, another mystery!

We could take another tack: going into the nltk library and tracking this beast to its deepest lair. But at this point, having made a reasonable effort to help ourselves, it's best to throw up our hands in surrender and ask the experts.
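
One likely reason the help output is so unhelpful: a class that only defines __getattr__ can forward ordinary method calls to another object, but Python looks up iteration (__iter__) on the class itself and never consults __getattr__ for it. The sketch below shows that behaviour with two made-up classes, WordListReader and LazyLoader. They are hypothetical stand-ins and not NLTK's actual implementation.

```python
class WordListReader:
    """Hypothetical stand-in for a corpus reader; not NLTK's class."""

    def words(self, fileid):
        data = {"male.txt": ["Aaron", "Abbott"], "female.txt": ["Abagail"]}
        return data[fileid]


class LazyLoader:
    """Hypothetical lazy proxy: builds the real reader on first attribute access."""

    def __init__(self, reader_cls):
        self._reader_cls = reader_cls
        self._reader = None

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        if attr.startswith("_"):
            raise AttributeError(attr)
        if self._reader is None:
            self._reader = self._reader_cls()  # load on demand
        return getattr(self._reader, attr)


names = LazyLoader(WordListReader)

# iter() looks for __iter__ on the class and bypasses __getattr__.
try:
    iter(names)
    iterable = True
except TypeError as err:
    iterable = False
    message = str(err)

print(iterable)                 # False
print(names.words("male.txt"))  # ['Aaron', 'Abbott']
```

So the proxy itself cannot be looped over, yet the methods of the object behind it are reachable by name, which is why the thing to iterate over is the result of a reader method rather than the loader.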

Terry Shen wrote:

>>> from nltk.corpus import names
>>> import nltk, re, pprint
>>> def gender_features2(name):
...     features = {}
...     features["firstletter"] = name[0].lower()
...     features["lastletter"] = name[-1].lower()
...     for letter in 'abcdefghijklmnopqrstuvwxyz':
...         features["count(%s)" % letter] = name.lower().count(letter)
...         features["has(%s)" % letter] = (letter in name.lower())
...     return features

Steven Bird

Apr 10, 2010, 3:16:56 AM

to nltk-...@googlegroups.com

Thanks for thoroughly demonstrating this problem with our documentation. There needs to be a way to discover the corpus reader methods from a corpus loader object. Would someone like to submit this to the issue tracker?

In the meantime, to see how to use list-like corpora, please see:

http://nltk.googlecode.com/svn/trunk/doc/howto/corpus.html#word-lists-and-lexicons
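
As a rough illustration of the list-like usage, here is a minimal sketch that builds the (name, gender) pairs the failing list comprehension expected. It assumes the corpus object has a words(fileid) method and that the names corpus uses the file IDs male.txt and female.txt. The helper labeled_names and the FakeNames class are made up for this example, so that it runs without NLTK or its data installed.

```python
def labeled_names(corpus):
    """Build (name, gender) pairs from a word-list corpus.

    `corpus` is anything with a words(fileid) method, such as the object
    obtained from `from nltk.corpus import names` (assumed, not checked here).
    """
    pairs = [(n, "male") for n in corpus.words("male.txt")]
    pairs += [(n, "female") for n in corpus.words("female.txt")]
    return pairs


# Tiny fake corpus for illustration only; not part of NLTK.
class FakeNames:
    def words(self, fileid):
        return {"male.txt": ["Aaron", "Bob"], "female.txt": ["Abby"]}[fileid]


pairs = labeled_names(FakeNames())
print(pairs)  # [('Aaron', 'male'), ('Bob', 'male'), ('Abby', 'female')]
```

With the real corpus in place of FakeNames, the original line would become something like featuresets = [(gender_features2(n), g) for (n, g) in labeled_names(names)], which iterates over a plain list rather than the loader itself.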

