[nltk-issues] Issue 550 in nltk: Unable to pickle nltk.model.ngram.NGramModel

nl...@googlecode.com

May 10, 2010, 8:44:43 AM
to nltk-...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 550 by SimonJGr...@gmail.com: Unable to pickle
nltk.model.ngram.NGramModel
http://code.google.com/p/nltk/issues/detail?id=550

What version of NLTK are you using? 2.0b8


What steps will reproduce the problem?

>>> import pickle
>>> import nltk
>>> n = nltk.model.ngram.NgramModel(3, ['a', 'b', 'c'])
>>> print pickle.dumps(n)

What is the expected output? What do you see instead?

It would be very useful to be able to pickle an NgramModel (e.g. when it has
been trained on a lot of data). Unfortunately, the lambda function used in
nltk.model.ngram makes pickling impossible:

pickle.PicklingError: Can't pickle <function <lambda> at 0x249ca30>: it's not found as nltk.model.ngram.<lambda>

The attached patch refactors the lambda into a named module-level function so
that the NgramModel object can be pickled.

Attachments:
ngram_pickle.diff 842 bytes

nl...@googlecode.com

May 10, 2010, 8:56:42 AM
to nltk-...@googlegroups.com
Updates:
Status: Fixed

Comment #1 on issue 550 by StevenBird1: Unable to pickle
nltk.model.ngram.NGramModel
http://code.google.com/p/nltk/issues/detail?id=550

This issue was closed by revision r8548.

nl...@googlecode.com

May 10, 2010, 9:00:51 AM
to nltk-...@googlegroups.com

Comment #2 on issue 550 by StevenBird1: Unable to pickle
nltk.model.ngram.NGramModel
http://code.google.com/p/nltk/issues/detail?id=550

Thanks for the fix.

nl...@googlecode.com

Aug 8, 2010, 9:51:27 AM
to nltk-...@googlegroups.com

Comment #3 on issue 550 by cooldwind: Unable to pickle
nltk.model.ngram.NGramModel
http://code.google.com/p/nltk/issues/detail?id=550

The problem persists when using a custom estimator. For example:


import pickle
import nltk
from nltk.probability import LidstoneProbDist

def estimator(fdist, bins):
    return LidstoneProbDist(fdist, 0.2)

lm = nltk.model.ngram.NgramModel(2, 'test', estimator)
s = pickle.dumps(lm)


Results in:

/usr/lib/python2.6/pickle.pyc in dumps(obj, protocol)
1364 def dumps(obj, protocol=None):
1365 file = StringIO()
-> 1366 Pickler(file, protocol).dump(obj)
1367 return file.getvalue()
1368

/usr/lib/python2.6/pickle.pyc in dump(self, obj)
222 if self.proto >= 2:
223 self.write(PROTO + chr(self.proto))
--> 224 self.save(obj)
225 self.write(STOP)
226

/usr/lib/python2.6/pickle.pyc in save(self, obj)
329
330 # Save the reduce() output and finally memoize the object

--> 331 self.save_reduce(obj=obj, *rv)
332
333 def persistent_id(self, obj):

/usr/lib/python2.6/pickle.pyc in save_reduce(self, func, args, state,
listitems, dictitems, obj)
417
418 if state is not None:
--> 419 save(state)
420 write(BUILD)
421

/usr/lib/python2.6/pickle.pyc in save(self, obj)
284 f = self.dispatch.get(t)
285 if f:
--> 286 f(self, obj) # Call unbound method with explicit self
287 return
288

/usr/lib/python2.6/pickle.pyc in save_dict(self, obj)
647
648 self.memoize(obj)
--> 649 self._batch_setitems(obj.iteritems())
650
651 dispatch[DictionaryType] = save_dict

/usr/lib/python2.6/pickle.pyc in _batch_setitems(self, items)
661 for k, v in items:
662 save(k)
--> 663 save(v)
664 write(SETITEM)
665 return

/usr/lib/python2.6/pickle.pyc in save(self, obj)
329
330 # Save the reduce() output and finally memoize the object

--> 331 self.save_reduce(obj=obj, *rv)
332
333 def persistent_id(self, obj):

/usr/lib/python2.6/pickle.pyc in save_reduce(self, func, args, state,
listitems, dictitems, obj)
417
418 if state is not None:
--> 419 save(state)
420 write(BUILD)
421

/usr/lib/python2.6/pickle.pyc in save(self, obj)
284 f = self.dispatch.get(t)
285 if f:
--> 286 f(self, obj) # Call unbound method with explicit self
287 return
288

/usr/lib/python2.6/pickle.pyc in save_dict(self, obj)
647
648 self.memoize(obj)
--> 649 self._batch_setitems(obj.iteritems())
650
651 dispatch[DictionaryType] = save_dict

/usr/lib/python2.6/pickle.pyc in _batch_setitems(self, items)
661 for k, v in items:
662 save(k)
--> 663 save(v)
664 write(SETITEM)
665 return

/usr/lib/python2.6/pickle.pyc in save(self, obj)
284 f = self.dispatch.get(t)
285 if f:
--> 286 f(self, obj) # Call unbound method with explicit self
287 return
288

/usr/lib/python2.6/pickle.pyc in save_global(self, obj, name, pack)
746 raise PicklingError(
747 "Can't pickle %r: it's not found as %s.%s" %
--> 748 (obj, module, name))
749 else:
750 if klass is not obj:

PicklingError: Can't pickle <function estimator at 0xa191c08>: it's not found as __main__.estimator
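
One possible workaround, sketched here on a hypothetical stand-in class rather than on NgramModel itself, is to leave the callable out of the pickled state and reattach it after loading. TinyModel and the simplified estimator below are assumptions for illustration. Whether the same approach works for NgramModel depends on which of its internal objects hold the estimator, which this thread does not show.

```python
import pickle


def estimator(fdist, bins):
    # Simplified stand-in for LidstoneProbDist(fdist, 0.2); not NLTK code.
    return ("lidstone", 0.2, bins)


class TinyModel(object):
    """Hypothetical model that stores an estimator callable."""

    def __init__(self, n, estimator):
        self.n = n
        self.estimator = estimator

    def __getstate__(self):
        # Copy the instance state and drop the callable, so pickle never
        # has to look the function up by module and name.
        state = self.__dict__.copy()
        state["estimator"] = None
        return state


# Even an unpicklable lambda is fine here, because it is excluded from the state.
payload = pickle.dumps(TinyModel(2, lambda fdist, bins: None))

restored = pickle.loads(payload)
estimator_after_load = restored.estimator  # None: the callable was not saved

# Reattach an estimator explicitly after loading.
restored.estimator = estimator
```

The trade-off is that the loader must know which estimator to reattach. Moving the estimator into an importable module avoids that, since pickle can then find it by name.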

nl...@googlecode.com

Apr 6, 2013, 9:46:36 AM
to nltk-...@googlegroups.com

Comment #4 on issue 550 by MOLiSoft...@gmail.com: Unable to pickle
nltk.model.ngram.NGramModel
http://code.google.com/p/nltk/issues/detail?id=550

I have the same issue.
How can I fix it? How do I use the ngram_pickle.diff file?
Thanks.
MOLi
