Breaking change in encoding iterator objects between jsonpickle-0.7 and jsonpickle-0.8?

Dalik

Apr 9, 2015, 4:54:11 PM
to jsonp...@googlegroups.com
Hello folks. Not sure if I should address the entire group or just the module maintainers with this question, but I'm new to this community, so please bear with me...

Just to establish the context: I'm trying to serialize a custom object representing a Histogram. It defines methods to access certain Histogram attributes (such as P50, P99, Average, etc.), but it also implements an iterator interface that produces a stream of Bucket objects representing all histogram buckets, which I generate on the fly from the Histogram's raw data. So in the end I can access the Histogram's attributes to get high-level stats, and I can iterate over the buckets if I need more detail for analysis, plotting, etc.

The issue I'm experiencing is a change in the Pickler::_flatten_obj_instance() method (observed in jsonpickle-0.8 and up). It now has a special case for iterator objects: if the target object implements the iterator protocol, the method iterates over it and serializes the resulting list instead of just serializing the object's dictionary, as jsonpickle-0.7 did.

I'm sure there must be a very good reason for doing this, and I wouldn't mind having some extra run-time data serialized along with the rest of my object. The problem is that once the iterator handling code has executed, _flatten_obj_instance() just returns, and this happens before the object's dictionary encoding code has a chance to run. So the rest of the object's data, including its attributes, just gets lost.
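
To make the control flow concrete, here is a minimal sketch of the behavior described above. It is not jsonpickle's actual code: `flatten_instance` and the two small classes are simplified, hypothetical stand-ins written for Python 3, and the only assumption is that the iterator branch returns before the `__dict__` branch runs.

```python
from collections.abc import Iterator


def flatten_instance(obj):
    """Hypothetical, simplified stand-in for the flattening step (not jsonpickle code)."""
    data = {"py/object": type(obj).__name__}

    # Iterator special case: consume the object and return immediately.
    if isinstance(obj, Iterator):
        data["py/iterator"] = list(obj)
        return data  # early return: the __dict__ branch below is never reached

    # Plain-object case: copy the instance attributes.
    data.update(vars(obj))
    return data


class Plain:
    def __init__(self):
        self.my_data = [1, 2, 3]


class Iterating(Plain):
    def __init__(self):
        super().__init__()
        self.current = -1

    def __iter__(self):
        return self

    def __next__(self):
        self.current += 1
        if self.current >= len(self.my_data):
            raise StopIteration
        return self.my_data[self.current]


plain_result = flatten_instance(Plain())
iter_result = flatten_instance(Iterating())
print(plain_result)  # contains "my_data"
print(iter_result)   # contains "py/iterator" but no "my_data"
```

Because both branches end the function, simply swapping their order would drop the iterator data instead of the attributes; keeping both would need the two results merged into one dictionary.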

At this point my question is whether anybody else has noticed this and what a workaround would be. For now I will try two things:

1). Move the iterator handling code down so that the object dictionary is serialized first (which may still need more work because, from what I can tell, both sections return as soon as they are executed).
2). Revert to the older jsonpickle-0.7 implementation to at least make my code work.
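
A third option, sketched here as an assumption rather than a confirmed fix: if the special case applies only to objects that are themselves iterators (that is, they define both `__iter__` and `next`/`__next__`), then having `__iter__` return a separate generator keeps the object a plain iterable. The `Histogram` and `Bucket` classes below are hypothetical stand-ins written for Python 3. The check uses `collections.abc.Iterator` from the standard library, which may or may not match the test jsonpickle performs.

```python
from collections.abc import Iterable, Iterator


class Bucket:
    """Hypothetical bucket: a value and its count."""
    def __init__(self, value, count):
        self.value = value
        self.count = count


class Histogram:
    """Hypothetical histogram that is iterable but is not itself an iterator."""
    def __init__(self, raw):
        self.raw = dict(raw)

    def __iter__(self):
        # A generator function: each call returns a fresh, independent iterator,
        # so the Histogram carries no iteration state and defines no __next__.
        for value, count in sorted(self.raw.items()):
            yield Bucket(value, count)


hist = Histogram({10: 3, 20: 7})
is_iterable = isinstance(hist, Iterable)
is_iterator = isinstance(hist, Iterator)
first_pass = [(b.value, b.count) for b in hist]
second_pass = [(b.value, b.count) for b in hist]
print(is_iterable, is_iterator, first_pass == second_pass)
```

A side benefit of this design is that the object can be iterated more than once, since no cursor is stored on the instance.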

I would appreciate any feedback. Most likely I'm misusing the library, and I hope there is a clear, intended way of handling my case.

Please advise.

Marcin Tustin

Apr 9, 2015, 4:58:22 PM
to jsonp...@googlegroups.com
This is concerning. Can you open an issue on GitHub with some test code demonstrating the issue?

The thing is that the pickle protocol specifies a hierarchy of methods, so it's possible that jsonpickle was previously using one that is now superseded by a higher-priority method.
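
As an illustration of that kind of precedence, using only the standard library (not jsonpickle, whose lookup order may differ): `pickle` obtains an object's reduction through `__reduce_ex__`, which defers to `__reduce__` when a class overrides it and otherwise falls back to `__getstate__`. The class names below are hypothetical, and the sketch targets Python 3.

```python
import pickle


class WithGetState:
    def __init__(self):
        self.data = [1, 2, 3]
        self.cache = "temporary"

    def __getstate__(self):
        # Lower-priority hook: chooses which attributes are saved.
        return {"data": self.data}


class WithReduce(WithGetState):
    def __reduce__(self):
        # Higher-priority hook: once this is overridden, the inherited
        # __getstate__ is no longer consulted by the default machinery.
        return (WithReduce, (), {"data": self.data, "saved_by": "__reduce__"})


# __reduce_ex__ is the entry point pickle calls; index 2 of the returned
# tuple is the state that would be stored for the object.
state_from_getstate = WithGetState().__reduce_ex__(pickle.HIGHEST_PROTOCOL)[2]
state_from_reduce = WithReduce().__reduce_ex__(pickle.HIGHEST_PROTOCOL)[2]
print(state_from_getstate)
print(state_from_reduce)
```

A serializer that starts honoring a higher-priority hook in a new release can therefore change its output for a class whose code has not changed at all.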

I can comment more concretely with code.
--
You received this message because you are subscribed to the Google Groups "jsonpickle" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jsonpickle+...@googlegroups.com.
To post to this group, send email to jsonp...@googlegroups.com.
Visit this group at http://groups.google.com/group/jsonpickle.
For more options, visit https://groups.google.com/d/optout.


--
Marcin Tustin
Tel: +44 (0) 7773 787 105 (UK)
       +1  917 553 3974 (US)


Dalik

Apr 10, 2015, 10:20:08 AM
to jsonp...@googlegroups.com
Thanks for the prompt feedback, Marcin. Let me put a simple code sample together and open an issue. As a quick fix I just commented out 4 lines of code in pickler.py, and the behavior reverted to normal dictionary serialization.

One question though: what exact use case(s) is the iterator handler code intended to cover? From what I can think of (at least when using jsonpickle for object serialization), it may do more harm than good, especially for iterators that generate a very long (possibly endless?) list of large objects.
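
The endless-iterator concern can be illustrated with the standard library alone. The sketch below is hypothetical (`take_bounded` and `max_items` are names invented here, not jsonpickle options) and shows one way a serializer could cap how much of an iterator it consumes, written for Python 3.

```python
from itertools import count, islice


def take_bounded(iterator, max_items):
    """Hypothetical helper: collect at most max_items and report truncation."""
    # Ask for one extra item to learn whether the iterator had more to give.
    items = list(islice(iterator, max_items + 1))
    truncated = len(items) > max_items
    return items[:max_items], truncated


finite_items, finite_truncated = take_bounded(iter([1, 2, 3]), max_items=5)
endless_items, endless_truncated = take_bounded(count(1), max_items=5)
print(finite_items, finite_truncated)
print(endless_items, endless_truncated)
```

Calling `list()` directly on `count(1)` would never finish, which is the failure mode an unbounded iterator branch runs into.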

Thanks!
Armen


Dalik

Apr 10, 2015, 10:38:22 AM
to jsonp...@googlegroups.com
I was trying to open an issue on GitHub, but apparently POST access is blocked from the location I was trying it from, so here is some simple code demonstrating the issue:

Consider the code below. An object of NonIteratorClass is represented correctly, while IteratorClass's representation is missing all of the object's attributes. Moreover, an instance of InfiniteIteratorClass is never encoded (MemoryError after a very long delay):

import jsonpickle

class Element( object ):
    """Iterator element"""
    def __init__( self, i ):
        self.i = i

class NonIteratorClass( object ):
    """Plain class with __dict__"""
    def __init__( self ):
        self.my_data = [ 1,2,3,4,5 ]
    def print_my_data( self ):
        print self.my_data

class IteratorClass( NonIteratorClass ):
    """Class with __dict__ implementing basic Iterator protocol
    and producing a limited number of elements
    """
    def __init__( self ):
        super( IteratorClass, self ).__init__()
        self.current = -1
    def __len__( self ):
        return len( self.my_data )
    def __iter__( self ):
        return self
    def next( self ):
        try:
            self.current += 1
            return Element( self.my_data[ self.current ] )
        except IndexError:
            raise StopIteration

class InfiniteIteratorClass( IteratorClass ):
    """Class with __dict__ implementing basic Iterator protocol
    and producing infinite stream of elements
    """
    def __init__( self ):
        super( InfiniteIteratorClass, self ).__init__()
    def next( self ):
        return Element( 1 )

if __name__ == "__main__":
    nio = NonIteratorClass()
    io = IteratorClass()
    iio = InfiniteIteratorClass()
    print "Encoding NonIteratorClass instance:"
    print jsonpickle.encode( nio )
    print
    print "Encoding IteratorClass instance ( where is io.my_data ??? ):"
    print jsonpickle.encode( io )
    print
    print "Encoding InfiniteIteratorClass instance ( jsonpickle.encode() never returns with MemoryError eventually ):"
    print jsonpickle.encode( iio )

The output of the code is provided below:

Encoding NonIteratorClass instance:
{"py/object": "__main__.NonIteratorClass", "my_data": [1, 2, 3, 4, 5]}

Encoding IteratorClass instance ( where is io.my_data ??? ):
{"py/object": "__main__.IteratorClass", "py/iterator": [{"py/object": "__main__.Element", "i": 1}, {"py/object": "__main__.Element", "i": 2}, {"py/object": "__main__.Element", "i": 3}, {"py/object": "__main__.Element", "i": 4}, {"py/object": "__main__.Element", "i": 5}]}

Encoding InfiniteIteratorClass instance ( jsonpickle.encode() never returns with MemoryError eventually ):
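
The loss can be confirmed by parsing the two encoded strings above with the standard `json` module. This check is only a sketch in Python 3; the JSON literals are copied from the output shown, and nothing here calls jsonpickle.

```python
import json

# Encoded strings copied from the output above.
non_iterator_json = (
    '{"py/object": "__main__.NonIteratorClass", "my_data": [1, 2, 3, 4, 5]}'
)
iterator_json = (
    '{"py/object": "__main__.IteratorClass", "py/iterator": ['
    '{"py/object": "__main__.Element", "i": 1}, '
    '{"py/object": "__main__.Element", "i": 2}, '
    '{"py/object": "__main__.Element", "i": 3}, '
    '{"py/object": "__main__.Element", "i": 4}, '
    '{"py/object": "__main__.Element", "i": 5}]}'
)

non_iterator_keys = sorted(json.loads(non_iterator_json))
iterator_keys = sorted(json.loads(iterator_json))
print(non_iterator_keys)
print(iterator_keys)

# Attributes set in __init__ (my_data, current) are absent from the second result.
missing = {"my_data", "current"} - set(iterator_keys)
print(sorted(missing))
```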