Moving forward with Serialization.

Tom Christie

Aug 28, 2012, 11:52:26 AM
to django-d...@googlegroups.com
Hi all,

During Piotr's GSoC project on customizable serialization, I've continued to work on my own implementation.
I've now got what feels to me to be a natural conclusion of the design, which is the 'forms' branch of the 'django-serializers' project.

There are still a few small bits and pieces that need filling in, but it's pretty much there,
and I believe the project essentially meets the requirements of a more customizable serialization API.

I won't go into the full documentation here (refer to the project itself if you're interested), but in summary it gives you:

* A declarative serialization API with Serializer/ModelSerializer classes that mirror the existing Form/ModelForm API.
* A backwards compatible FixtureSerializer class.
* Support for nested relationships, pk relationships, natural key relationships, and custom relationships such as hyperlinking.
* Validation for deserialization of objects, similar to Form validation.
* Explicitly decouples the structural concerns of serialization from the encoding/decoding concerns.
* Passing Django's existing fixture serialization tests.
* Fairly trivial to port into Django while keeping backwards compatibility with existing serialization API.
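
To make the first and fifth points above concrete, here is a minimal sketch in plain Python of a declarative, Form-like serializer whose structural step is separate from encoding. It is only an illustration, not the django-serializers API: `Field`, `SerializerMeta`, `Serializer`, `Comment` and `CommentSerializer` are all hypothetical names invented for this example.

```python
import json


class Field:
    """Hypothetical field: reads one attribute and returns a native value."""

    def __init__(self, source=None):
        self.source = source

    def to_native(self, obj, name):
        return getattr(obj, self.source or name)


class SerializerMeta(type):
    """Collects Field attributes declared on the class body, Form-style."""

    def __new__(mcs, name, bases, attrs):
        fields = {}
        for base in bases:
            fields.update(getattr(base, "base_fields", {}))
        fields.update({k: v for k, v in attrs.items() if isinstance(v, Field)})
        cls = super().__new__(mcs, name, bases, attrs)
        cls.base_fields = fields
        return cls


class Serializer(metaclass=SerializerMeta):
    def serialize(self, obj):
        # Structural step only: object -> native Python datatypes.
        return {n: f.to_native(obj, n) for n, f in self.base_fields.items()}


class Comment:
    def __init__(self, email, content):
        self.email = email
        self.content = content


class CommentSerializer(Serializer):
    email = Field()
    body = Field(source="content")


data = CommentSerializer().serialize(Comment("tom@example.com", "Hello"))
# Encoding is a separate step, so any renderer could be used here.
encoded = json.dumps(data)
```

The serializer never touches JSON itself; swapping `json.dumps` for another encoder would leave `CommentSerializer` unchanged.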

What I particularly like about the approach is how it mirrors the existing Forms API, and I think it'd go a long way to making Django a more useable framework for authors of Web APIs.  As an example, here's what it looks like when using django-serializers to write API views.

At this point what I'm looking for is some feedback:

* Would it be worth me trying to push towards getting this into core?
* Are there any fundamental issues with the API, or use cases it clearly doesn't address?
* Any suggested next steps, or things that would need to be addressed/clarified before it could be considered for core?

It's probably also worth noting that I'm now planning on using the project inside my personal project, django-rest-framework, in case that's relevant to the discussion.

Thoughts?

  Tom

Dirley

Aug 28, 2012, 2:31:04 PM
to django-d...@googlegroups.com
Do you know Colander?



schinckel

Aug 29, 2012, 1:27:35 AM
to django-d...@googlegroups.com
Hi Tom,

I've knocked around ideas of my own (internally, and on #django) related to serialisation: it's something I've had to think about a lot, since until recently most of my Django work was in JSON APIs.

I personally think that Forms are already the place that should handle (de)serialisation. They already serialise to HTML: why should they not be able to serialise to other stream types?

This is the approach I've started to use for my API generation code. They already have a declarative nature, and then you get all of the form validation on incoming data: a big improvement over how I've done it in the past.

(I've done some work on a form-based API generation framework: https://bitbucket.org/schinckel/django-repose. Whilst this is in use, it's still not really feature complete).

Matt.

schinckel

Aug 29, 2012, 1:28:26 AM
to django-d...@googlegroups.com
Not sure that I phrased the last bit right: I think my repose framework is a step in the right direction, but even so I think it probably doesn't do things the right way.

Matt.

Tom Christie

Aug 31, 2012, 4:25:02 AM
to django-d...@googlegroups.com
> Do you know Colander?

I do now.  Thanks, that's very similar, and looks well done.


> I personally think that Forms are already the place that should handle (de)serialisation. They already serialise to HTML: why should they not be able to serialise to other stream types?

Conceptually I agree.  As it happens django-serializers is perfectly capable of rendering into HTML forms, I just haven't yet gotten around to writing a form renderer, since it was out of scope for the fixture serialization functionality.

Pragmatically, I'm not convinced it'd work very well.  The existing Forms implementation is tightly coupled to form-data input and HTML output, and I think trying to address that without breaking backwards compatibility would be rather difficult.  It's maybe easy enough to do for flat representations, and pk relationships, but extending it to deal with nested representations, being able to use a Form as a field on another Form, and representing custom relationships would all take some serious hacking.  My personal opinion is that whatever benefits you'd gain in DRYness, you'd lose in code complexity.  Having said that, if someone was able to hack together a Forms-based fixture serialization/deserialization implementation that passes the Django test suite, and didn't look too kludgy, I'd be perfectly willing to revise my opinion. 

There are also some subtle differences between serializer fields and Django's existing form fields.  Because form fields only handle form input, incoming fields can never be null, only blank or not blank.  With other representations such as JSON, that's not the case, so for serializer fields the blank=True/False null=True/False style is appropriate, whereas for form fields the required=True/False style is appropriate.
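
A small standard-library illustration of that difference (this is not django-serializers code, and the variable names are made up for the example): form-encoded input can only express a blank string, whereas JSON can carry an explicit null.

```python
import json
from urllib.parse import parse_qs

# Form-encoded input: a submitted field is always a string, possibly blank.
form_data = parse_qs("name=&age=30", keep_blank_values=True)

# JSON input: a key can additionally carry an explicit null.
json_data = json.loads('{"name": null, "age": 30}')

form_name = form_data["name"][0]  # blank string
json_name = json_data["name"]     # None
```

So a single required=True/False flag is enough to describe the form case, while the JSON case needs to distinguish "may be blank" from "may be null".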

I'm also wary of getting bogged down in high level 'wouldn't it be nice if...' conversations.  With just a little bit of work, the django-serializers implementation could be turned into a pull request that'd replace the existing fixture serialization with something much more useful and flexible.  What I'm really looking for is some feedback on whether it'd be worth my time.

Regards,

  Tom

Piotr Grabowski

Sep 1, 2012, 7:41:55 AM
to django-d...@googlegroups.com
On 31.08.2012 10:25, Tom Christie wrote:
> > I personally think that Forms are already the place that should
> handle (de)serialisation. They already serialise to HTML: why should
> they not be able to serialise to other stream types?
>
> Conceptually I agree. As it happens django-serializers is perfectly
> capable of rendering into HTML forms, I just haven't yet gotten around
> to writing a form renderer, since it was out-of-scope of the fixture
> serialization functionality.
>
> Pragmatically, I'm not convinced it'd work very well. The existing
> Forms implementation is tightly coupled to form-data input and HTML
> output, and I think trying to address that without breaking
> backwards compatibility would be rather difficult. It's maybe easy
> enough to do for flat representations, and pk relationships, but
> extending it to deal with nested representations, being able to use a
> Form as a field on another Form, and representing custom relationships
> would all take some serious hacking. My personal opinion is that
> whatever benefits you'd gain in DRYness, you'd lose in code
> complexity. Having said that, if someone was able to hack together a
> Forms-based fixture serialization/deserialization implementation that
> passes the Django test suite, and didn't look too kludgy, I'd be
> perfectly willing to revise my opinion.

I am not quite sure, but I think Forms should be built on top of a
serialization API, not the other way around. Forms are a more specific
kind of model serialization: they are models serialized to HTML (a
specific format) with some validation (specific actions) when
deserializing.


I like Tom's django-serializers, but there are some things that I want
to mention:

* The process of serialization is split into two parts: transformation
to Python native datatypes (serializer), and then to a specific text
format (renderer). But when serializing, the Field is also saved with
the data, so it's not so clean. I also had issues with this, but I
resolved them in a different way (not to say better :)

* In the master branch the Serializer is closely tied to the Renderer,
so a different Renderer class needs a new Serializer. In the forms
branch this is done in the __init__ serialize method, and it would have
to be rewritten for backward compatibility if django-serializers goes
into core. I want to propose my solution [1]:
For each format there is a Serializer class which is made from a
NativeSerializer (from models to Python native datatypes) and a
FormatSerializer (Renderer):

    class Serializer(object):
        # class for native python serialization/deserialization
        SerializerClass = NativeSerializer
        # class for specific format serialization/deserialization
        RendererClass = FormatSerializer

        def serialize(self, queryset, **options):
            ...  # body omitted in the original message

        def deserialize(self, stream_or_string, **options):
            ...  # body omitted in the original message

    Deserializer = Serializer

This is fully backward compatible, and the user can do:

    serializers.serialize('registered_format', objects,
                          serializer=MyNativeSerializer)

This will make a new Serializer class with SerializerClass ==
MyNativeSerializer. In this solution NativeSerializer and
FormatSerializer are more independent. Each NativeSerializer can be
rendered by each FormatSerializer, but it's not so simple: the
FormatSerializer provides the NativeSerializer with some context, so
you could say that the NativeSerializer knows which format will be
serialized. It's not exactly the format, but some metadata about it.
I am not proud of this :/

* IMO there is a bug related to XML. All model fields must be
transformed to text before XML serialization. In the current Django
serialization framework the field's value_to_string method is
responsible for this. In django-serializers this method is not always
called, so it can lead to errors with custom model fields.
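
To illustrate why the conversion to text matters, here is a standalone sketch using only Python's xml.etree.ElementTree. It does not reproduce the django-serializers behaviour: `Money`, `render_field`, and this module-level `value_to_string` function are hypothetical stand-ins for a custom field value and for the model field hook.

```python
import xml.etree.ElementTree as ET


class Money:
    """Hypothetical custom field value that is not a string."""

    def __init__(self, amount, currency):
        self.amount, self.currency = amount, currency


def value_to_string(value):
    # Stand-in for a model field's value_to_string hook.
    if isinstance(value, Money):
        return "%s %s" % (value.amount, value.currency)
    return str(value)


def render_field(name, value, convert=True):
    element = ET.Element("field", {"name": name})
    # Element text must be a string by the time the tree is written out.
    element.text = value_to_string(value) if convert else value
    return ET.tostring(element, encoding="unicode")


xml_ok = render_field("price", Money(10, "EUR"))

try:
    render_field("price", Money(10, "EUR"), convert=False)
    raw_failed = False
except TypeError:
    # ElementTree refuses to serialize non-string text.
    raw_failed = True
```

With the conversion step the field renders as text; without it the XML writer raises, which is the kind of error a custom model field could trigger.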

[1]
https://github.com/grapo/django/tree/soc2012-serialization/django/core/serializers

--
Piotr Grabowski

Tom Christie

Sep 6, 2012, 8:02:35 AM
to django-d...@googlegroups.com
Thanks Piotr,


> But when serializing also Field is saved with data so it's not so clean.

I'm not sure whether it's an issue or not.  As with Django's form fields, the fields are copied off of base_fields, so saving state isn't necessarily a problem.  I'd agree that it's possibly not the cleanest part of the API, but it might be good enough.

> For each format there is Serializer class which is made from NativeSerializer ... and FormatSerializer

That would work.
I was planning on taking a slightly different approach in order to merge the work into Django.
In my opinion the `SERIALIZATION_MODULES` setting wouldn't make much sense once serialization and parsing/rendering are decoupled.  Furthermore it places some very specific, slightly odd, and undocumented constraints on the interface that serialization modules must provide.  E.g., they must include a class named 'Serializer' and a class named 'Deserializer'.

My preference would be to put that setting on the road to deprecation, replacing it with something more appropriate, e.g.:

SERIALIZER_CLASS='django.core.serializers.FixtureSerializer'
RENDERER_CLASSES=(...)
PARSER_CLASSES=(...)

It'd be possible to do that in a way that'd be consistent with Django's deprecation policy, if the initial release continued to check if SERIALIZATION_MODULES is set, and use that in preference to the new style settings.
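
Roughly, that fallback could look like the sketch below. This is only an assumption about how it might be wired up: a plain dict stands in for Django's settings object, and `resolve_serialization_config` is a hypothetical helper, not an existing Django function.

```python
import warnings


def resolve_serialization_config(settings):
    """Prefer the old-style setting when it is present (hypothetical helper)."""
    if "SERIALIZATION_MODULES" in settings:
        warnings.warn(
            "SERIALIZATION_MODULES is deprecated; use the new-style settings.",
            PendingDeprecationWarning,
        )
        return {"style": "legacy", "modules": settings["SERIALIZATION_MODULES"]}
    # New-style settings, with the proposed default serializer class.
    return {
        "style": "new",
        "serializer": settings.get(
            "SERIALIZER_CLASS", "django.core.serializers.FixtureSerializer"
        ),
        "renderers": settings.get("RENDERER_CLASSES", ()),
        "parsers": settings.get("PARSER_CLASSES", ()),
    }
```

Existing projects that set SERIALIZATION_MODULES would keep their behaviour and see a warning, while everyone else gets the new-style defaults.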

The `serialize`, `deserialize` and `get_serializer` functions would continue to exist, as light backward-compatible wrappers around the core of the new-style serialization.
 
Having said that I could probably live with either approach.

I guess it would probably make sense for me to pull django-serializers into a fork of django at some point, to better demonstrate what I'm suggesting.

> * IMO there is bug related to xml. All model fields must be transform to 
> text before xml serialization. In current django serialization framework 
> field's method value_to_string is responsible for this.

Thanks.  I've not looked into that yet.  It sounds feasible, though I think it'd have to only be affecting custom model fields in XML, and should be easy enough to fix.  If it's easy for you to provide an example case where django-serializers' output differs from the existing fixtures it'd be much appreciated.

Regards,

  Tom