Serializing/validating an unknown JSON structure


Kane

Jan 14, 2015, 11:04:31 PM
to django-res...@googlegroups.com
My goal is to use Django REST framework to:
  • have a default JSON serializer which returns everything in the JSON (regardless of its structure), provided it's valid.
  • optionally specify validators for certain fields, which may include "key X exists in the JSON, and it is a list/dictionary" (which I also can't seem to figure out how to do without making a serializer for the property of key X).
Motivation: we use dictionaries such as {"a": "blah1", "b": "blah2"} instead of lists of dictionaries such as [{"name": "a", "value": "blah1"}, {"name": "b", "value": "blah2"}]. The simple reason is that internally it's a lot easier for us to look up values: in the first case I find the value for e.g. key "a" directly, whereas in the second I have to loop through the list until I find the dictionary with name == "a", and then get its "value" field. It's not too bad in this example, but with many keys/items, nested at multiple levels, using lists becomes unreasonable. (And no, none of these come from Django models.) That said, it doesn't seem to be compatible with Django REST framework, mainly because I don't know the keys beforehand so I can't declare them as fields (though I do know the structure that the values take). It also appears I can't use list serializers (i.e. many=True), as dictionaries aren't lists.
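
The lookup difference described in the motivation can be illustrated with a small sketch. The helper name `find_value` is hypothetical and exists only for this illustration.

```python
# Keyed form: the name is the dictionary key.
by_key = {"a": "blah1", "b": "blah2"}

# List form: the name is stored inside each item.
as_list = [
    {"name": "a", "value": "blah1"},
    {"name": "b", "value": "blah2"},
]


def find_value(items, name):
    """Scan a list of {"name", "value"} dictionaries for a matching name."""
    for item in items:
        if item["name"] == name:
            return item["value"]
    raise KeyError(name)


# The keyed form is a single direct lookup.
value_from_dict = by_key["a"]

# The list form needs a scan over the items.
value_from_list = find_value(as_list, "a")
```

With several nested levels, the list form needs one such scan per level, while the keyed form stays a chain of direct lookups.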

Possible approaches: From what I've read/tried, I see three avenues:

  1. Dynamically modify fields <http://www.django-rest-framework.org/api-guide/serializers/#dynamically-modifying-fields>: I've spent a long time on this, reading the dictionary to get all the keys and adding them all with the fields=() option. I thought I had it working, but it turned out it wasn't (it returned null for all keys as soon as one key's value wasn't serialized). I'd prefer to avoid this approach as it seems somewhat complex.
  2. This question <https://groups.google.com/forum/#!searchin/django-rest-framework/serialize$20dictionary/django-rest-framework/f0Qftb1EdWw/vJpiT6zkIvYJ> about changing the list method. I prefer this method, but I wasn't able to get an example working.
  3. Construct something myself from the ground up. I may have to, but I imagine someone must have come across this same issue before ...
Of course, I may have missed something obvious -- in which case my pride may be a little hurt, but I'll be darn glad to have an answer!

So, could someone please enlighten me on the best way to solve this issue, or give me a nudge in the right direction?

Thanks,

Kane

Tom Christie

Jan 15, 2015, 4:46:48 AM
to django-res...@googlegroups.com
Hi Kane,

  I've not fully understood the question, but if the default serializers aren't meeting your use case, you might consider dropping down to using `BaseSerializer`, which is a simple placeholder that allows you to use your own custom validation and output representation style, but still have everything work seamlessly with the generic views.


Also don't be afraid to simply override the view behavior explicitly. I think a lot of folks rely on the default generic views thinking that that's somehow the "correct" thing to do - in fact we find that explicitly written view code is typically more obvious and maintainable, and that the generic views are simply a useful shortcut to use occasionally. If you start writing the view code explicitly you may find that any issues with "how do I fit this custom style into the framework" mostly disappear - you no longer need to use the serializers at all if they're not a good fit, and can simply return the response data directly, e.g. `return Response({'whatever': 'something'})`

Hope that at least gives you something to consider.

All the best,

  Tom

Kane O'Donnell

Jan 15, 2015, 1:58:39 PM
to django-res...@googlegroups.com
Thanks Tom,

I'll quickly give an example. Say I have data which I wish to validate which is a dictionary where the keys are the names of people, and the value is another dictionary with their attributes, i.e.

{"kane":  {"gender": "male", ....}, "tom": {"gender": "male", ...} }

I know that each of the attribute dictionaries (i.e. {"gender": ...}) must take a specific form, but I don't know what the keys for my main dictionary will be (i.e. "kane" and "tom" in this case). This means I can't declare them explicitly as fields beforehand, since I don't know what names I'll be getting. Does that make more sense?
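
Independent of REST framework, the requirement can be sketched in plain Python: the outer keys are arbitrary, while every value is checked against one known shape. The function names are hypothetical, and only the "gender" attribute from the example is checked, since the other attributes are not given.

```python
def validate_person(attrs):
    """Return a list of error messages for one attribute dictionary."""
    if not isinstance(attrs, dict):
        return ["Expected a dictionary of attributes."]
    errors = []
    if not isinstance(attrs.get("gender"), str):
        errors.append("'gender' is required and must be a string.")
    return errors


def validate_people(data):
    """Validate a dictionary whose keys (the names) are not known in advance."""
    if not isinstance(data, dict):
        return {"non_field_errors": ["Expected a dictionary."]}
    errors = {}
    for name, attrs in data.items():
        problems = validate_person(attrs)
        if problems:
            # Errors are reported under the same key as the offending value.
            errors[name] = problems
    return errors


people = {"kane": {"gender": "male"}, "tom": {"gender": "male"}}
broken = {"kane": {"gender": "male"}, "tom": ["not", "a", "dict"]}

people_errors = validate_people(people)  # every value has the expected shape
broken_errors = validate_people(broken)  # only the "tom" entry is reported
```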

As you suggest, it may be easier to write something custom. Unfortunately, we're all just learning Django (moving from Flask, where we either validated within the views, or used WTForms), so we're not quite as confident in what the best approach should be. However, one lesson we did learn is that separating the logic for form validation helped us keep things tidy and standard, so we would like to develop some similar approach, and serializers seem to be it. Also, given we're not using any Django models at the moment (we're essentially writing an app that stores everything in a JSON on disk), we're probably not likely to use some of the standard approaches -- I've already written a base view class for easily dealing with getting/putting/etc aspects of this JSON.

Finally, the link you sent was quite helpful: there's an example class there for handling arbitrary data -- hopefully this will work, and I'm surprised I missed it! I'll have a look to see if we can use this as a base class for essentially returning .init_data by default, and then build optional validators on top of it. Does that sound like a good idea to you?

Thanks,

Kane



--
You received this message because you are subscribed to a topic in the Google Groups "Django REST framework" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/django-rest-framework/zCyj4S8VJMc/unsubscribe.
To unsubscribe from this group and all its topics, send an email to django-rest-fram...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jani Tiainen

Jan 16, 2015, 5:32:18 AM
to django-res...@googlegroups.com
Hi,

Your data (JSON) structure is not ideal for automated handling;
in particular, you should try to conform to key-value pairs where the
key is known in advance, so your data should look something like:

[ {"person": "kane", "data": {"gender": "male", ...}},
  {"person": "tom", "data": {...}}, ... ]

(or you could put the name within the other data and just have:

[{"name": "kane", "gender": ...}, {"name": "tom", "gender": ...}])

Using a value as a key is a bit problematic; it's not impossible to do,
but as Tom said, you'll end up writing a custom
(de)serialiser.


Tom Christie

Jan 16, 2015, 5:41:13 AM
to django-res...@googlegroups.com
> I'm surprised I missed it! I'll have a look to see if we can use this as a base class for essentially returning .init_data by default, and then build optional validators on top of it.

Perfectly sensible, yes. We are actually planning on including a DictField which would allow for the style of validation you happen to be looking for, but it's not implemented yet. Until then a `BaseSerializer` subclass would be reasonable.

Kane O'Donnell

Jan 17, 2015, 12:36:19 AM
to django-res...@googlegroups.com
Jani, if you read the first post, you'll see my explanation for not structuring our data as you suggest.

Tom, any ETA on the DictField? I believe I've got the following to work, but they're not perfect:
  • EverythingSerializer: as the name suggests, it returns the raw data. You can specify optional fields and serializers. Largely based off the serializers.Serializer class.
  • DictSerializer: I pass this a serializer, and it checks that the value of each key-value pair is serialized by it. So, in the original example, I'd use DictSerializer(value_serializer=PersonSerializer).
  • PrimitiveListSerializer: I just use this to check primitive lists e.g. "[1,2,3]", as I don't believe this is covered either.
If I remember rightly, I subclassed serializers.Serializer in all of them, instead of serializers.BaseSerializer, as that worked out of the box.

We'll see how easy they are to use.

Kane


