pull request: write feeds with ordered attributes

74 views
Skip to first unread message

Georg Sauthoff

unread,
Feb 11, 2017, 9:55:24 AM2/11/17
to django-d...@googlegroups.com
(please CC me)

Hello,

I recently created this small pull request:

write feeds with ordered attributes (#8044)
https://github.com/django/django/pull/8044

Tim Graham suggested that I also write to the mailing list to
reach a consensus.

Thus, please have a look.

Best regards
Georg
--
'The type of mkM looks somewhat scary, but it simply explains all the
type-representation constraints that are needed for type-safe cast.'
(R. Laemmel, S. P. Jones. Scrap your boilerplate: a practical approach to
generic programming. Proc. TLDI, 2003)

Adam Johnson

unread,
Feb 11, 2017, 3:15:50 PM2/11/17
to django-d...@googlegroups.com
I can see the advantage from an operational perspective with files matching byte-for-byte. I know many API's do the same with sorting the keys in their JSON output for the same reason.

I should think the performance impact isn't too great, but would be nice to see some benchmarking to prove it's not disastrous.

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/20170211113602.GE2638%40dell12.lru.li.
For more options, visit https://groups.google.com/d/optout.



--
Adam

Georg Sauthoff

unread,
Feb 12, 2017, 11:27:29 AM2/12/17
to django-d...@googlegroups.com
On Sat, Feb 11, 2017 at 08:15:08PM +0000, Adam Johnson wrote:
> I can see the advantage from an operational perspective with files matching
> byte-for-byte. I know many API's do the same with sorting the keys in their
> JSON output for the same reason.

> I should think the performance impact isn't too great, but would be nice to
> see some benchmarking to prove it's not disastrous.

I did a small benchmark, and as expected, the impact is really small.

Without the pull-request, generating an Atom feed with 100 medium sized
entries takes ~ 0.01 s on consumer hardware. The sorting and
creation of OrderedDicts increases the runtime by ~ 27 %.

Note that this benchmark generates the same feed several times in a row
because the absolute runtime is so small. See
https://gist.github.com/gsauthof/e2787adc7e9a46811dcb6aff2054110c for
details.

It pays off to only sort+create OrderedDict if there are attributes,
e.g.:

def startElement(self, name, attrs):
xs = collections.OrderedDict(sorted(attrs.items())) if attrs else attrs
super().startElement(name, xs)

That version yields an ~ 18 % runtime increase.

We can get down to ~ 7 % runtime increase if we - in feedgenerator.py - replace
the attribute dictionaries with a simple wrapper around a list that provides
the items() method, e.g.:

class Atts:
def __init__(self, items):
self.xs = items
def items(self):
return self.xs

And then add the elements with attributes like this:

handler.addQuickElement("atom:link", None,
Atts([("rel", "self"), ("href", self.feed['feed_url'])]))

But IMHO, this would be a premature optimization that just
obfuscates the feedgenerator code.

Even the conditional sorting is hardly necessary outside of
microbenchmarking scenarios.

Best regards
Georg
--
'One must not put a loaded rifle on the stage if no one is
thinking of firing it.' ( Anton Chekhov, 1889 )

Adam Johnson

unread,
Feb 12, 2017, 5:02:01 PM2/12/17
to django-d...@googlegroups.com
Ok, looks like the second version (conditionally sorting) is the most sensible to go with.

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscribe@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.

For more options, visit https://groups.google.com/d/optout.



--
Adam

Georg Sauthoff

unread,
Feb 13, 2017, 5:27:51 AM2/13/17
to django-d...@googlegroups.com
On Sun, Feb 12, 2017 at 10:01:28PM +0000, Adam Johnson wrote:
> Ok, looks like the second version (conditionally sorting) is the most
> sensible to go with.

Ok, I've rebased the pull request such that it includes the conditional
sorting:

https://github.com/django/django/pull/8044

Best regards
Georg
Reply all
Reply to author
Forward
0 new messages