Platform Team Week of October 20th, 2014


Andy Parker

Oct 23, 2014, 2:53:51 PM
to puppe...@googlegroups.com
** Next PR Triage Wednesday, October 29th @ 10:00 am Pacific. **

Priorities
  1. Puppet 3.7.3
  2. CFacter on the march
  3. New puppet doc implementation
  4. Code removal for puppet 4
Commentary
Puppet 3.7.2 is released! A few tickets didn't make the boat (PUP-3302, PUP-3500, PUP-3505, maybe others), so expect a Puppet 3.7.3.

Peter and Michael are continuing on cfacter. Solaris support is still in the works.

Puppet strings 0.1.1 has been released! It fixes some typos, fixes a bug that occurred when the future parser was already enabled, and provides some examples of how to use the markup.

Code removals are plowing ahead. The master branch's lib directory is now at 98 KLOC whereas stable is at 101 KLOC.

I'm working on a plan to transition from YAML and PSON to JSON for serialization (network and on disk). There are several changes around encoding that will need to be handled in order to keep some functionality (file content with binary data, non-UTF-8 data in reports). I'm going to be sending out an email to puppet-dev soon.
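
To illustrate the encoding problem: JSON strings must be valid Unicode, so raw binary file content cannot be embedded as-is. One common workaround, shown here only as a sketch and not necessarily what the plan will propose, is to Base64-encode binary values and tag them. The helpers wrap_binary and unwrap_binary, the "__binary_base64" key, and the example path are all hypothetical names made up for this example.

```ruby
require 'json'

# Hypothetical helpers: wrap raw bytes so they survive a JSON round trip.
# The "__binary_base64" key is an illustrative convention, not a Puppet format.
def wrap_binary(bytes)
  # pack('m0') produces Base64 without line breaks
  { '__binary_base64' => [bytes].pack('m0') }
end

def unwrap_binary(hash)
  hash.fetch('__binary_base64').unpack('m0').first
end

# Bytes that are not valid UTF-8, such as the start of a PNG file.
content = "\x89PNG\r\n\x1A\n".b

doc = { 'path' => '/tmp/logo.png', 'content' => wrap_binary(content) }
json = JSON.generate(doc)

# Decode again and check that the bytes are unchanged.
restored = unwrap_binary(JSON.parse(json)['content'])
puts restored == content
```

The cost of this approach is size (Base64 adds roughly a third) plus the need for both ends to agree on the tagging convention.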

Puppet-dev conversations of note:
  * Behavior of apply + ENC
    The discussion continues at a trickle. Felix has summarized the pros and cons of the various options. Please help drive it to a conclusion.
  * Merits of directory environments and opt-in fallback mode?
    The thread has fizzled out. I'm interested in whether others see value in the proposal.

Data

I have some data about profiling!

I got two different data sets, and they were pretty comparable in many ways. Data Set 1 (DS1) catalogs took ~11 seconds on average to compile, Data Set 2 (DS2) catalogs ~14 seconds. One of the big differences between the setups was that DS2 was using MessagePack serialization whereas DS1 was using the default PSON. The average serialization time for each dataset showed a very strange outcome: PSON took an average of ~0.03 seconds, and msgpack took ~0.05 seconds. I plotted a histogram of the serialization times, and it showed a huge spike in both datasets on the very low end. This skewed the average times substantially.

By excluding times <= 0.003 seconds we get

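MessagePack
[Inline image 1: histogram of serialization times, not preserved]

PSON
[Inline image 2: histogram of serialization times, not preserved]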

This indicates that there is actually a benefit to using msgpack: the high end of the serialization times is cut off. The two datasets contain very different numbers of measurements, so it is still possible that the benefit of msgpack is smaller than these plots suggest. We are not going to switch to msgpack, however. Instead we are going to switch to JSON, and the performance gain over PSON should be similar. According to various benchmarks that I've found online (and don't have links to right now), the difference between msgpack and JSON is negligible.
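
The filtering step used above (dropping times <= 0.003 seconds before looking at the distribution) can be sketched as follows. The timings array is made-up sample data, not the DS1 or DS2 measurements, and the mean helper is just for this example. It shows how a cluster of near-zero samples pulls the overall average down.

```ruby
# Hypothetical serialization timings in seconds; NOT the measured data sets.
timings = [0.001, 0.002, 0.001, 0.003, 0.040, 0.060, 0.050, 0.002]

# Same threshold as in the text: exclude times <= 0.003 seconds.
CUTOFF = 0.003

def mean(values)
  values.sum / values.size.to_f
end

# Keep only the samples above the cutoff.
filtered = timings.reject { |t| t <= CUTOFF }

puts format('all samples:      n=%d mean=%.4f s', timings.size, mean(timings))
puts format('above the cutoff: n=%d mean=%.4f s', filtered.size, mean(filtered))
```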


--
Andrew Parker
Freenode: zaphod42
Twitter: @aparker42
Software Developer

Join us at PuppetConf 2015, October 5-9 in Portland, OR - http://2015.puppetconf.com 
Register early to save 40%!

Erik Dalén

Oct 23, 2014, 7:03:24 PM
to Puppet Developers
In my testing the difference is bigger with larger catalogs. When I test it on one of our catalogs (1985 edges and 1026 resources) I get a much larger speed difference between JSON and MessagePack on Ruby 2.0.0 (I included PSON as well for reference).

For serializing I get this:

Encode
======
Date: October 24, 2014

System Information
------------------
    Operating System:    OS X 10.10 (14A389)
    CPU:                 Intel Core i7 3.5 GHz
    Processor Count:     4
    Memory:              16 GB
    ruby 2.0.0p451 (2014-02-24 revision 45167) [x86_64-darwin13.2.0]

"MessagePack.pack" is up to 97% faster over  repetitions
--------------------------------------------------------

    MessagePack.pack    0.364662  secs    Fastest
    JSON.generate       2.711173  secs    86% Slower
    PSON.generate       14.690753 secs    97% Slower

And for deserializing I get this:

Decode
======
Date: October 24, 2014

System Information
------------------
    Operating System:    OS X 10.10 (14A389)
    CPU:                 Intel Core i7 3.5 GHz
    Processor Count:     4
    Memory:              16 GB
    ruby 2.0.0p451 (2014-02-24 revision 45167) [x86_64-darwin13.2.0]

"MessagePack.unpack" is up to 92% faster over  repetitions
----------------------------------------------------------

    MessagePack.unpack    1.581809  secs    Fastest
    JSON.parse            3.565996  secs    55% Slower
    PSON.parse            19.935312 secs    92% Slower
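
For reading the tables: the "% Slower" figures appear to be the time saved relative to the slower entry, (slow - fast) / slow, truncated to a whole percent. That interpretation is inferred from the numbers rather than stated by the benchmark output; the percent_slower helper below is a hypothetical name used only to check it.

```ruby
# Percent of the slower time that the faster entry saves, truncated.
# Inferred formula, not taken from the benchmark tool itself.
def percent_slower(fast, slow)
  ((slow - fast) / slow * 100).floor
end

encode_fast = 0.364662 # MessagePack.pack
decode_fast = 1.581809 # MessagePack.unpack

puts percent_slower(encode_fast, 2.711173)   # JSON.generate
puts percent_slower(encode_fast, 14.690753)  # PSON.generate
puts percent_slower(decode_fast, 3.565996)   # JSON.parse
puts percent_slower(decode_fast, 19.935312)  # PSON.parse

# The same encode data as a ratio: how many times longer JSON.generate took.
puts (2.711173 / encode_fast).round(1)
```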

--
Erik Dalén

Andy Parker

Oct 23, 2014, 8:08:43 PM
to puppe...@googlegroups.com
Thanks for these numbers! MessagePack does seem to be faster, and by more than I was expecting. There are other Ruby JSON libraries that we can use that will probably get even closer to MessagePack in performance. Oj is one that has been brought to my attention. On puppet-server it could make use of jrjackson, which is backed by the Jackson library, which is pretty fast.
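
For anyone who wants to try such a comparison on their own catalogs, here is a minimal timing sketch. It uses only Ruby's bundled json library so that it runs without extra gems. The synthetic catalog structure, the time_serializers helper, and the repetition count are assumptions for illustration, not the setup behind the numbers above. Other serializers can be plugged in as additional lambdas.

```ruby
require 'json'

# Hypothetical stand-in for a catalog: 1026 resources with a few parameters.
catalog = {
  'resources' => Array.new(1026) do |i|
    { 'type' => 'File', 'title' => "/tmp/file#{i}",
      'parameters' => { 'ensure' => 'file', 'mode' => '0644' } }
  end
}

# Times each serializer over `reps` runs and returns { name => seconds }.
def time_serializers(data, serializers, reps)
  serializers.each_with_object({}) do |(name, fn), results|
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    reps.times { fn.call(data) }
    results[name] = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
end

serializers = {
  'JSON.generate' => ->(data) { JSON.generate(data) }
  # With the msgpack gem installed, one could add:
  # 'MessagePack.pack' => ->(data) { MessagePack.pack(data) }
}

results = time_serializers(catalog, serializers, 20)
results.sort_by { |_, secs| secs }.each do |name, secs|
  puts format('%-18s %.6f secs', name, secs)
end
```

A real catalog will have deeper nesting and longer strings than this stand-in, so the absolute numbers only mean something when the actual data is substituted.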
 
