[Wrappers vs Primitives] memory footprints measurements

Eugene Morozov

Oct 7, 2013, 3:45:11 PM
to mechanica...@googlegroups.com
Hello! 
I have a bunch of POJOs that are built during the preloading phase at application startup. Most of the fields in these classes are wrappers (Integer, Double, Boolean, etc.). I'd really like to measure the difference, if there is one, when those fields are primitives instead. I've tried Caliper, but it looks like Caliper doesn't handle this well (or, more likely, I don't understand it well).
The way I picture it, one POJO holds several Integers and several Booleans, which means [one + several + several] objects in memory, each with its own header, alignment padding and reference, instead of a single POJO. The amount of information is exactly the same, of course, but all those references should add quite an overhead, and Caliper didn't show it.
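
A back-of-the-envelope sketch of that reasoning, assuming a 64-bit HotSpot JVM with compressed references (12-byte object header, 4-byte references, 8-byte alignment). These layout constants are assumptions rather than values read from the JVM, and the class and method names are hypothetical.

```java
// Rough footprint arithmetic. All layout constants are ASSUMPTIONS
// (64-bit HotSpot with compressed references), not measured values.
final class FootprintEstimate {
    static final int HEADER = 12;   // assumed object header size in bytes
    static final int REF = 4;       // assumed compressed reference size in bytes
    static final int ALIGN = 8;     // assumed object alignment in bytes

    // Rounds a raw size up to the next multiple of ALIGN.
    static long align(long size) {
        return (size + ALIGN - 1) / ALIGN * ALIGN;
    }

    // A POJO with n int fields stored inline.
    static long withPrimitives(int n) {
        return align(HEADER + 4L * n);
    }

    // A POJO with n Integer fields, each pointing at its own uncached Integer.
    static long withWrappers(int n) {
        long pojo = align(HEADER + (long) REF * n);
        long boxed = align(HEADER + 4);   // one Integer: header plus the int value
        return pojo + n * boxed;
    }

    public static void main(String[] args) {
        System.out.println("4 ints:     " + withPrimitives(4) + " bytes");
        System.out.println("4 Integers: " + withWrappers(4) + " bytes");
    }
}
```

Under these assumptions the estimate only covers the numeric fields; Strings and nested objects in a real POJO are not counted here and can easily dominate the total.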

To tell the truth, I'm sure that swapping wrappers for primitives across all that legacy code is not the way to fix the memory footprint, but I'd really like to understand how this works and what is possible. That is interesting in itself.

So the question is: what is the best way to measure this kind of thing? Could you recommend something to read, or a conference talk to watch, on the subject?

Thanks in advance.

Peter Lawrey

Oct 7, 2013, 4:22:27 PM
to mechanica...@googlegroups.com

You can measure how much memory is used to create the objects and test that. You also have to test with multiple objects, just as you would have in a real application.
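
A minimal sketch of that kind of measurement, assuming a plain JVM where `System.gc()` is honoured as a hint: allocate many instances, keep them reachable, and compare used heap before and after. The class names and field values are hypothetical, and the numbers are approximate and noisy.

```java
import java.util.function.Supplier;

// Sketch: estimate per-object heap cost from the used-heap delta.
final class HeapDelta {
    static final class WithPrimitives { int a = 1000, b = 2000, c = 3000, d = 4000; }

    // Values above 127 are outside the default Integer cache, so each
    // instance normally gets its own boxed objects.
    static final class WithWrappers { Integer a = 1000, b = 2000, c = 3000, d = 4000; }

    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();   // only a hint; the JVM may ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    // Allocates 'count' objects and returns the average heap growth per object.
    static double bytesPerObject(Supplier<Object> factory, int count) {
        Object[] keep = new Object[count];   // allocated first, so it is excluded
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();
        }
        long after = usedHeap();
        // Use the array after measuring so it stays reachable until here.
        if (keep[count - 1] == null) throw new IllegalStateException();
        return (after - before) / (double) count;
    }

    public static void main(String[] args) {
        int n = 500_000;
        System.out.printf("primitives: %.1f bytes/object%n",
                bytesPerObject(WithPrimitives::new, n));
        System.out.printf("wrappers:   %.1f bytes/object%n",
                bytesPerObject(WithWrappers::new, n));
    }
}
```

Running it several times and with different counts gives a feel for how noisy the figures are.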

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Peter Lawrey

Oct 7, 2013, 4:24:24 PM
to mechanica...@googlegroups.com

Btw, wrappers can be a factor of ~3x larger/slower than primitives, but the key problem is making a field nullable for no good reason.

Eugene Morozov

Oct 7, 2013, 6:30:02 PM
to mechanica...@googlegroups.com
Peter, thanks for the answer, but how exactly should I measure that?
That's exactly the point I'd like to check: I do not see this ~3x larger/slower.

I've just tried another Google tool called "memory measurer", which gave me the same results as Google's Caliper (I believe one uses the other internally).

The result is a difference of ~5% measured in bytes and ~7% measured in number of references. Now I'd like to be sure there is no mystery behind it.

What else can I do to make sure / check it?

--
Be well!
Jean Morozov

Peter Lawrey

Oct 8, 2013, 1:34:44 AM
to mechanica...@googlegroups.com

Can you show the test you have? Much depends on exactly what you are doing.

Peter Lawrey

Oct 8, 2013, 1:53:11 AM
to mechanica...@googlegroups.com
Here is an example which prints the following (averages are nanoseconds per increment() call):

primitives avg 11.1, wrappers avg 67.2, ratio: 6.077
primitives avg 1.3, wrappers avg 33.0, ratio: 24.954
primitives avg 1.2, wrappers avg 25.6, ratio: 21.720
primitives avg 1.3, wrappers avg 23.3, ratio: 18.285
primitives avg 1.2, wrappers avg 24.1, ratio: 20.342

----


    public static final int RUNS = 1000000;

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            Primitives p = new Primitives();
            double time1 = timePrimitives(p);
            Wrappers w = new Wrappers();
            double time2 = timeWrapper(w);
            System.out.printf("primitives avg %.1f, wrappers avg %.1f, ratio: %.3f%n",
                    time1, time2,  time2 / time1);
        }
    }

    private static double timePrimitives(Primitives p) {
        long start = System.nanoTime();
        for (int i = 0; i < RUNS; i++)
            p.increment();
        return ((double)(System.nanoTime() - start))/ RUNS;
    }

    private static double timeWrapper(Wrappers w) {
        long start = System.nanoTime();
        for (int i = 0; i < RUNS; i++)
            w.increment();
        return ((double)(System.nanoTime() - start))/ RUNS;
    }

    static class Primitives {
        byte b;
        short s;
        char c;
        int i;
        float f;
        long l;
        double d;

        public void increment() {
            b++;
            s++;
            c++;
            i++;
            f++;
            l++;
            d++;
        }
    }

    static class Wrappers {
        Byte b = 0;
        Short s = 0;
        Character c = '\0';
        Integer i = 0;
        Float f = 0.0f;
        Long l = 0L;
        Double d = 0.0;


        public void increment() {
            b++;
            s++;
            c++;
            i++;
            f++;
            l++;
            d++;
        }
    }


Nitsan Wakart

Oct 8, 2013, 2:11:09 AM
to mechanica...@googlegroups.com
Something to note about Boolean/Integer (and others too) is that, to save space, small boxed values come from a shared cache instead of being allocated anew (see here: http://stackoverflow.com/questions/13098143/java-integer-constant-pool). By default autoboxing reuses Integers in the range -128 to 127. If your test only uses low-value Integers, and Integers make up a significant share of your fields, this can significantly reduce the memory overhead. It will be easier to tell if you post the object in question...
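
A small sketch of that cache at work. It relies on `Integer.valueOf`, which is the call javac emits for autoboxing; the class and method names are hypothetical.

```java
// Sketch: check whether two separately boxed copies of a value share one object.
final class BoxCacheDemo {
    static boolean sharesInstance(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        return first == second;   // identity comparison, on purpose
    }

    public static void main(String[] args) {
        // Values in -128..127 are required to be cached.
        System.out.println("127  shared: " + sharesInstance(127));
        // Outside that range the answer depends on the JVM and its flags;
        // on a default HotSpot configuration it is typically false.
        System.out.println("1000 shared: " + sharesInstance(1000));
    }
}
```

This is why a test that fills every id with 1 can show almost no extra objects, while realistic ids do.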


Peter Lawrey

Oct 8, 2013, 2:21:37 AM
to mechanica...@googlegroups.com
Another test prints the following. It shows how long the code takes to run cold. Note how much slower it is than the hot code in the previous test.

primitives avg 355.4, wrappers avg 2018.4, ratio: 5.679
primitives avg 364.7, wrappers avg 1240.8, ratio: 3.403
primitives avg 288.2, wrappers avg 1982.3, ratio: 6.879
primitives avg 343.0, wrappers avg 1950.6, ratio: 5.688
primitives avg 338.5, wrappers avg 1957.3, ratio: 5.783

The code:

    public static final int RUNS = 5000;

    public static void main(String[] args) throws InterruptedException {
        Primitives p = new Primitives();
        Wrappers w = new Wrappers();
        for (int i = 0; i < 20; i++) {
            timePrimitives(p, true);
            timeWrapper(w, true);
        }

        for (int i = 0; i < 5; i++) {
            double time1 = timePrimitives(p, false);
            double time2 = timeWrapper(w, false);
            System.out.printf("primitives avg %.1f, wrappers avg %.1f, ratio: %.3f%n",
                    time1, time2, time2 / time1);
        }
    }

    private static double timePrimitives(Primitives p, boolean warmup) throws InterruptedException {
        long total = 0;
        for (int i = 0; i < RUNS; i++) {
            Thread.sleep(warmup ? 0 : 1);
            long start = System.nanoTime();
            p.increment();
            long time = System.nanoTime() - start;
            total += time;
        }
        return (double) total / RUNS;
    }

    private static double timeWrapper(Wrappers w, boolean warmup) throws InterruptedException {
        long total = 0;
        for (int i = 0; i < RUNS; i++) {
            Thread.sleep(warmup ? 0 : 1);
            long start = System.nanoTime();
            w.increment();
            long time = System.nanoTime() - start;
            total += time;
        }
        return (double) total / RUNS;
    }

Eugene Morozov

Oct 8, 2013, 5:20:27 AM
to mechanica...@googlegroups.com
Here is my example.

It looks like I do see a real difference in the number of references once I changed the small ids to big ones (replaced 1s with 1000s). The Lite version has 41 fewer references than the default one.

Lite version
size: 2520 bytes
Footprint{Objects=62, References=113, Primitives=[long x 3, int x 116, short, char x 268, boolean x 15, float x 3, double]}

Default version
size: 2640 bytes
Footprint{Objects=70, References=154, Primitives=[long x 3, int x 81, short, char x 268, boolean x 14, double]}

Now I'd like to somehow measure the GC benefit.


Lite version contains primitives where possible instead of wrappers.

public static void main(String[] args) {
    PojoLite pojoLite = buildPojoLite();
    Pojo pojo = buildPojo();

    System.out.println(MemoryMeasurer.measureBytes(pojoLite));
    System.out.println(MemoryMeasurer.measureBytes(pojo));

    System.out.println(ObjectGraphMeasurer.measure(pojoLite));
    System.out.println(ObjectGraphMeasurer.measure(pojo));
}

public class PojoLite {
    protected static final IntOpenHashSet TYPES = new IntOpenHashSet() {{
        add(6); add(7); add(8); add(16); add(21);
    }};

    private int id;
    private int productId;
    private String upc;
    private PLite price;
    private ALite availability;
    private String indicator;
    private String code;
    private String description;
    private String ccode;
    private boolean fee;
    private double fee2;
    private boolean history;
    private boolean orderable;
    private String model;
    private DRLite period;
    private boolean display;
}
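
For the GC question raised above, one possible starting point is the standard `java.lang.management` counters: read them before and after a workload and compare the two POJO variants. This is a minimal sketch, not a rigorous benchmark. The `GcStats` class, its methods, and the placeholder workload are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read cumulative GC counters around a workload.
final class GcStats {
    // Total number of collections so far, summed over all collectors.
    static long totalCollections() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sum += Math.max(0, gc.getCollectionCount());   // -1 means undefined
        }
        return sum;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sum += Math.max(0, gc.getCollectionTime());    // -1 means undefined
        }
        return sum;
    }

    // Runs the workload; returns {collections, milliseconds} observed meanwhile.
    static long[] measure(Runnable workload) {
        long c0 = totalCollections();
        long t0 = totalCollectionMillis();
        workload.run();
        return new long[] { totalCollections() - c0, totalCollectionMillis() - t0 };
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption): substitute loops that build
        // the wrapper-based and the primitive-based POJOs.
        long[] gc = measure(() -> {
            Object[] keep = new Object[2_000_000];
            for (int i = 0; i < keep.length; i++) {
                keep[i] = Integer.valueOf(i + 1000);
            }
        });
        System.out.printf("collections=%d, gcMillis=%d%n", gc[0], gc[1]);
    }
}
```

Running each variant in a fresh JVM with the same heap settings keeps the comparison fair; GC logging flags give more detail than these counters.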

--
Be well!
Jean Morozov


Peter Lawrey

Oct 8, 2013, 5:28:34 AM
to mechanica...@googlegroups.com
Those numbers seem reasonable for a class with a mix of primitives and objects. The performance difference might not be great, but IMHO it is much clearer that those fields cannot be null. This makes it easier to reason about the code (and lets you remove any null checks you might have had).
I would be tempted to add @NotNull to any other fields which cannot be null. (Also make any fields final if possible.)
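
A minimal sketch of the nullability point: an unset wrapper field is null and fails when it is unboxed, while a primitive field always holds a value. The class and field names are hypothetical.

```java
// Sketch: wrapper fields can be null, primitive fields cannot.
final class NullableFieldDemo {
    static final class WithWrapper { Integer quantity; }   // reference field, defaults to null
    static final class WithPrimitive { int quantity; }     // primitive field, defaults to 0

    // Reads the wrapper field as an int; returns true if that read throws.
    static boolean unboxingThrows(WithWrapper w) {
        try {
            int q = w.quantity;   // auto-unboxing dereferences the field
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("primitive default: " + new WithPrimitive().quantity);
        System.out.println("unboxing null throws: " + unboxingThrows(new WithWrapper()));
    }
}
```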

Rüdiger Möller

Oct 8, 2013, 6:27:00 AM
to
Once you use higher-value boxed numbers (no VM cache) and keep your POJOs permanently (old space), the impact on application performance will worsen. Additionally you get poor locality, as the boxed Integers live at a different location than the containing POJO. This gets worse over time; in the worst case the Integers referenced by your POJO are scattered all across the heap. It gets even worse if you use non-compacting collectors such as CMS.
If you perform business logic on those boxed numbers, you'll create needless garbage that fills up your CPU caches, which again hurts processing performance.
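
A small sketch of where that garbage comes from. Wrappers are immutable, so arithmetic on a boxed value unboxes it, computes, and boxes the result into a different object. The class and method names are hypothetical.

```java
// Sketch: arithmetic on boxed values replaces the object each time.
final class BoxedArithmeticDemo {
    // True if incrementing a boxed Integer left the variable pointing at
    // a different object than before.
    static boolean incrementReplacesObject(int start) {
        Integer value = start;     // boxed
        Integer before = value;    // remember the original object
        value++;                   // unbox, add 1, box again
        return before != value;    // identity comparison, on purpose
    }

    // Boxed accumulator: outside the cache range, each iteration
    // normally allocates a fresh Long.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Primitive accumulator: same result, no boxing.
    static long sumPrimitive(int n) {
        long sum = 0;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("object replaced: " + incrementReplacesObject(1000));
        System.out.println("sums equal: " + (sumBoxed(1000) == sumPrimitive(1000)));
    }
}
```

The JIT can sometimes remove such allocations through escape analysis, so the real cost is best confirmed by measurement.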

Full GC duration is a direct function of the number of objects (and locality).

I've charted this here:

http://java-is-the-new-c.blogspot.de/2013/07/what-drives-full-gc-duration-its.html

Boxed numbers create overhead and hurt performance in many (subtle) ways. The design of Java-Generics (+Autoboxing) was one of the most stupid design decisions ever. The effects will worsen if the gap between cache hits and memory access widens (which is likely). 

Eugene Morozov

Oct 8, 2013, 5:24:28 PM
to mechanica...@googlegroups.com
Peter, thanks for your time and valuable comments.
Rüdiger, thanks a lot for the charts, you've got a really interesting result and a somewhat questionable conclusion! =)


It seems that in our case primitives vs wrappers is not a big deal.
There is a preloading phase and a request phase.
* During preloading each entity is created and pushed to distributed storage, after which GC can reclaim all of them, probably at once.
* On a request we get the entities out of the storage (they are instantiated), group them along the way into different kinds of collections, pass them up to the web tier, and then they can again be reclaimed by GC.
In both cases I believe there are not that many of them, they are small and, which seems most important, each entity sits close to all of its wrappers, strings, arrays, etc.
In both cases most of them are short-lived.
The concern here is the preloading phase, which requires 10g of heap and seems to be overwhelmed by different kinds of Hash[Maps, Sets] and other heavy data structures.


So, in terms of real memory consumption the two cases are almost equal.
In terms of number of references, the wrapper version has 30% more references in my case.
The only concern is CMS GC, but most of the objects die in Eden.
My conclusion is that it is definitely not worth anybody's time to swap wrappers for primitives (I had some doubts when I started) =)

Thanks.

--
Be well!
Jean Morozov


Eugene Morozov

Oct 8, 2013, 5:24:47 PM
to mechanica...@googlegroups.com
In case somebody would like to know, there are a bunch of tools for measuring memory consumption:

1. Google's Caliper [] (more of a perf tool, but it can measure an object's memory too).
2. Java Object Layout [https://github.com/shipilev/java-object-layout]. Unfortunately it doesn't walk the object graph, but it still gives interesting results.
3. Google's Memory Measurer [https://code.google.com/p/memory-measurer/]. Calculates the number of instantiated objects, references and primitives of each type reachable from a given object.
4. Google's Allocation Instrumenter [https://code.google.com/p/java-allocation-instrumenter/]. Hooks object creation (using ASM) and allows custom fine-grained analysis.

--
Be well!
Jean Morozov


Rüdiger Möller

Oct 9, 2013, 4:50:01 AM
to mechanica...@googlegroups.com
Well, if those objects don't survive Eden, things look better ofc. Anyway, you'll take a throughput hit, as allocating wrappers makes young-generation collections happen more frequently, plus there is some negative effect on the CPU cache (which shouldn't be that significant in your scenario).

Eugene Morozov

Oct 9, 2013, 9:32:35 AM
to mechanica...@googlegroups.com
I totally agree, but nobody in their right mind will allow such changes even in a mid-size enterprise application.
Unfortunately I'm working on a big one.

Moving from wrappers to primitives would cost a month of work (or even more) and would amount to virtually rewriting the system,
with an unfortunately vague result.

--
Be well!
Jean Morozov

