
how to calculate size of an object


Lynn McGuire

Oct 14, 2015, 1:38:53 PM
I would like to calculate the size of a very complex object at runtime. This object has 100 instance variables including several
vectors that contain vectors, etc. Is there an easy way to do this? Obviously sizeof is not going to work.

Thanks,
Lynn

Victor Bazarov

Oct 14, 2015, 1:47:21 PM
Since it's very implementation-specific, perhaps the easiest way is to
take a reading of the memory used by your process (using OS-specific
means), then allocate N of those objects, then take another reading
(using the same OS-specific means), then divide the difference between
the two readings by N. Choose different values of N and do more than
one test to get as close to the actual size as possible.
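
A minimal sketch of that idea, assuming Linux (resident memory read from
/proc/self/statm); Widget is a hypothetical stand-in for the real class,
and allocator caching makes the result approximate:

#include <cstdio>
#include <unistd.h>
#include <vector>

struct Widget {   // hypothetical stand-in for the complex class
    std::vector<double> data = std::vector<double>(1000);
};

static long resident_bytes()
{
    long pages = 0, resident = 0;
    std::FILE* f = std::fopen("/proc/self/statm", "r");
    if (f) {
        std::fscanf(f, "%ld %ld", &pages, &resident);
        std::fclose(f);
    }
    return resident * sysconf(_SC_PAGESIZE);
}

int main()
{
    const int N = 1000;
    long before = resident_bytes();
    std::vector<Widget*> objects;
    for (int i = 0; i < N; ++i)
        objects.push_back(new Widget);
    long after = resident_bytes();
    std::printf("approx. %ld bytes per object\n", (after - before) / N);
    for (Widget* w : objects) delete w;
}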

V
--
I do not respond to top-posted replies, please don't ask

Paavo Helde

Oct 14, 2015, 3:09:13 PM
Lynn McGuire <l...@winsim.com> wrote in news:mvm3qe$fa5$1...@dont-email.me:
If the data structures grow and shrink during the program run and you
want to know the total memory usage associated with a specific object at
any given moment, then pretty much the only way is to calculate it the
hard way (e.g. by adding a specific function to each of your classes
which sums up the dynamic memory usage of the members of the class). With
some template trickery this can be automated to some extent for STL
containers, I believe.
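
A rough sketch of what such a summing function might look like (the names
are hypothetical, only vector and string members are handled, and the
string overload ignores the small-string optimization):

#include <cstddef>
#include <string>
#include <vector>

template <class T>
std::size_t dynamic_footprint(const T&) { return 0; }  // members owning no heap

inline std::size_t dynamic_footprint(const std::string& s)
{ return s.capacity() + 1; }

template <class T>
std::size_t dynamic_footprint(const std::vector<T>& v)
{
    std::size_t n = v.capacity() * sizeof(T);
    for (const T& e : v)
        n += dynamic_footprint(e);   // recurses into vectors of vectors
    return n;
}

struct Model {   // hypothetical class with mixed members
    std::vector<std::vector<double>> table;
    std::string name;
    double x;

    std::size_t footprint() const
    {
        return sizeof(*this) + dynamic_footprint(table)
             + dynamic_footprint(name) + dynamic_footprint(x);
    }
};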

If your 100 member variables do not manage dynamic memory then they are
included in sizeof() of the parent class and do not require extra care.

One question: for what purpose do you need this information? There may
be other, simpler means of achieving the goal.

hth
Paavo

bartekltg

Oct 14, 2015, 4:03:08 PM
On 14.10.2015 21:08, Paavo Helde wrote:
> Lynn McGuire <l...@winsim.com> wrote in news:mvm3qe$fa5$1...@dont-email.me:
>
>> I would like to calculate the size of a very complex object at
>> runtime. This object has 100 instance variables including several
>> vectors that contain vectors, etc. Is there an easy way to do this?
>> Obviously sizeof is not going to work.
>
> If the data structures grow and shrink during the program run and you
> want to know the total memory usage associated with a specific object at
> any given moment, then pretty much the only way is to calculate it the
> hard way (e.g. by adding a specific function to each of your classes
> which sums up the dynamic memory usage of the members of the class). With
> some template trickery this can be automated to some extent for STL
> containers, I believe.

Is there any trick to do something with every member of a class?

A tuple containing all the members would be enough ;)

bartekltg

Paavo Helde

Oct 14, 2015, 5:18:41 PM
bartekltg <bart...@gmail.com> wrote in
news:mvmcd0$39l$1...@node2.news.atman.pl:
Not to my knowledge; C++ still lacks full introspection capabilities.

> A tupple contains all members would be enough;)

This would probably involve adding support for std::tuple in the
language core (similar to std::initializer_list). Also, I foresee at
least some conceptual problems with private data members.

Cheers
Paavo

mark

Oct 14, 2015, 5:21:42 PM
If you only want to do a few test measurements, you can replace
new/delete/malloc/free and keep track of the allocated memory.
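
A minimal sketch of that approach, replacing the global allocation
functions (the size-header trick keeps only malloc's default alignment,
so over-aligned types would need more care):

#include <atomic>
#include <cstdlib>
#include <new>

static std::atomic<std::size_t> g_allocated{0};

void* operator new(std::size_t n)
{
    // stash the size in front of the block so operator delete can subtract it
    void* p = std::malloc(n + sizeof(std::size_t));
    if (!p) throw std::bad_alloc();
    *static_cast<std::size_t*>(p) = n;
    g_allocated += n;
    return static_cast<std::size_t*>(p) + 1;
}

void operator delete(void* p) noexcept
{
    if (!p) return;
    std::size_t* base = static_cast<std::size_t*>(p) - 1;
    g_allocated -= *base;
    std::free(base);
}

// read g_allocated before and after building the object; the difference
// approximates its dynamic footprint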


Lynn McGuire

Oct 14, 2015, 5:28:25 PM
I have thousands of dissimilar objects in our software (600 classes, etc).

Currently, I have one instance of the object in question. The object can
be, depending on the size of the model, from 50 KB to 50 MB in size
(swag). We are moving to multiple instances of the object in question, so
I would like to estimate the file storage needed.

Thanks,
Lynn

Ian Collins

Oct 14, 2015, 5:38:52 PM
Lynn McGuire wrote:
> I would like to calculate the size of a very complex object at
> runtime. This object has 100 instance variables including several
> vectors that contain vectors, etc. Is there an easy way to do this?

Please wrap your lines!

> Obviously sizeof is not going to work.

It would be part of the solution. The easiest approach is to add a size
member function that sums sizeof(the class) and the capacities of the
container members.

An alternative to using capacity would be to use a custom allocator for
the container members.
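
A sketch of what such a counting allocator might look like (C++11 minimal
allocator; the shared byte counter is a hypothetical design choice):

#include <cstddef>
#include <memory>
#include <vector>

template <class T>
struct counting_allocator {
    using value_type = T;
    std::size_t* bytes;   // shared tally, owned by the enclosing object

    explicit counting_allocator(std::size_t* b) : bytes(b) {}
    template <class U>
    counting_allocator(const counting_allocator<U>& o) : bytes(o.bytes) {}

    T* allocate(std::size_t n)
    {
        *bytes += n * sizeof(T);
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n)
    {
        *bytes -= n * sizeof(T);
        std::allocator<T>().deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const counting_allocator<T>& a, const counting_allocator<U>& b)
{ return a.bytes == b.bytes; }
template <class T, class U>
bool operator!=(const counting_allocator<T>& a, const counting_allocator<U>& b)
{ return !(a == b); }

// usage:
//   std::size_t bytes = 0;
//   std::vector<int, counting_allocator<int>> v((counting_allocator<int>(&bytes)));
//   v.resize(100000);   // 'bytes' now reflects the vector's heap usage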

--
Ian Collins

Luca Risolia

Oct 14, 2015, 5:42:04 PM
Use of sizeof plus global operator new / operator new[] replacements is
probably a good starting point.

Paavo Helde

Oct 14, 2015, 6:11:06 PM
Lynn McGuire <l...@winsim.com> wrote in news:mvmh9f$74a$1...@dont-email.me:

> On 10/14/2015 4:21 PM, mark wrote:
>> On 2015-10-14 19:38, Lynn McGuire wrote:
>>> I would like to calculate the size of a very complex object at
>>> runtime. This object has 100 instance variables including several
>>> vectors that contain vectors, etc. Is there an easy way to do this?
>>> Obviously sizeof is not going to work.
>>
>> If you only want to do a few test measurements, you can replace
>> new/delete/malloc/free and keep track of the allocated memory.
>
> I have thousands of dissimilar objects in our software (600 classes,
> etc).
>
> Currently, I have one instance of the object in question. The
> object can be, depending on the size of the model, from 50 KB to 50 MB
> in size (swag).

50 MB ought to be visible pretty well at the OS level already. Just add
some code to make a deep copy of it (should be trivial if these really
are vectors of vectors) and observe the process size immediately before
and after.

> We are moving to multiple instances of the object in
> question so I would like to estimate file storage needed.

File storage? If your object is serializable to a file, why don't you
just serialize it and look at the file size?

Victor Bazarov

Oct 14, 2015, 7:54:17 PM
On 10/14/2015 6:08 PM, Stefan Ram wrote:
> Lynn McGuire <l...@winsim.com> writes:
>> I would like to calculate the size of a very complex object at runtime.
>> Obviously sizeof is not going to work.
>
> »5.3.3 Sizeof [expr.sizeof]
>
> 1 The sizeof operator yields the number of bytes in the
> object representation of its operand.«
>
> If »Obviously sizeof is not going to work«, then
> »size of an object« might have a special meaning
> to you.

When an object allocates dynamic memory, the right term is not "size" but
rather "footprint". std::vector<int> has the same "size" with and without
any elements if obtained using "sizeof". Yet you can't really deny that
the footprints in RAM of vector<int>() and vector<int>(1000) are
different.
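
The distinction in a few lines:

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> empty;
    std::vector<int> big(1000);

    std::cout << sizeof(empty) << ' ' << sizeof(big) << '\n';  // identical
    // a first-order footprint estimate: the object plus its heap block
    std::cout << sizeof(big) + big.capacity() * sizeof(int) << '\n';
}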

Lynn McGuire

Oct 14, 2015, 8:00:44 PM
That might work. I've got a few caveats that I would have to decide
about. We use string compression in our objects, for one, and I would
need to decide whether I want the compressed or the uncompressed size.

Thanks,
Lynn

Jorgen Grahn

Oct 15, 2015, 2:58:31 AM
Seems to me that the best thing to do, after all, could be to bite the
bullet and write a function "estimated_disk_size(const Foo&)". And to
be prepared to modify it as the class changes.

(I wonder if serialisation frameworks tend to have that functionality?
To write nothing and just report the size that would have been written?
I guess you could do it by serialising to some /dev/null-like
stream ...)
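
A sketch of such a /dev/null-like stream, i.e. a streambuf that discards
everything and just counts (the serialize() member is hypothetical):

#include <cstddef>
#include <ostream>
#include <streambuf>

class counting_buf : public std::streambuf {
public:
    std::size_t count = 0;
protected:
    int_type overflow(int_type c) override
    {
        if (!traits_type::eq_int_type(c, traits_type::eof()))
            ++count;
        return traits_type::not_eof(c);   // never signal failure
    }
    std::streamsize xsputn(const char*, std::streamsize n) override
    {
        count += n;                       // bulk writes: count, don't store
        return n;
    }
};

// usage, assuming the object already serialises to a std::ostream:
//   counting_buf buf;
//   std::ostream null_stream(&buf);
//   obj.serialize(null_stream);
//   std::size_t disk_size = buf.count;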

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Christian Gollwitzer

Oct 15, 2015, 4:17:37 AM
Am 14.10.15 um 19:38 schrieb Lynn McGuire:
You could try a serialization framework. I know of one that is available
online... SCNR

Christian

bartekltg

Oct 15, 2015, 11:53:01 AM
Oh, so you have some sort of serialization procedure; it has to iterate
over all members recursively. Rewrite it to only count the size.

If you use streams, you can probably write your own dummy stream that
only counts bytes.

bartekltg



woodb...@gmail.com

Oct 15, 2015, 12:07:24 PM
The C++ Middleware Writer (CMW) used to have support for that
but I disabled it a few years ago. I used to pre-calculate
the "marshalling size" of messages. Then I'd marshall that size
followed by the message itself.

What I do now is reserve 4 bytes for the message size,
marshall the message into a buffer, and then calculate
the size of the message and place the size before the
message itself.

Here's an example where the message is just a message id.

inline void Marshal(::cmw::SendBuffer& buf
                    ,messageid_t const& az1
                    ,int32_t max_length=10000){
  try{
    buf.ReserveBytes(4);          // leave room for the 4-byte size prefix
    buf.Receive(az1);             // marshal the body (here just a message id)
    buf.FillInSize(max_length);   // compute the size and write it into the gap
  }catch(...){buf.Rollback();throw;}
}


There are pros and cons to both approaches. One con of pre-calculating
the size is that you have to iterate over the message elements twice in
order to marshal a message. Another con was that users of my software had
to add an additional function prototype, something like
CalculateMarshallingSize, to their classes.

A pro of pre-calculating is that if the size of the message exceeds your
maximum for that message, you figure that out before marshalling
potentially a lot of data. I didn't find that very compelling, though, so
I moved to the other approach.


Brian
Ebenezer Enterprises - In G-d we trust.
http://webEbenezer.net

Lynn McGuire

Oct 15, 2015, 12:25:15 PM
Our serialization code does report the size of the data being written to
the file.

Thanks,
Lynn


Lynn McGuire

Oct 15, 2015, 12:27:42 PM
Yes, doesn't everyone have serialization? Our old code reports bytes
written to the file as a form of error detection. The new code (written
in the last 20 years) does not. It would be simple to extend, though.

Thanks,
Lynn

Juha Nieminen

Oct 15, 2015, 7:10:47 PM
Lynn McGuire <l...@winsim.com> wrote:
> I would like to calculate the size of a very complex object at runtime. This object has 100 instance variables including several
> vectors that contain vectors, etc. Is there an easy way to do this? Obviously sizeof is not going to work.

On most unix systems you can use sbrk() (in <unistd.h>) to figure out
how much the amount of allocated heap changes. (In other words, you
first get the pointer returned by sbrk(0), then allocate a bunch of
stuff, and then get a new pointer from sbrk(0), and the difference
between these two pointers is how much the heap has increased.)

Of course this can be a rough estimate (because C runtimes usually
somewhat overallocate), but it will give a relatively good estimate
of the total size of the allocated data (especially if there is a lot
of it.)
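
The trick in code form (Unix-only; glibc serves large blocks via mmap(),
which never moves the break, so the test allocates many small blocks):

#include <cstdio>
#include <unistd.h>
#include <vector>

int main()
{
    void* before = sbrk(0);

    std::vector<int*> blocks;              // the "bunch of stuff"
    for (int i = 0; i < 100000; ++i)
        blocks.push_back(new int[16]);     // small blocks stay on the brk heap

    void* after = sbrk(0);
    std::printf("heap grew by ~%ld bytes\n",
                static_cast<long>(static_cast<char*>(after) -
                                  static_cast<char*>(before)));

    for (int* p : blocks) delete[] p;
}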

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

Scott Lurndal

Oct 16, 2015, 9:19:12 AM
Juha Nieminen <nos...@thanks.invalid> writes:
>Lynn McGuire <l...@winsim.com> wrote:
>> I would like to calculate the size of a very complex object at runtime. This object has 100 instance variables including several
>> vectors that contain vectors, etc. Is there an easy way to do this? Obviously sizeof is not going to work.
>
>On most unix systems you can use sbrk() (in <unistd.h>) to figure out
>how much the amount of allocated heap changes. (In other words, you
>first get the pointer returned by sbrk(0), then allocate a bunch of
>stuff, and then get a new pointer from sbrk(0), and the difference
>between these two pointers is how much the heap has increased.)
>

On Linux, /proc/<pid>/maps shows the heap size:


$ cat /proc/1799/maps
00400000-00561000 r-xp 00000000 fd:00 2061943 /vsim
00760000-00764000 rw-p 00160000 fd:00 2061943 /vsim
00764000-008e4000 rw-p 00000000 00:00 0
00d51000-01f89000 rw-p 00000000 00:00 0 [heap]
...

Along with everything else:

$ cat /proc/1799/maps | wc -l
311
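
The same information is available from inside the process; a sketch that
scans /proc/self/maps for the [heap] mapping (Linux-only):

#include <cstdio>
#include <fstream>
#include <string>

int main()
{
    std::ifstream maps("/proc/self/maps");
    std::string line;
    while (std::getline(maps, line)) {
        if (line.find("[heap]") == std::string::npos) continue;
        unsigned long lo = 0, hi = 0;
        std::sscanf(line.c_str(), "%lx-%lx", &lo, &hi);
        std::printf("heap: %lu bytes\n", hi - lo);
    }
}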

Richard

Oct 20, 2015, 1:01:08 PM
[Please do not mail me a copy of your followup]

Lynn McGuire <l...@winsim.com> spake the secret code
<mvm3qe$fa5$1...@dont-email.me> thusly:

>I would like to calculate the size of a very complex object at runtime.

Obviously calculating the size of an object is not useful in and of
itself.

What is it you are really trying to accomplish?
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

Jorgen Grahn

Oct 20, 2015, 1:07:41 PM
On Tue, 2015-10-20, Richard wrote:
> [Please do not mail me a copy of your followup]
>
> Lynn McGuire <l...@winsim.com> spake the secret code
> <mvm3qe$fa5$1...@dont-email.me> thusly:
>
>>I would like to calculate the size of a very complex object at runtime.
>
> Obviously calculating the size of an object is not useful in and of
> itself.
>
> What is it you are really trying to accomplish?

You're right, but I think that was covered later in the thread:
finding out how much disk space the serialised representation of the
object would require.

Lynn McGuire

Oct 20, 2015, 1:29:54 PM
On 10/20/2015 12:07 PM, Jorgen Grahn wrote:
> On Tue, 2015-10-20, Richard wrote:
>> [Please do not mail me a copy of your followup]
>>
>> Lynn McGuire <l...@winsim.com> spake the secret code
>> <mvm3qe$fa5$1...@dont-email.me> thusly:
>>
>>> I would like to calculate the size of a very complex object at runtime.
>>
>> Obviously calculating the size of an object is not useful in and of
>> itself.
>>
>> What is it you are really trying to accomplish?
>
> You're right, but I think that was covered later in the thread:
> finding out how much disk space the serialised representation of the
> object would require.
>
> /Jorgen

Yup.

Thanks,
Lynn

Lynn McGuire

Oct 20, 2015, 4:47:09 PM
I am writing the multiple large object serialization code right now. I
will soon know the effect of moving from one large object to multiple
large objects on our binary file size. My concern is that we may need to
compress the entire binary file to keep it a reasonable size.

Lynn


Ian Collins

Oct 20, 2015, 4:51:36 PM
Lynn McGuire wrote:
>
> I am writing the multiple large object serialization code right now.
> I will know soon of the effect of moving from one large object to
> multiple large objects on our binary file size. My concern is that
> we may need to compress the entire binary file to keep it a
> reasonable size.

Define reasonable!

Depending on where you deploy it, you can just pipe the output through
gzip, or (as I do) use a compressed filesystem.
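
A sketch of the pipe idea on a POSIX system (the file name and payload
are hypothetical):

#include <stdio.h>
#include <string.h>

int main()
{
    /* gzip reads from the pipe and writes the compressed file */
    FILE* p = popen("gzip -9 > model.bin.gz", "w");
    if (!p) return 1;
    const char data[] = "serialized object bytes would go here";
    fwrite(data, 1, strlen(data), p);
    return pclose(p);
}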

--
Ian Collins

Paavo Helde

Oct 20, 2015, 5:26:27 PM
Lynn McGuire <l...@winsim.com> wrote in news:n0693v$hbp$1...@dont-email.me:
> I am writing the multiple large object serialization code right now.
> I will soon know the effect of moving from one large object to
> multiple large objects on our binary file size. My concern is that we
> may need to compress the entire binary file to keep it a reasonable
> size.

Yes, what is reasonable? I understand currently storing a 100 kB file is
pretty reasonable and storing a 100 GB file is probably not. However,
compression will reduce the data size only ca 2-10 times, depending on
content, so in this sense there is not much difference. Next year the disks
etc get 10 times bigger and the borders of "reasonable" will change.

Anyway, if you want to add transparent compression, look up gzopen().
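
A minimal sketch of zlib's stdio-like interface (the file name and
payload are hypothetical):

#include <zlib.h>
#include <cstring>

int main()
{
    gzFile f = gzopen("model.bin.gz", "wb9");   // "9" = best compression
    if (!f) return 1;
    const char data[] = "serialized object bytes would go here";
    gzwrite(f, data, static_cast<unsigned>(std::strlen(data)));
    gzclose(f);
}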

Cheers
Paavo

Lynn McGuire

Oct 20, 2015, 7:42:46 PM
Any binary file larger than 20 MB or so has trouble going through most
email systems. Just about all of our clients like to email files back and
forth. We would have to rewrite our serialization code for files larger
than 4 GB (our serialization code was first written in 1991). I do have a
1.0 GB file for testing; it is probably 10 GB fully expanded.

We already have a built-in utility, zlib, for string compression, since
we store large strings in our serialization file.

Thanks,
Lynn

seeplus

Oct 20, 2015, 7:47:40 PM
On Wednesday, October 21, 2015 at 8:26:27 AM UTC+11, Paavo Helde wrote:
> Next year the disks
> etc get 10 times bigger and the borders of "reasonable" will change.
>
> Cheers
> Paavo

A problem is that we have all quickly gone SSD, and so far a drive that
costs about twice the price of a current spinning disk has about a
quarter of the capacity.

Lynn McGuire

Oct 20, 2015, 8:20:41 PM
My primary drive is an Intel 480 GB SSD that I paid $200 for (currently $190).
http://www.amazon.com/gp/product/B00UL510VM

Yup, your ratios are just about right for this drive.

Lynn

Ian Collins

Oct 20, 2015, 9:48:30 PM
Which is why native filesystem compression is a good idea!

--
Ian Collins

Paavo Helde

Oct 21, 2015, 2:30:54 AM
Lynn McGuire <l...@winsim.com> wrote in news:n06jd6$opg$1...@dont-email.me:

> On 10/20/2015 4:26 PM, Paavo Helde wrote:
>> Anyway, if you want to add transparent compression, look up gzopen().
>
> We already have a built-in utility, zlib, for string compression,
> since we store large strings in our serialization file.

Good, gzopen() is part of zlib. There are also C++ wrappers like gzstream
for wrapping zlib as a standard iostream interface.

Serialized binary arrays often contain lots of zero bytes which can be
easily compressed.

hth
Paavo

Jorgen Grahn

Oct 21, 2015, 4:08:33 AM
On Tue, 2015-10-20, Paavo Helde wrote:
> Lynn McGuire <l...@winsim.com> wrote in news:n0693v$hbp$1...@dont-email.me:
>> I am writing the multiple large object serialization code right now.
>> I will soon know the effect of moving from one large object to
>> multiple large objects on our binary file size. My concern is that we
>> may need to compress the entire binary file to keep it a reasonable
>> size.
>
> Yes, what is reasonable? I understand currently storing a 100 kB file is
> pretty reasonable and storing a 100 GB file is probably not. However,
> compression will reduce the data size only ca 2-10 times, depending on
> content,

To be pedantic, 2-100 times is more like it. I have fairly
frequently handled large files which contain useful information /and/
compress that well. E.g. periodic numerical samples, overly verbose
log files ...

> so in this sense there is not much difference. Next year the disks
> etc get 10 times bigger and the borders of "reasonable" will change.
>
> Anyway, if you want to add transparent compression, look up gzopen().

Yes. It's my rule of thumb that defining a "bloated" file format and
then gzipping it is often better than trying to save bits in the file
format itself.

Scott Lurndal

Oct 21, 2015, 8:52:00 AM
Or use libz directly.

Robert Wessel

Oct 21, 2015, 10:54:34 AM
On 21 Oct 2015 08:08:13 GMT, Jorgen Grahn <grahn...@snipabacken.se>
wrote:
I agree - some of our stuff can produce diagnostic traces that tend to
compress very well, but the main redundancy is at a higher level than
you can easily deal with in the file format - the calling application
does the same thing over and over, each time with a very similar set
of calls, parameters and returns. General purpose compression on the
back end captures that nicely, while also providing an effectively
"dense" format for each trace entry. And since that's buried in the
low-level trace output code, there's no need to add the complexity of
a dense format to each program that produces a trace.

Of course when the compressed traces are multi-GB (as they
occasionally are), they still don't email well, so we have an FTP site
for the customers to upload those.