get the total memory used by a data structure?


Robert McIntyre

Dec 22, 2010, 11:51:48 PM
to Clojure
I think it would be really cool to have a function that gives the
total number of bytes that a data structure consumes.

so something like:

(total-memory [1 2 [1 2]])

would return however many bytes this structure is.

Is there already something like this around? Would it be hard to write
to work for any clojure data structure?

Just like (time), I think (total-memory) would be very enlightening
for basic testing.

sincerely,

--Robert McIntyre

Alan

Dec 23, 2010, 2:50:04 AM
to Clojure
This can't really be done in a language that transparently handles
pointers for you - who's "responsible" for some data that's pointed to
by four different pointers?

(let [a [1 2 [1 2]]
      b [2 [1 2]]
      c (next a)]
  (map total-memory [a b c]))

What does this return? b and c are structurally identical, but c is
taking up less memory because really it's just a seq over a, sharing
a's storage instead of copying it. If you say that c should print the
same size as b - just pretend data sharing doesn't exist - then this
feature becomes misleading if you use it to compare how much space
algorithms require.
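
The sharing is easy to check at the REPL. This is only a rough sketch: a, b and c mirror the example above, but the inner vectors are built with (vector 1 2), which is my own change so that the result doesn't depend on how the compiler treats literals.

```clojure
(def a [1 2 (vector 1 2)])
(def b [2 (vector 1 2)])   ; built independently of a
(def c (next a))           ; a seq over a, no copying

(= b c)                            ;=> true, same contents
(identical? (nth a 2) (second c))  ;=> true, c reuses a's inner vector
(identical? (nth a 2) (second b))  ;=> false, b holds its own inner vector
```

So any byte count for c has to decide whether the inner vector "belongs" to a, to c, or to both.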

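(defn sum1 [coll]
  ;; copies the remaining elements into a fresh int array on every step
  (loop [data (into-array Integer/TYPE coll) acc 0]
    (if (zero? (alength data))
      acc
      (recur (into-array Integer/TYPE (next data))
             (+ acc (first data))))))

(defn sum2 [coll]
  ;; walks the same seq; next (unlike rest) returns nil at the end,
  ;; which is what lets if-not terminate the loop
  (loop [data (seq coll) acc 0]
    (if-not data
      acc
      (recur (next data) (+ acc (first data))))))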

If you added some code to add up the total memory taken by every
object in each iteration of the loop, you would get the wrong results.
An array of N ints definitely takes less space than a vector of those
same ints - it doesn't have to store type information, or be ready to
grow if you need it to, or...

But the vector is getting transparently reused for you, while the
array is getting copied every time. Obviously this example is
intentionally poorly-written, but it would be easy to write something
bad and discover that it's "better" than the right way.

So I'm not sure what benefit all this would have. But it's pretty easy
to do:

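(defmulti sizeof class)
(defmethod sizeof Number [_] 4) ; or decide by size if you prefer
(defmethod sizeof java.util.Collection [coll]
  (reduce + 4 (map sizeof (seq coll))))
(defmethod sizeof clojure.lang.ISeq [coll]
  (reduce + 4 (map sizeof (seq coll))))
;; lists and seqs implement both interfaces; without a preference,
;; dispatching on them throws because neither method is preferred
(prefer-method sizeof java.util.Collection clojure.lang.ISeq)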

(sizeof [1 2 [1 2]])
;=> 24

Just add methods for more things you want to count.

Remco van 't Veer

Dec 23, 2010, 2:58:47 AM
to clo...@googlegroups.com
On 2010/12/23 05:51, Robert McIntyre wrote:

> I think it would be really cool to have a function that gives the
> total number of bytes that a data structure consumes.
>
> so something like:
>
> (total-memory [1 2 [1 2]])
>
> would return however many bytes this structure is.
>
> Is there already something like this around? Would it be hard to write
> to work for any clojure data structure?

I've used the following serialization trick during development in a Java
context:

(defn total-memory [obj]
  (let [baos (java.io.ByteArrayOutputStream.)]
    (with-open [oos (java.io.ObjectOutputStream. baos)]
      (.writeObject oos obj))
    (count (.toByteArray baos))))

It is *very* inaccurate but gives some indication of size.
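
One property worth knowing, shown as a small sketch: an ObjectOutputStream writes an object it has already seen as a back-reference, so shared data is counted roughly once. The names serialized-size and big are made up for this illustration, and the number is still the size of the serialized encoding, not of the heap layout.

```clojure
(defn serialized-size [obj]
  (let [baos (java.io.ByteArrayOutputStream.)]
    (with-open [oos (java.io.ObjectOutputStream. baos)]
      (.writeObject oos obj))
    (.size baos)))

(def big (vec (range 1000)))

;; the second reference to big costs only a few bytes in the stream
(< (serialized-size [big big])
   (* 2 (serialized-size [big])))
;=> true
```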

Regards,
Remco

Mikhail Kryshen

Dec 23, 2010, 7:26:20 AM
to clo...@googlegroups.com
If you want to know how your program uses memory, try using some Java
profiling tool like VisualVM shipped with JDK.

--
Mikhail

Dale Thatcher

Dec 23, 2010, 1:49:40 AM
to Clojure
I've used the Instrumentation debugging interface to find the size of
an object in the past:

http://www.javamex.com/tutorials/memory/instrumentation.shtml

You'd need to walk the object tree and sum up the results avoiding
cycles.
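
A rough sketch of that walk, under some assumptions: walk-size and coll-children are names I made up, the per-object size function is passed in, and here it is just a constant 4 as a stand-in. With a real java.lang.instrument.Instrumentation instance (which needs a Java agent, not shown) you would pass something that calls its getObjectSize method instead. The IdentityHashMap makes sure each object is counted once, which also stops the walk on cycles.

```clojure
(defn coll-children
  "Hypothetical helper: the elements of a Clojure collection, else nil."
  [x]
  (when (coll? x) (seq x)))

(defn walk-size
  "Sums (shallow-size x) over every distinct object reachable from root
  through (children x). Objects are tracked by identity, so shared or
  cyclic references are visited only once."
  [shallow-size children root]
  (let [seen (java.util.IdentityHashMap.)]
    (loop [stack (list root), total 0]
      (if (empty? stack)
        total
        (let [x (peek stack), stack (pop stack)]
          (if (or (nil? x) (.containsKey seen x))
            (recur stack total)
            (do (.put seen x true)
                (recur (into stack (children x))
                       (+ total (shallow-size x))))))))))

(def inner (vector 1 2))

;; outer vector, inner (once), and the numbers 1 and 2: four objects
(walk-size (constantly 4) coll-children [inner inner])
;=> 16
```

This only follows collection elements; it does not see the internal nodes and arrays of the persistent structures, so a walk based on reflection over fields would be needed for real byte counts.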

many thanks,

- Dale Thatcher

Daniel Janus

Dec 23, 2010, 5:43:55 PM
to Clojure
On Dec 23, 05:51, Robert McIntyre <r...@mit.edu> wrote:

> I think it would be really cool to have a function that gives the
> total number of bytes that a data structure consumes.

Here is a tiny utility I wrote some time ago; it's not very accurate,
but might come in handy:

https://gist.github.com/417669

best,
Daniel