Garbage collection issue with DOUBLE-FLOAT vs. SINGLE-FLOAT

Daniel Borchmann

unread,

Jan 16, 2012, 1:28:36 AM1/16/12

to

Dear cll,

welcome to my very first post on this list, in which I'd like to ask for
your assistance.

I've recently participated in a seminar assignment whose main task was
to implement Gauß' algorithm for solving systems of linear equations
(i.e. LU decomposition). I did that here

http://daniel.kxpq.de/pub/seminar.lisp

(sorry, the MIME type is not set correctly) and wanted to try it out on
some data sets, i.e.

http://daniel.kxpq.de/pub/testdata
http://daniel.kxpq.de/pub/testdata-double

The first one contains only SINGLE-FLOAT numbers and works quite well.
The second one, however, containes only DOUBLE-FLOAT numbers and makes
the garbage collector work like there is no tomorrow. See also those
profiling results

http://daniel.kxpq.de/pub/profile-double-cpu
http://daniel.kxpq.de/pub/profile-double-alloc

(again sorry for the not-set MIME types...)

Can anyone explain this to me? Why cause DOUBLE-FLOATs so much memory
consumption? I'm afraid this is caused by some intermediate results,
maybe due to COERCE, but I cannot see where this happens...

Thanks,
Daniel

--
Daniel Borchmann http://daniel.kxpq.de
GPG (Mail) 04A9 66A1 A17C C91E 39B5 8722 69B0 E96D A109 8F58

Frode V. Fjeld

unread,

Jan 16, 2012, 1:49:50 AM1/16/12

to

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> Why cause DOUBLE-FLOATs so much memory consumption?

I'm guessing your lisp implementation has immediate single-floats and
heap-allocated double-floats. That would mean that almost every
(intermediate) number in the double-float version would have to be
allocated in memory (i.e. "consed") and subsequently handled by GC,
while none of this would happen with single-floats.

There are I believe two options wrt. avoiding the consing of
double-floats: Don't use double-floats, or try to restructure and add
type-declarations to your program such that the compiler can manage to
handle the double-floats without consing them. This latter option is
somewhat implementation-specific and most likely difficult.

--
Frode V. Fjeld

Daniel Borchmann

unread,

Jan 16, 2012, 2:17:21 AM1/16/12

to

"Frode V. Fjeld" <fro...@gmail.com> writes:

> Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:
>
>> Why cause DOUBLE-FLOATs so much memory consumption?
>
> I'm guessing your lisp implementation has immediate single-floats and
> heap-allocated double-floats.

How can I figure this out? I've used SBCL 1.0.55.

RG

unread,

Jan 16, 2012, 2:41:42 AM1/16/12

to

In article <ypo3obu4...@cantor.inf.tu-dresden.de>,

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> wrote:

> "Frode V. Fjeld" <fro...@gmail.com> writes:
>
> > Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:
> >
> >> Why cause DOUBLE-FLOATs so much memory consumption?
> >
> > I'm guessing your lisp implementation has immediate single-floats and
> > heap-allocated double-floats.
> How can I figure this out? I've used SBCL 1.0.55.

You can use TIME to profile the amount of memory used by a piece of
code, but it's a little tricky because you have to make sure the
compiler doesn't optimize too much, and the SBCL optimizer is pretty
good. So you have to do something like this:

* (time (dotimes (i 10000) (setf x (coerce i 'single-float))))

Evaluation took:
0.0 seconds of real time
1.03e-4 seconds of user run time
1.e-6 seconds of system run time
0 calls to %EVAL
0 page faults and
81,920 bytes consed.
NIL

* (time (dotimes (i 10000) (setf x (coerce i 'double-float))))

Evaluation took:
0.0 seconds of real time
1.25e-4 seconds of user run time
1.e-6 seconds of system run time
0 calls to %EVAL
0 page faults and
155,648 bytes consed.
NIL

Apparently both single and double floats are heap allocated, but doubles
are twice as big.

rg

Frode V. Fjeld

unread,

Jan 16, 2012, 2:46:18 AM1/16/12

to

"Frode V. Fjeld" <fro...@gmail.com> writes:

>> I'm guessing your lisp implementation has immediate single-floats and
>> heap-allocated double-floats.

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> How can I figure this out? I've used SBCL 1.0.55.

It often also depends on the machine architecture (64 or 32 bit?)
too. This information should be somewhere in the lisp's
documentation. Or ask someone who'd know.

Also, use the TIME operator (or some other memory profiler) to see the
amount of consing going on.

--
Frode V. Fjeld

Nicolas Neuss

unread,

Jan 16, 2012, 6:40:11 AM1/16/12

to

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> Dear cll,
>
> welcome to my very first post on this list, in which I'd like to ask for
> your assistance.
>
> I've recently participated in a seminar assignment whose main task was
> to implement Gauß' algorithm for solving systems of linear equations
> (i.e. LU decomposition). I did that here
>
> http://daniel.kxpq.de/pub/seminar.lisp

If you want to have SBCL generate fast code, you will have to declare
the array more prcisely, e.g. something like
(simple-array double-float (*)) (array double-float *)

> (sorry, the MIME type is not set correctly) and wanted to try it out on
> some data sets, i.e.
>
> http://daniel.kxpq.de/pub/testdata
> http://daniel.kxpq.de/pub/testdata-double

It would be more comfortable for you and us if you use something like

(defun test-gauss (size element-type)
(let ((mat (random-matrix size size element-type)))
(gauss mat)))

If you would provide a simple test case in this way, I would take a more
careful look at the code. If optimised correctly, I would expect at
most a factor 2 difference in speed between single-float and
double-float.

Nicolas

Daniel Borchmann

unread,

Jan 16, 2012, 8:33:45 AM1/16/12

to

Nicolas Neuss <last...@scipolis.de> writes:

> Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> If you want to have SBCL generate fast code, you will have to declare
> the array more prcisely, e.g. something like
> (simple-array double-float (*)) (array double-float *)

But this implies that GAUSS cannot be used for other arrays then, does
it? I'm especially interested in FIXNUMs, SINGLE-FLOAT, DOUBLE-FLOAT
and maybe intervals of various number formats.

>> (sorry, the MIME type is not set correctly) and wanted to try it out on
>> some data sets, i.e.
>>
>> http://daniel.kxpq.de/pub/testdata
>> http://daniel.kxpq.de/pub/testdata-double
>
> It would be more comfortable for you and us if you use something like
>
> (defun test-gauss (size element-type)
> (let ((mat (random-matrix size size element-type)))
> (gauss mat)))
>

Indeed, such a functions exists (TEST-GAUSS, at the vary bottom of the
file) and takes as parameter the file containing the data, i.e. you can
do

(test-gauss "testdata")

Therein, TIME is called for the call to GAUSS.

> If you would provide a simple test case in this way, I would take a more
> careful look at the code. If optimised correctly, I would expect at
> most a factor 2 difference in speed between single-float and
> double-float.

Yes, me too.

Thanks!

Daniel Borchmann

unread,

Jan 16, 2012, 8:36:46 AM1/16/12

to

"Frode V. Fjeld" <fro...@gmail.com> writes:

> "Frode V. Fjeld" <fro...@gmail.com> writes:
>
>>> I'm guessing your lisp implementation has immediate single-floats and
>>> heap-allocated double-floats.
>
> Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:
>
>> How can I figure this out? I've used SBCL 1.0.55.
>
> It often also depends on the machine architecture (64 or 32 bit?)
> too. This information should be somewhere in the lisp's
> documentation. Or ask someone who'd know.

It's 64 bit, I'm trying the code on a 32bit when I find one.

> Also, use the TIME operator (or some other memory profiler) to see the
> amount of consing going on.

I already did that, and also used SBCL statistical profiler. However, I
couldn't see from the output where consing is going on...

Thanks!
Daniel

Pascal J. Bourguignon

unread,

Jan 16, 2012, 10:14:46 AM1/16/12

to

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> "Frode V. Fjeld" <fro...@gmail.com> writes:
>
>> Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:
>>
>>> Why cause DOUBLE-FLOATs so much memory consumption?
>>
>> I'm guessing your lisp implementation has immediate single-floats and
>> heap-allocated double-floats.
> How can I figure this out? I've used SBCL 1.0.55.

Have a look at the references in the second section of:
http://cliki.net/Performance

Use DISASSEMBLE.

--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.

Tim Bradshaw

unread,

Jan 16, 2012, 2:53:31 PM1/16/12

to

On 2012-01-16 13:33:45 +0000, Daniel Borchmann said:

> But this implies that GAUSS cannot be used for other arrays then, does
> it? I'm especially interested in FIXNUMs, SINGLE-FLOAT, DOUBLE-FLOAT
> and maybe intervals of various number formats.

The problem here is that the compiler needs to generate quite different
code for the single & double float cases, so it's unlikely a generic
(using the term, erm, generically) function can do this.

However, you can do some really neat tricks here in Lisp due to the
wonders of macros: write your algorithm in terms of a (local) macro
which takes a type argument and expands to something like

(locally (declare (type type ...) ...) ...)

Then you can write, say

(typecase x
(fixnum (xbody fixnum ...))
(single-float (xbody single-float).
...
(t (xbody t))

The point being that the macro expands into identical code, but wrapped
in suitable type declarations to let the compiler do a good job of
optimizing it for the types concerned. And because the code is only
defined once, the source is not verbose. You can do similar tricks for
methods and so on if you want.

If your algorithm takes a long time but the code is small, you can go
even further: write the function at runtime with types you know *at
that point*, compile on the fly and run the compiled function, so you
write a macro which expands into something like

(funcall (compile nil (lambda (...) (declare ...) ...)) ...)

--tim

Nicolas Neuss

unread,

Jan 16, 2012, 3:31:39 PM1/16/12

to

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> Nicolas Neuss <last...@scipolis.de> writes:
>
>> Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:
>> If you want to have SBCL generate fast code, you will have to declare
>> the array more prcisely, e.g. something like
>> (simple-array double-float (*)) (array double-float *)
> But this implies that GAUSS cannot be used for other arrays then, does
> it? I'm especially interested in FIXNUMs, SINGLE-FLOAT, DOUBLE-FLOAT
> and maybe intervals of various number formats.

It can if you introduce macros and compilation at runtime in a suitable
way. There are several numerical codes which do this sort of thing,
e.g. my own code Femlisp[*] which generates and compiles code for some
BLAS methods on demand.

Nicolas

[*] Femlisp (www.femlisp.org) is surely not the easiest example and not
the most beautiful code, and, unfortunately, I also did not work much on
its public CVS repository in the recent years. But it achieves
reasonable speed compared with C, and the code there for defining a dot
product and addition on rather general matrices (single-float,
double-float, complex double-float, ...) looks as follows:

(define-blas-template dot ((x standard-matrix) (y standard-matrix))
(let ((sum (coerce 0 'element-type)))
(declare (type element-type sum))
(vec-for-each-entry (x y)
(element-m+! (element-m* (ref x) (ref y)) sum))
sum))

(define-blas-template m+! ((x standard-matrix) (y standard-matrix))
(vec-for-each-entry (x y) (element-m+! (ref x) (ref y)))
y)

I think I would write that code differently nowadays, probably by
calling a local (memoized) function depending on types, but you
(hopefully) can see that a lot is possible in CL in principle.

David Brown

unread,

Jan 16, 2012, 8:03:57 PM1/16/12

to

On a 32-bit host, yes. On a 64-bit platform, the single-float example says

0 bytes consed

David

Raffael Cavallaro

unread,

Jan 17, 2012, 2:13:36 PM1/17/12

to

On 2012-01-16 06:49:50 +0000, Frode V. Fjeld said:

> There are I believe two options wrt. avoiding the consing of
> double-floats: Don't use double-floats, or try to restructure and add
> type-declarations to your program such that the compiler can manage to
> handle the double-floats without consing them. This latter option is
> somewhat implementation-specific and most likely difficult.

FWIW, I use a set of double float macros that can convince some
implementations not to box double floats and declares references to
float arrays as (simple-array yada-float (*)) . I have them for single
floats as well, but they're less necessary in 64-bit implementations
these are often immediate anyway.

If the OP rewrites his code to use df+, df*, dfaref, etc. he may find
that it conses less.

------------ begin aref-macros.lisp ------------------------

(eval-when (:compile-toplevel :load-toplevel :execute)
(defmacro define-df-macro (function-name)
`(let* ((fsym ',function-name))
(defmacro ,(intern (string-upcase
(concatenate 'string "df"
(symbol-name function-name))))
(arg1 arg2)
`(the double-float (,fsym (the double-float ,arg1)
(the double-float ,arg2)))))))

(eval-when (:compile-toplevel :load-toplevel :execute)
(defmacro define-df-boolean-macro (function-name)
`(let* ((fsym ',function-name))
(defmacro ,(intern (string-upcase
(concatenate 'string "df"
(symbol-name function-name))))
(arg1 arg2)
`(the boolean (,fsym (the double-float ,arg1)
(the double-float ,arg2)))))))

(eval-when (:compile-toplevel :load-toplevel :execute)

(defmacro define-df-macros (&rest names)
`(progn
,@(mapcar (lambda (name) `(define-df-macro ,name))
names)))

(defmacro define-df-boolean-macros (&rest names)
`(progn
,@(mapcar (lambda (name) `(define-df-boolean-macro ,name))
names))))

(eval-when (:compile-toplevel :load-toplevel :execute)

(define-df-macros + - * / max min)

(define-df-boolean-macros < <= > >= =)

(defmacro dfaref (array index)
`(the double-float (aref (the (simple-array double-float (*)) ,array)
(the fixnum ,index))))

(defmacro dfif (condition true-branch false-branch)
`(the double-float (if ,condition (the double-float ,true-branch)
(the double-float ,false-branch))))
(defmacro dfsetf (place new-value)
`(the double-float (setf (the double-float ,place)
(the double-float ,new-value)))))

(provide 'df-macros)

(eval-when (:compile-toplevel :load-toplevel :execute)
(defmacro define-sf-macro (function-name)
`(let* ((fsym ',function-name))
(defmacro ,(intern (string-upcase
(concatenate 'string "sf"
(symbol-name function-name))))
(arg1 arg2)
`(the single-float (,fsym (the single-float ,arg1)
(the single-float ,arg2)))))))

(eval-when (:compile-toplevel :load-toplevel :execute)
(defmacro define-sf-boolean-macro (function-name)
`(let* ((fsym ',function-name))
(defmacro ,(intern (string-upcase
(concatenate 'string "sf"
(symbol-name function-name))))
(arg1 arg2)
`(the boolean (,fsym (the single-float ,arg1)
(the single-float ,arg2)))))))

(eval-when (:compile-toplevel :load-toplevel :execute)

(defmacro define-sf-macros (&rest names)
`(progn
,@(mapcar (lambda (name) `(define-sf-macro ,name))
names)))

(defmacro define-sf-boolean-macros (&rest names)
`(progn
,@(mapcar (lambda (name) `(define-sf-boolean-macro ,name))
names))))

(eval-when (:compile-toplevel :load-toplevel :execute)

(define-sf-macros + - * / max min)

(define-sf-boolean-macros < <= > >= =)

(defmacro sfaref (array index)
`(the single-float (aref (the (simple-array single-float (*)) ,array)
(the fixnum ,index))))

(defmacro sfif (condition true-branch false-branch)
`(the single-float (if ,condition (the single-float ,true-branch)
(the single-float ,false-branch))))
(defmacro sfsetf (place new-value)
`(the single-float (setf (the single-float ,place)
(the single-float ,new-value)))))

(provide 'sf-macros)

(defmacro fxaref (array index)
`(the fixnum (aref (the (simple-array fixnum (*)) ,array)
(the fixnum ,index))))

(defmacro fxsetf (place new-value)
`(the fixnum (setf (the fixnum ,place)
(the fixnum ,new-value))))

(provide 'fx-macros)

(provide 'aref-macros)

----------- end aref-macros.lisp -------------------------
--
Raffael Cavallaro

Kaz Kylheku

unread,

Jan 17, 2012, 5:14:54 PM1/17/12

to

On 2012-01-16, Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> wrote:
> The first one contains only SINGLE-FLOAT numbers and works quite well.
> The second one, however, containes only DOUBLE-FLOAT numbers and makes
> the garbage collector work like there is no tomorrow. See also those
> profiling results

Given that the double-float code conses an object every time you do some
math operation, the results is amazing. Namely that only 21% of the time spent
in GC, wow. Think about it: about 8 cycles of heap-intensive, compiled code
computing only require an additional 2 cycles of cleanup overhead. Kudos to
that SBCL.

Daniel Borchmann

unread,

Jan 17, 2012, 11:56:32 PM1/17/12

to

Tim Bradshaw <t...@tfeb.org> writes:

> On 2012-01-16 13:33:45 +0000, Daniel Borchmann said:
>
>> But this implies that GAUSS cannot be used for other arrays then, does
>> it? I'm especially interested in FIXNUMs, SINGLE-FLOAT, DOUBLE-FLOAT
>> and maybe intervals of various number formats.
>
> The problem here is that the compiler needs to generate quite
> different code for the single & double float cases, so it's unlikely a
> generic (using the term, erm, generically) function can do this.
>
> However, you can do some really neat tricks here in Lisp due to the
> wonders of macros:

I tried this and the result is quite amazing
(http://paste.lisp.org/+2Q3F). Thank you for this hint!

The implementation of the macro TT, however, does not seem that
satisfying to me. I may also extend it to dynamically compile code for
those types that have not been specified a priori.

Best,

Daniel Borchmann

unread,

Jan 17, 2012, 11:58:02 PM1/17/12

to

Nicolas Neuss <last...@scipolis.de> writes:
> [*] Femlisp (www.femlisp.org) is surely not the easiest example and not
> the most beautiful code, and, unfortunately, I also did not work much on
> its public CVS repository in the recent years. But it achieves
> reasonable speed compared with C, and the code there for defining a dot
> product and addition on rather general matrices (single-float,
> double-float, complex double-float, ...) looks as follows:
>
> (define-blas-template dot ((x standard-matrix) (y standard-matrix))
> (let ((sum (coerce 0 'element-type)))
> (declare (type element-type sum))
> (vec-for-each-entry (x y)
> (element-m+! (element-m* (ref x) (ref y)) sum))
> sum))
>
> (define-blas-template m+! ((x standard-matrix) (y standard-matrix))
> (vec-for-each-entry (x y) (element-m+! (ref x) (ref y)))
> y)
>
> I think I would write that code differently nowadays, probably by
> calling a local (memoized) function depending on types, but you
> (hopefully) can see that a lot is possible in CL in principle.

I tried femlisp.org, but only got "It works!". Do you have a direct
link to your code, I'd be very interested in the implementation of
DEFINE-BLAS-TEMPLATE.

Thanks,

Nicolas Neuss

unread,

Jan 18, 2012, 3:12:45 AM1/18/12

to

Daniel Borchmann <daniel.b...@mailbox.tu-dresden.de> writes:

> I tried femlisp.org, but only got "It works!".

I did a OS update for the server, but it should work now.

> Do you have a direct link to your code, I'd be very interested in the
> implementation of DEFINE-BLAS-TEMPLATE.

The repository is hosted at the FSF server devoted to BSD(or
similar)-licensed code, so the easiest way to get it is

cvs -z3 -d:pserver:anon...@cvs.savannah.nongnu.org:/sources/femlisp co femlisp

Documentation is in femlisp/doc and can be generated in various formats.
Please report any failures - at the moment I could invest a little time
for maybe even pushing out a new version.

> Thanks,
> Daniel

Nicolas

Tim Bradshaw

unread,

Jan 18, 2012, 4:32:55 AM1/18/12

to

On 2012-01-18 04:56:32 +0000, Daniel Borchmann said:
>>
> I tried this and the result is quite amazing
> (http://paste.lisp.org/+2Q3F). Thank you for this hint!

It's a good trick I think. My original case for it was where I had
some code which was normally fixnum but could overflow into bignums
(and back into fixnums). it looked, I think, like

(labels ((hairy-fixnum (...)
(declare ... all is fixnums ...)
...
(if (about-to-be-bignum ...)
(hairy-bignum ...)
(hairy-fixnum ...)))
(hairy-bignum (...)
(declare ... all is bignums ...)
... identical code ...))
(hairy-fixnum ...))

where the whole body came from a local macro.