libSingular much slower than Singular via pexpect??


Simon King

Oct 27, 2011, 4:01:39 AM
to sage-devel
Hi!

I just found the following:
sage: P.<x,y,z> = QQ[]
sage: def test1(x,y,z):
....:     for i in range(100):
....:         p = (x+y+z)^i
....:
sage: def test2(x,y,z):
....:     x = singular(x)
....:     y = singular(y)
....:     z = singular(z)
....:     for i in range(100):
....:         p = (x+y+z)^i
....:
sage: %time test1(x,y,z)
CPU times: user 2.66 s, sys: 0.00 s, total: 2.66 s
Wall time: 2.67 s
sage: %time test2(x,y,z)
CPU times: user 0.15 s, sys: 0.08 s, total: 0.23 s
Wall time: 1.94 s

The timings were obtained with sage-4.6.2.

So, replacing libSingular with Singular-via-pexpect saves much of the
computation time.
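
(For reference, here is what each test actually exercises -- the second
type line below is from memory, so the exact class name may differ:

sage: type(x)
<type 'sage.rings.polynomial.multi_polynomial_libsingular.MPolynomial_libsingular'>
sage: type(singular(x))
<class 'sage.interfaces.singular.SingularElement'>

So test1 works entirely with native libSingular wrappers, while test2 does
all arithmetic in a separate Singular process via pexpect.)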

Is that something to worry about?

Best regards,
Simon

Martin Albrecht

Oct 27, 2011, 5:39:41 AM
to sage-...@googlegroups.com

Yes!

I can confirm this behaviour. The problem doesn't seem to be restricted to
__pow__, i.e. replacing ^i by products also gives slower results when using
libSingular. However, mod p it seems libSingular wins. Hence my guess:
we are using different memory allocation functions for GMP, and these
functions might be slower (for Singular's use case)?
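
(A sketch of the product variant I mean -- not my exact code:

sage: def test1_mul(x,y,z):
....:     s = x + y + z
....:     p = s
....:     for i in range(100):
....:         p = p*s
....:

i.e. building the powers by repeated multiplication rather than calling ^.)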

Cheers,
Martin

--
name: Martin Albrecht
_pgp: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x8EF0DC99
_otr: 47F43D1A 5D68C36F 468BAEBA 640E8856 D7951CCF
_www: http://martinralbrecht.wordpress.com/
_jab: martinr...@jabber.ccc.de

Simon King

Oct 27, 2011, 6:11:42 AM
to sage-devel
Hi Martin,

On 27 Oct., 11:39, Martin Albrecht <martinralbre...@googlemail.com>
wrote:
> On Thursday 27 October 2011, Simon King wrote:
> > Is that something to worry about?
>
> Yes!

Good. I created trac ticket #11957.

> Hence my guess:
> we are using different memory allocation functions for GMP and these functions
> might be slower (for Singular's use case)?

Isn't Singular using a fairly uncommon memory manager? If I
remember correctly what Hans presented at the last Singular Sage Days
in Kaiserslautern, their memory management is a lot faster for what
Singular needs.

Best regards,
Simon

Burcin Erocal

Oct 27, 2011, 6:27:20 AM
to sage-...@googlegroups.com
Hi Simon,

Singular uses omalloc for memory management. This library was developed
by Olaf Bachmann specifically for Singular's use case. This blog post
has some slides about it:

http://singular-team.blogspot.com/2011/09/sicsa2011.html


I don't think we can use omalloc in Sage since it is not thread safe.
Using GMP with two different memory allocators is not an option either.
So I am not sure if this problem can be solved easily.

On the other hand, any help to make omalloc thread safe would be much
appreciated. :) Interested parties should contact
libsingular-devel@googlegroups.com.


Cheers,
Burcin

Martin Albrecht

Oct 27, 2011, 6:38:38 AM
to sage-...@googlegroups.com
On Thursday 27 October 2011, Simon King wrote:
> Hi Martin,
>
> On 27 Oct., 11:39, Martin Albrecht <martinralbre...@googlemail.com>
>
> wrote:
> > On Thursday 27 October 2011, Simon King wrote:
> > > Is that something to worry about?
> >
> > Yes!
>
> Good. I created trac ticket #11957.
>
> > Hence my guess:
> > we are using different memory allocation functions for GMP and these
> > functions might be slower (for Singular's use case)?
>
> Isn't Singular using a fairly uncommon memory management? If I
> remember correctly what Hans presented at the last Singular Sage Days
> in Kaiserslautern, their memory management is a lot faster for what
> Singular needs.

Yes, and the observed slowdown might be related to us not using it for GMP
integers.

Simon King

Oct 27, 2011, 8:56:10 AM
to sage-devel
Hi Martin, hi Burcin,

On 27 Oct., 12:38, Martin Albrecht <martinralbre...@googlemail.com>
wrote:
> > Isn't Singular using a fairly uncommon memory management? If I
> > remember correctly what Hans presented at the last Singular Sage Days
> > in Kaiserslautern, their memory management is a lot faster for what
> > Singular needs.
>
> Yes, and the observed slowdown might be related to us not using it for GMP
> integers.

OK, but that probably means we can hardly fix the regression, for the
reasons (thread safety) that Burcin mentioned.

Cheers,
Simon

Volker Braun

Oct 27, 2011, 10:34:01 AM
to sage-...@googlegroups.com
I don't understand how thread safety in libSingular is an issue; Sage runs in a single thread after all. I thought the explanation is that Singular uses a slightly wonky optimization for small integer coefficients, roughly storing them in the place that would usually hold the pointer to the GMP struct. That of course helps a lot with cache locality, but it would also certainly lead to a lot of tears if Sage were to internally use different GMP representations in different places.
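
(To illustrate the idea with a conceptual Python sketch -- this is not
Singular's actual code, which does the tagging at the C level inside the
pointer word itself:

class CoeffSlot(object):
    SMALL = 2**30  # hypothetical cutoff for "fits in the slot"
    def __init__(self, n):
        if -self.SMALL < n < self.SMALL:
            self.immediate, self.ref = n, None    # value stored in place
        else:
            self.immediate, self.ref = None, [n]  # stands in for a heap-allocated GMP integer
    def value(self):
        return self.immediate if self.ref is None else self.ref[0]

Small coefficients never touch the heap, which is where the cache-locality
win comes from.)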

Martin Albrecht

Oct 27, 2011, 10:59:21 AM
to sage-...@googlegroups.com

This doesn't make a difference for the case discussed here: libSingular uses
the same optimisation as Singular via pexpect, i.e. we translate between GMP
integers and the small in-place integers when getting/setting coefficients. So
the only difference I can think of is the memory manager. Back in the days of
Sage 1.4 (!) I compared various malloc replacements:

http://wiki.sagemath.org/MallocReplacements

Volker Braun

Oct 27, 2011, 1:26:51 PM
to sage-...@googlegroups.com
Your test2() function leaves the polynomial data inside Singular. Of course it's faster not to pull the (pretty big) coefficients out of Singular. Wouldn't the following be a much better comparison?

def test3(x,y,z): 
    x = singular(x) 
    y = singular(y) 
    z = singular(z) 
    for i in range(100): 
        p = (x+y+z)^i 
        p.sage()

The latter is awfully slow. In fact, it still hasn't finished as I write this, so forget about it :-)


Martin Albrecht

Oct 27, 2011, 1:38:34 PM
to sage-...@googlegroups.com

Sure, but it's still worrisome that a computation is slower in libSingular
than in Singular. If it's down to the different memory manager, then there
isn't much we (can/will) do about it, in which case it's at least good to
know: e.g. computing a Gröbner basis via pexpect might be faster if there
is only one common root.

Volker Braun

Oct 27, 2011, 1:57:18 PM
to sage-...@googlegroups.com
On Thursday, October 27, 2011 1:38:34 PM UTC-4, Martin Albrecht wrote:
> Sure, but it's still worrisome that a computation is slower in libSingular
> than in Singular.

Just to reiterate in case I wasn't clear: as long as the "computation in Singular" just leaves its output in a Singular variable while "libSingular" converts the output to Sage data structures, the latter is always going to be slower. A good benchmark would be e.g. computing the dimension of some ideal, where there is not much output.
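
(A sketch of the kind of benchmark I mean -- the ideal is made up and the
exact interface calls may vary:

sage: I = P.ideal([x^2 + y, y^2 + z, z^2 + x])
sage: %time I.dimension()     # libSingular
sage: Is = singular(I)        # push the ideal into the Singular process
sage: %time Is.std().dim()    # via pexpect; dim() wants a standard basis

Here the answer is a single integer, so conversion cost is negligible on
either side.)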

Also, timing finally finished:

sage: timeit('test1(*P.gens())')
5 loops, best of 3: 2.2 s per loop
sage: timeit('test2(*P.gens())')
5 loops, best of 3: 1.49 s per loop
sage: timeit('test3(*P.gens())')
5 loops, best of 3: 32.3 s per loop

William Stein

Oct 28, 2011, 1:06:41 AM
to sage-...@googlegroups.com
On Thu, Oct 27, 2011 at 12:57 PM, Volker Braun <vbrau...@gmail.com> wrote:
> On Thursday, October 27, 2011 1:38:34 PM UTC-4, Martin Albrecht wrote:
>>
>> Sure, but it's still worrisome that a computation is slower in libSingular
>> than in Singular.
>
> Just to reiterate in case I wasn't clear, as long as the "computation in
> Singular" just references a singular variable as output and "libsingular"
> converts the output to Sage datastructures the latter is always going to be
> slower.

That is not what is happening here. libSingular is definitely not
converting the output to Sage data structures. The Sage polynomials
x,y,z are simply Cython-level wrappers around a pure Singular
data structure.

sage: type(x)
<type 'sage.rings.polynomial.multi_polynomial_libsingular.MPolynomial_libsingular'>

I'm pretty surprised by this benchmark.

The timing difference on my computer is even bigger:

sage: time test1(x,y,z)
Time: CPU 4.89 s, Wall: 4.89 s
sage: time test2(x,y,z)
Time: CPU 0.15 s, Wall: 1.73 s

It really comes down to this (following exactly the code that Simon posted):

sage: time v = (x+y+z)^100
Time: CPU 0.23 s, Wall: 0.23 s
sage: xx=singular(x);yy=singular(y);zz=singular(z)
sage: time v = (xx+yy+zz)^100
Time: CPU 0.00 s, Wall: 0.06 s

Why is exponentiation dramatically faster in case 2 than in case 1? Note
that the *wall time* is what matters in both cases, by the way.
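
(One way to dig in, as a rough sketch: profile the libSingular call with
IPython's profiler,

sage: %prun v = (x+y+z)^100

though that only shows the Python/Cython side; if essentially all the time
sits in the single __pow__ call, the difference must be down in the C code
paths, e.g. the memory manager.)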

-- William


--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

Martin Albrecht

Oct 28, 2011, 4:24:12 AM
to sage-...@googlegroups.com

I think we should compare which function the Singular interpreter calls and
which one we call, and also see if we get the same performance if we switch
to omalloc (I'm not suggesting to switch to omalloc for good, just for
testing).

Nils Bruin

Oct 28, 2011, 1:33:58 PM
to sage-devel
On Oct 27, 3:27 am, Burcin Erocal <bur...@erocal.org> wrote:

> I don't think we can use omalloc in Sage since it is not thread safe.
> Using GMP with two different memory allocators is not an option either.
> So I am not sure if this problem can be solved easily.

A similar issue arises in ECL. By default, it plugs Boehm's malloc
replacements into GMP, which of course doesn't work at all with Sage.
As it turns out, the way arithmetic is set up in ECL, most results get
stored into a "register" location first and are copied into the
appropriate lisp data structures afterwards. It is not a big disaster
when these registers are not under Boehm's control, so ECL now has an
option that allows it to run with GMP using a memory manager different
from the one ECL itself uses. ECL only reluctantly reduces the size of
these registers, so usually GMP calls result in no reallocation calls
whatsoever. Only once the result is known is the lisp data structure
allocated and the GMP integer copied into it (yes, the bit strings
representing GMP integers are location independent).

If the size of the answer can be predicted beforehand, the use of a
register can even be factored out: You just allocate enough space and
call GMP with a destination pointing to that space.

If the size of the answer is not known, then it is likely a good idea
to first compute the result into a preallocated scratch register
anyway and then allocate and copy, because saving a reallocation likely
compensates for a memcpy (especially since the scratch register will
quickly gain a permanent spot in the cache!)
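
(The pattern in a purely conceptual Python sketch -- the real thing would
happen at the C level with mpz_t destinations:

scratch = bytearray(1 << 20)   # long-lived register; rarely reallocated

def compute(data):
    # stand-in for a GMP call writing a variable-size result into scratch
    n = len(data)
    scratch[:n] = data
    return n

def finalize(data):
    n = compute(data)
    return bytes(scratch[:n])  # allocate the final object once, one memcpy

i.e. all the variable-size churn hits the scratch buffer, and the final
object is allocated exactly once at its known size.)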

So, depending on how libSingular's use of GMP is organized, it might
be doable to let libSingular use omalloc and call GMP either with
destinations that are guaranteed to not need reallocation or only with
destinations that function as "registers" and can be managed by
whatever allocator GMP uses (and preferably are only reallocated
rarely).

If omalloc's use is limited to libSingular, its lack of thread safety is
likely much less of an issue.

Another point is of course the number of heaps that a typical Sage
session is starting to depend on! (PyMem_malloc, malloc,
Boehm_malloc, omalloc, pari/gp's stack ...)