Dependence of RealDoubleElement on GSL

33 views
Skip to first unread message

Matthias Koeppe

unread,
Oct 12, 2021, 4:45:58 PM10/12/21
to sage-devel
Elements of RealDoubleField (RDF) have some methods that are implemented using GSL.

Would we be able to eliminate this dependency? Some of the functions, like isnan, are available in the standard C library since C99. Do functions like gsl_sf_sin have advantages over using functions from the math library? Others, for example gsl_sf_erf, would be available through scipy.special.

$ git grep gsl src/sage/rings/real_double.pyx

src/sage/rings/real_double.pyx:from sage.libs.gsl.all cimport *
src/sage/rings/real_double.pyx:gsl_set_error_handler_off()
src/sage/rings/real_double.pyx:        return self(gsl_sf_fact(n))
src/sage/rings/real_double.pyx:        if gsl_isnan(self._value):
src/sage/rings/real_double.pyx:        if gsl_isinf(self._value):
src/sage/rings/real_double.pyx:        cdef int isinf = gsl_isinf(self._value)
src/sage/rings/real_double.pyx:        cdef bint isnan = gsl_isnan(self._value)
src/sage/rings/real_double.pyx:        if gsl_isnan(self._value):
src/sage/rings/real_double.pyx:        return gsl_isnan(self._value)
src/sage/rings/real_double.pyx:        return gsl_isinf(self._value) > 0
src/sage/rings/real_double.pyx:        return gsl_isinf(self._value) < 0
src/sage/rings/real_double.pyx:        return gsl_isinf(self._value)
src/sage/rings/real_double.pyx:        return self._new_c(sign * gsl_sf_exp(gsl_sf_log(v) * exponent))
src/sage/rings/real_double.pyx:            return self._new_c(gsl_pow_int(self._value, <int>n))
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_log(self._value) / log_of_base)
src/sage/rings/real_double.pyx:                return self._log_base(gsl_sf_log(float(base)))
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_log(self._value) * M_1_LN2)
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_log(self._value) * M_1_LN10)
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_log(self._value) * M_1_LNPI)
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_exp(self._value))
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_exp(self._value * M_LN2))
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_exp(self._value * M_LN10))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_sf_cos(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_sf_sin(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_sf_dilog(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_sf_angle_restrict_symm(self._value))
src/sage/rings/real_double.pyx:        cos = gsl_sf_cos(self._value)
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_sin(self._value) / cos)
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_hypot(self._value, float(other)))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_ldexp( gsl_sf_exp(self._value) + gsl_sf_exp(-self._value), -1)) # (e^x + e^-x)/2
src/sage/rings/real_double.pyx:        return self._new_c(gsl_ldexp( gsl_sf_expm1(self._value) - gsl_sf_expm1(-self._value), -1)) # (e^x - e^-x)/2
src/sage/rings/real_double.pyx:        return self._new_c(gsl_acosh(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_asinh(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_atanh(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_sf_erf(self._value))
src/sage/rings/real_double.pyx:        a = self._new_c(gsl_sf_gamma(self._value))
src/sage/rings/real_double.pyx:        return self._new_c(gsl_sf_zeta(self._value))
src/sage/rings/real_double.pyx:    if gsl_finite(x):
src/sage/rings/real_double.pyx:    cdef int v = gsl_isinf(x)



Dima Pasechnik

unread,
Oct 12, 2021, 6:08:51 PM10/12/21
to sage-devel
On Tue, Oct 12, 2021 at 9:46 PM Matthias Koeppe <matthia...@gmail.com> wrote:
Elements of RealDoubleField (RDF) have some methods that are implemented using GSL.

Would we be able to eliminate this dependency?

Do you mean, specifically for RDF? (gsl is used in Sage's calculus/, not only in RDF)


 
--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/398b1668-ac65-4208-81a7-7f96c1edd830n%40googlegroups.com.

Matthias Koeppe

unread,
Oct 12, 2021, 6:17:54 PM10/12/21
to sage-devel
On Tuesday, October 12, 2021 at 3:08:51 PM UTC-7 Dima Pasechnik wrote:
On Tue, Oct 12, 2021 at 9:46 PM Matthias Koeppe <matthia...@gmail.com> wrote:
Elements of RealDoubleField (RDF) have some methods that are implemented using GSL.

Would we be able to eliminate this dependency?

Do you mean, specifically for RDF? (gsl is used in Sage's calculus/, not only in RDF)


Yes, specifically for RDF; and possibly also for CDF.






 

François Bissey

unread,
Oct 12, 2021, 6:29:33 PM10/12/21
to sage-...@googlegroups.com
After looking a bit at gsl’s doc I don’t think there is any advantage to using them if we are not using the error handling and reporting of gsl (by that I mean error estimates on the results). The only interesting detail is, quoting, “consistency across platforms”. If we are not doing high precision, I don’t think we should care too much on the kind of differences we should expect from C libraries across platforms.

William Stein

unread,
Oct 12, 2021, 7:41:56 PM10/12/21
to sage-devel
Hi,

In addition to numerical correctness considerations, it is important
to also consider the impact on **performance**.

> Would we be able to eliminate this dependency? Some of the functions, like isnan, are available in the standard C library since C99. Do functions like gsl_sf_sin have advantages over using functions from the math library? Others, for example gsl_sf_erf, would be available through scipy.special.

Here is a microbenchmark to illustrate my concern, though of course
you may get different results on a different computer. It's always
possible I made a dumb mistakes (e.g., with real literals, etc.) in
doing this microbenchmark:

# computing erf using GSL: 123ns
sage: a = RDF(2.5)
sage: %timeit a.erf()
123 ns ± 4.33 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

# computing erf using scipy.special.erf: 291ns
sage: from scipy.special import erf
sage: b = float(2.5)
sage: %timeit erf(b)
291 ns ± 2.11 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# naively using scipy.special.erf to compute erf for an RDF: 2470ns
sage: %timeit RDF(erf(a))
2.47 µs ± 35.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So if somebody rewrites RDF's erf method to convert the RDF to a
float, do "Others, for example gsl_sf_erf, would be available through
scipy.special.", and convert back, it would look functionally
identical (except maybe for rounding/stability issues), but it would
be 2470/123 ~ 20 times slower. Maybe there is a way to do this more
efficiently via Cython. However, no matter what, you're not going to
get below 291ns that scipy.special.erf, so even in the best case,
switching to using scipy means 291/123 ~ 2.4 times slower...

Here's a benchmark in the "other direction", in which the math library
in Python is faster than RDF:

sage: %timeit a.sin()
99.9 ns ± 3.67 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
sage: b = float(2.5)
sage: from math import sin
sage: %timeit sin(b)
39.5 ns ± 0.0463 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/B77FB2A7-480A-4CFC-870E-3CBD05885540%40gmail.com.



--
William (http://wstein.org)

Dima Pasechnik

unread,
Oct 13, 2021, 4:37:33 AM10/13/21
to sage-devel
On Wed, Oct 13, 2021 at 12:41 AM William Stein <wst...@gmail.com> wrote:
Hi,

In addition to numerical correctness considerations, it is important
to also consider the impact on **performance**.

> Would we be able to eliminate this dependency? Some of the functions, like isnan, are available in the standard C library since C99. Do functions like gsl_sf_sin have advantages over using functions from the math library? Others, for example gsl_sf_erf, would be available through scipy.special.

Here is a microbenchmark to illustrate my concern, though of course
you may get different results on a different computer.   It's always
possible I made a dumb mistakes (e.g., with real literals, etc.) in
doing this microbenchmark:

TL; DR

from math import erf as fastest_erf # ;-)
fastest_erf(b) # the the best


# computing erf using GSL: 123ns
sage: a = RDF(2.5)
sage: %timeit a.erf()
123 ns ± 4.33 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

# computing erf using scipy.special.erf:  291ns
sage: from scipy.special import erf
sage: b = float(2.5)
sage: %timeit erf(b)
291 ns ± 2.11 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# naively using scipy.special.erf to compute erf for an RDF:  2470ns
sage: %timeit RDF(erf(a))
2.47 µs ± 35.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So if somebody rewrites RDF's erf method to convert the RDF to a
float, do "Others, for example gsl_sf_erf, would be available through
scipy.special.", and convert back, it would look functionally
identical (except maybe for rounding/stability issues), but it would
be 2470/123 ~ 20 times slower.   Maybe there is a way to do this more
efficiently via Cython.  However, no matter what, you're not going to
get below 291ns that scipy.special.erf, so even in the best case,
switching to using scipy means 291/123 ~ 2.4 times slower...

scipy.special.erf is a NumPy ufunc, with associated extra overhead (e.g. one can call it
directly on iterables, arrays, etc)

Nowadays you can call "naked"  libm.erf, in several ways; on a modern Linux I can just do

sage: from math import erf as merf
sage: merf(b)
0.999593047982555
sage: %timeit merf(b)
77.5 ns ± 0.019 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

and this beats gsl:
sage: %timeit a.erf()
177 ns ± 0.0177 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


You can also load other dylibs (in this case it's the same libm as used by Python'w math)

sage: from ctypes import *
sage: libm = CDLL("libm.so.6")
sage: myerf=libm.erf
sage: myerf.restype=c_double
sage: ca=c_double(a)
sage: myerf(ca)
0.999593047982555
sage: %timeit myerf(ca)
277 ns ± 3.07 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
sage: %timeit a.erf()
177 ns ± 0.0177 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
sage: %timeit erf(b)
519 ns ± 0.448 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

So libm.erf is somewhat slower than gsl's, but not as much as scipy's.
Seems that there is more overhead in CDLL machinery than in `merf`, for some reason.

 

Here's a benchmark in the "other direction", in which the math library
in Python is faster than RDF:

sage: %timeit a.sin()
99.9 ns ± 3.67 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
sage: b = float(2.5)
sage: from math import sin
sage: %timeit sin(b)
39.5 ns ± 0.0463 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

On Tue, Oct 12, 2021 at 3:29 PM François Bissey <frp.b...@gmail.com> wrote:
>
> After looking a bit at gsl’s doc I don’t think there is any advantage to using them if we are not using the error handling and reporting of gsl (by that I mean error estimates on the results). The only interesting detail is, quoting, “consistency across platforms”. If we are not doing high precision, I don’t think we should care too much on the kind of differences we should expect from C libraries across platforms.
>
> > On 13/10/2021, at 09:45, Matthias Koeppe <matthia...@gmail.com> wrote:
> >
> > Elements of RealDoubleField (RDF) have some methods that are implemented using GSL.
> >
> > Would we be able to eliminate this dependency? Some of the functions, like isnan, are available in the standard C library since C99. Do functions like gsl_sf_sin have advantages over using functions from the math library? Others, for example gsl_sf_erf, would be available through scipy.special.
> >
>
> --
> You received this message because you are subscribed to the Google Groups "sage-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/sage-devel/B77FB2A7-480A-4CFC-870E-3CBD05885540%40gmail.com.



--
William (http://wstein.org)

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sage-devel+...@googlegroups.com.

Matthias Koeppe

unread,
Oct 13, 2021, 12:29:31 PM10/13/21
to sage-devel
William, Dima, thanks for these first microbenchmarks.
I agree, of course, that consistency, speed and precision are all important.

It sounds like keeping GSL as an option may still be desirable. 

It may be possible to create a common base class RealDoubleElement_base and have two implementation classes in separate Cython modules -- one using GSL, the other one using only standard library and (if needed) scipy. (The one using GSL would be provided by a separate distribution that also provides other facilities of Sage that need GSL.) If the module using GSL cannot be imported, Sage would use the other one.

I have opened https://trac.sagemath.org/ticket/32677 for work in this direction. I think I'll need help by Cython experts for this.

Reply all
Reply to author
Forward
0 new messages