
New module (written in C) for using the high-precision QD library


baru...@gmail.com

Jul 30, 2015, 4:09:43 PM
Hi,

I wrote a module for wrapping the well-known high-precision QD library written by D.H. Bailey.
You can find it here: https://github.com/baruchel/qd

It is written in pure C with the CPython C-API in order to get the highest possible speed.

The QD library provides floating-point types with ~32 and ~64 decimal digits of precision (double-double and quad-double) and should be quicker at those precisions than general arbitrary-precision libraries.
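For readers unfamiliar with the approach: a double-double value is commonly represented as an unevaluated sum of two ordinary doubles, a leading part and a small correction. The following is a minimal pure-Python sketch of that idea, not the QD library's code or this module's API; the names `two_sum`, `quick_two_sum`, `dd_add` and `dd_exact` are chosen here purely for illustration.

```python
from fractions import Fraction

def two_sum(a, b):
    """Return (s, err) where s is the rounded sum and a + b == s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def quick_two_sum(a, b):
    """Like two_sum, but only valid when |a| >= |b|."""
    s = a + b
    err = b - (s - a)
    return s, err

def dd_add(x, y):
    """Add two double-double numbers given as (hi, lo) pairs of floats.

    This is the simple variant: the low parts are folded into the
    rounding error of the high parts, then the pair is renormalized.
    """
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return quick_two_sum(s, e)

def dd_exact(x):
    """Exact rational value of a (hi, lo) pair, handy for checking results."""
    return Fraction(x[0]) + Fraction(x[1])

# 1e-20 is lost when added to 1.0 in plain double precision,
# but survives in the low part of the double-double result.
total = dd_add((1.0, 0.0), (1e-20, 0.0))
```

Here `total` keeps the small term in its second component, whereas `1.0 + 1e-20` in ordinary floats simply rounds back to `1.0`.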

My ultimate goal is to implement these two types as new dtypes for NumPy arrays, but I am releasing a first version here which already provides a complete interface to the QD library.

Regards,

--
Thomas Baruchel

Stefan Behnel

Jul 31, 2015, 3:27:09 AM
to pytho...@python.org
baru...@gmail.com schrieb am 30.07.2015 um 22:09:
> It is written in pure C with the CPython C-API in order to get the highest possible speed.

This is a common fallacy. Cython should still be able to squeeze another
bit of performance out of your wrapper for you. It tends to know the C-API
better than you would think, and it does things for you that you would
never do in C. It also helps in keeping your code safer and easier to maintain.

Your C code seems to be only about 1500 lines, not too late to translate
it. That should save you a couple of hundred lines and at the same time
make it work with Python 3 (which it currently doesn't, from what I see).

Stefan


Chris Angelico

Jul 31, 2015, 3:37:36 AM
to pytho...@python.org
On Fri, Jul 31, 2015 at 5:26 PM, Stefan Behnel <stef...@behnel.de> wrote:
> Your C code seems to be only about 1500 lines, not too late to translate
> it. That should save you a couple of hundred lines and at the same time
> make it work with Python 3 (which it currently doesn't, from what I see).

I was just looking over the README (literally two minutes ago, your
message came in as I was wording up a reply), and Python 3 support
does seem to be a bit of a hole in the support.

To what extent does Cython make this easier? The biggest barrier I
would expect to see is the bytes/text distinction, where a default
quoted string has different meaning in the two versions - but like
with performance guessing, this is much more likely to be wrong than
right.

Another, but much smaller, hole in the support would be installation
via pip. I'd recommend getting the package listed on PyPI and then
testing some pip installations on different platforms - chances are
that's going to be the best way to do the builds.

All the best!

ChrisA

Stefan Behnel

Jul 31, 2015, 4:40:38 AM
to pytho...@python.org
Chris Angelico schrieb am 31.07.2015 um 09:37:
> On Fri, Jul 31, 2015 at 5:26 PM, Stefan Behnel wrote:
>> Your C code seems to be only about 1500 lines, not too late to translate
>> it. That should save you a couple of hundred lines and at the same time
>> make it work with Python 3 (which it currently doesn't, from what I see).
>
> To what extent does Cython make this easier? The biggest barrier I
> would expect to see is the bytes/text distinction

Yes, that tends to be a barrier. Cython is mostly just Python, so you can write

    if isinstance(s, unicode):
        s = (<unicode> s).encode('utf8')

and be happy with it ("<type>" is a cast in Cython). Such simple code looks
uglier when spelled out using the C-API and wouldn't be any CPU cycle faster.
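
In plain Python (without the Cython cast), the same normalization could be sketched roughly as follows. `to_utf8_bytes` is a hypothetical helper name, not something from the module under discussion; `type(u"")` is used so that the check resolves to `unicode` on Python 2 and to `str` on Python 3.

```python
# The text type: `unicode` on Python 2, `str` on Python 3.
TEXT_TYPE = type(u"")

def to_utf8_bytes(s):
    """Return s as UTF-8 encoded bytes; byte strings pass through unchanged.

    Hypothetical helper for illustration: a wrapper that hands strings
    to a C library typically wants bytes regardless of what the caller
    passed in.
    """
    if isinstance(s, TEXT_TYPE):
        return s.encode("utf8")
    return s

from_text = to_utf8_bytes(u"3.14159")
from_bytes = to_utf8_bytes(b"3.14159")
```

Both calls yield the same byte string, so the code that follows only has to deal with one type.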

But there's also the PyInt/PyLong unification, which can easily get in the
way for a number processing library. In Cython, you can write

    if isinstance(x, (int, long)):
        try:
            c_long = <long> x
        except OverflowError:
            ... # do slow conversion of large integer here
        else:
            ... # do fast conversion from c_long here

or something like that and it'll work in Py2.6 through Py3.5 because Cython
does the necessary adaptations internally for you. This code snippet
already has a substantially faster fast-path than what the OP's code does
and it will still be much easier to tune later, in case you notice that the
slow path is too slow after all.
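
To make the fast-path/slow-path idea concrete, here is a rough pure-Python 3 sketch of converting an integer into a (hi, lo) pair of floats. It is only an illustration under stated assumptions: `int_to_dd` and `EXACT_LIMIT` are made-up names, and the fast-path boundary is 2**53 (the range in which every integer is exactly representable as a double) rather than the C `long` range used in the Cython snippet above.

```python
# Assumption for this sketch: integers with |x| <= 2**53 convert to a
# double without any rounding, so they need no correction term.
EXACT_LIMIT = 2 ** 53

def int_to_dd(x):
    """Convert a Python int to a (hi, lo) pair of floats (hypothetical helper)."""
    if not isinstance(x, int):
        raise TypeError("expected an int")
    if -EXACT_LIMIT <= x <= EXACT_LIMIT:
        # Fast path: a single exact conversion, no correction needed.
        return float(x), 0.0
    # Slow path: round to the nearest double, then store what the
    # rounding dropped in the low part. float(x) raises OverflowError
    # if x is beyond the range of a double.
    hi = float(x)
    lo = float(x - int(hi))
    return hi, lo

small = int_to_dd(5)
large = int_to_dd(2 ** 64 + 1)
```

For `2**64 + 1` the high part rounds to `2**64` and the low part carries the remaining `1`, so no information is lost in this case.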

And then there are various helpful little features in the language like,
say, C arrays assigning by value, or freelists for extension types using a
decorator. The OP's code would clearly benefit from those, if only for
readability.

Python is much easier to write and maintain than C. Cython inherits that
property and expands it across C data types. And it generates C code for
you that automatically adapts to the different Python versions in various
ways, both in terms of compatibility and performance.

Stefan


baru...@gmail.com

Aug 1, 2015, 6:08:08 AM
Hi, Thank you for your answer.

Actually, this is the third version I have written for using the QD library: the first one was in Python using ctypes, the second one was in Cython, and this one is in C. I don't claim to be a Cython expert, and maybe my Cython code was not optimal, but I can say the C version is significantly quicker (and the binary is about half the size of the Cython binary).

I admit Cython is a great idea, but for this very specific project I prefer using C: behind the scenes there is no need to interact with Python, the data relies only on elementary C types, etc. I decided to migrate from Cython to C when I realized that I was merely embedding C in Cython almost all the time. Furthermore, the QD library is old, stable and simple. Once my module is ready it won't evolve much, so maintaining the project shouldn't be an issue.

I am using Python 2 with the people I am working with, and I am releasing my work as is; maybe I will rewrite it for Python 3 one day, but it is not my immediate purpose.

Yesterday I started implementing the new type as a dtype for NumPy; for the moment, I have disabled this part in the code (because it is far from being really usable), but it can already be tried out for a quick look by uncommenting one line in the code.

Best regards.

tb.