Numpy 1.6.0 beta 1 has just been released, and I was wondering what would it take to get support for it in Cython? Specifically, there's a new 16-bit floating point data type that I'd like to be able to use. A ticket for the feature in numpy is here:
But that doesn't contain links to the source change(s), just notes that "it's already done." The only other interesting link that I've been able to find is this thread:
I had to add a line to numpy's numpy/core/src/multiarray/buffer.c to get rid of "cannot include dtype 'e' in a buffer" errors. I'll submit a pull request once I'm more confident that there are no further changes needed. See: https://github.com/wickedgrey/numpy/commit/29f9f1b709cc2c346b8514859c...
The following code appears on lines 729+ of numpy/core/include/numpy/npy_common.h:
    /* half/float16 isn't a floating-point type in C */
    #define NPY_FLOAT16 NPY_HALF
    typedef npy_uint16 npy_half;
    typedef npy_half npy_float16;
Of note, Cython/Includes/numpy.pxd line 321:

    ctypedef unsigned short npy_float16
I'm not sure if there's a better way to do that, but it matches the definition of npy_uint16 this way.
However, at this point, when I try and use the data type, I get a bunch of errors of the form: "Cannot assign type 'float' to 'float16_t'" This makes sense, since float16_t is really an unsigned short. There are a number of functions declared in numpy/core/include/numpy/halffloat.h (function bodies in numpy/core/src/npymath/halffloat.c):
However, having to manually wrap every math operation with conversion functions is tedious and error-prone (esp. since the compiler will happily add two float16s as unsigned shorts and assign the result back to a float16, having botched the operation). Is there a way to add in the conversions automatically?
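To make the failure mode concrete, here is a minimal sketch in plain Python rather than Cython. It assumes a Python version whose struct module understands the 'e' (binary16) format code (3.6 or later); the helper names float_to_half_bits and half_bits_to_float are made up for this illustration and are not NumPy or Cython functions.

```python
import struct

def float_to_half_bits(x):
    """Round x to IEEE 754 binary16 and return the 16-bit pattern as an int."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def half_bits_to_float(bits):
    """Interpret a 16-bit pattern as an IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

a = float_to_half_bits(1.0)  # bit pattern 0x3C00
b = float_to_half_bits(1.0)

# What a C compiler does when it sees two unsigned shorts: integer addition.
wrong_bits = (a + b) & 0xFFFF
wrong = half_bits_to_float(wrong_bits)

# What was intended: decode, add as floats, encode again.
right_bits = float_to_half_bits(half_bits_to_float(a) + half_bits_to_float(b))
right = half_bits_to_float(right_bits)

print(hex(wrong_bits), wrong)  # 0x7800 32768.0
print(hex(right_bits), right)  # 0x4000 2.0
```

Adding the raw bit patterns of 1.0 and 1.0 bumps the exponent field instead of the value, so the "sum" decodes as 32768.0 with no warning from the compiler.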
<rober...@math.washington.edu> wrote: > On Thu, Mar 24, 2011 at 12:25 PM, Eli Stevens (Gmail) > <wickedg...@gmail.com> wrote: >> Hello,
On Fri, Mar 25, 2011 at 10:21 AM, Pauli Virtanen <p...@iki.fi> wrote:
> The buffer interface cannot be used to export the half-float types, since
> the type is not specified in PEP 3118. Numpy cannot unilaterally add
> nonstandard format codes to the spec.
> ...
> On the Cython side, you'd need to detect when you are working with Numpy
> arrays, and get the half-float type information from the Numpy dtype
> rather than from the exported buffer.
According to Pauli Virtanen, my approach of adding 'e' support to numpy's buffer code isn't acceptable. They might accept an NPY_HALF to 'H' (i.e. uint16) version, but that still means that on the Cython side additional work would have to be done to differentiate float16 from uint16 (assuming that automatic wrapping of reads and writes to float16 arrays is possible).
> How feasible would something like that be?
I think you're starting at the wrong end here: even if NumPy and Cython could communicate that the contents of an array are float16, Cython would not have the slightest idea what to do with a float16. There is no support for float16 in common C compilers AFAIK, so it is not obvious that Cython should support it.
I.e., the way to do this at the moment is
    cdef extern from "SomeWhereInNumPyIAmGuessing.h":
        cdef float half_to_single(uint16_t x)
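For readers who want to see what such a conversion routine has to do, here is a rough pure-Python sketch of decoding an IEEE 754 binary16 bit pattern. It borrows the placeholder name half_to_single from the snippet above and is not NumPy's actual implementation. Python floats are doubles, but every binary16 value is exactly representable, so the result is exact.

```python
def half_to_single(bits):
    """Decode a 16-bit IEEE 754 binary16 pattern (an int in 0..0xFFFF) to a float."""
    sign = -1.0 if (bits >> 15) & 0x1 else 1.0
    exponent = (bits >> 10) & 0x1F   # 5 exponent bits, bias 15
    mantissa = bits & 0x3FF          # 10 stored mantissa bits

    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, fixed scale of 2**-24.
        return sign * mantissa * 2.0 ** -24
    if exponent == 0x1F:
        # All-ones exponent encodes infinity (mantissa 0) or NaN.
        return sign * float("inf") if mantissa == 0 else float("nan")
    # Normal number: implicit leading 1 in front of the mantissa.
    return sign * (1.0 + mantissa / 1024.0) * 2.0 ** (exponent - 15)

print(half_to_single(0x3C00))  # 1.0
print(half_to_single(0xC000))  # -2.0
```

The reverse direction (float to half) additionally needs rounding and overflow handling, which is why reusing an existing, tested C routine is attractive.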
Sorry, I missed this part (because you top-posted), sorry about my other post. No, there isn't such a way currently. There's a reason NumPy defines float16 as unsigned short, and that is because C compilers don't have support for this. And Cython builds on C.
I'm not sure whether anything should be done, and if so, how to do it... note that NumPy is not a dependency of Cython and we try to not make it too NumPy-specific (in fact, after NumPy now supports PEP 3118, we don't need any special casing for NumPy in Cython at all, and it feels wrong to reintroduce it).
> On 03/25/2011 06:26 AM, Eli Stevens (Gmail) wrote: >> My progress so far:
Don't know whether this applies to you, but if you're memory or disk bound rather than CPU bound, then something you could try is to use Blosc to compress 32-bit floats. If you make sure to zero out the parts of the mantissa and exponent that you don't need, you should get fairly decent compression. This works better in some situations, but you must make sure to work on your data in small cache-sized blocks, so this may be even less convenient.
On Fri, Mar 25, 2011 at 12:48 PM, Dag Sverre Seljebotn
<d.s.seljeb...@astro.uio.no> wrote: > Sorry, I missed this part (because you top-posted), sorry about my other > post. No, there isn't such a way currently. There's a reason NumPy defines > float16 as unsigned short, and that is because C compilers don't have > support for this. And Cython builds on C.
I've been looking through the code, and I see the generate_buffer_setitem_code function. What I'm envisioning would be something like:
    ptrexpr = self.buffer_lookup_code(code)
    if self.buffer_type.dtype.is_pyobject:
        ...
    elif self.buffer_type.dtype.needs_float16_handling:
        if op == '':
            # Plain assignment: encode the float on the way in.
            code.putln("*%s = float_to_half(%s);" % (ptrexpr, rhs.result()))
        else:
            # In-place operator: decode the stored value (note the dereference),
            # apply the operator as float, then encode the result.
            code.putln("*%s = float_to_half(half_to_float(*%s) %s %s);" %
                       (ptrexpr, ptrexpr, op, rhs.result()))
    else:
        # Simple case
        code.putln("*%s %s= %s;" % (ptrexpr, op, rhs.result()))
Where float_to_half, etc. have the same definition as the numpy functions npy_float_to_half, etc. Obviously there are going to be better ways to express the above (like a tmp var so we don't have to get the pointer twice), and it's not the only point in the code where changes would have to be made, but I think that it outlines the general idea.
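As a standalone illustration of the temporary-variable idea, the sketch below builds the C statements as plain strings. Everything in it is hypothetical: emit_float16_setitem and the __half_ptr temporary are invented for this example and do not correspond to existing Cython internals, and float_to_half / half_to_float are the assumed helpers named above.

```python
def emit_float16_setitem(ptrexpr, op, rhs):
    """Return the C statements that store `rhs` into a float16 buffer slot.

    ptrexpr -- C expression evaluating to a pointer to the 16-bit slot
    op      -- '' for plain assignment, or an operator such as '+' for 'x[i] += v'
    rhs     -- C expression for the right-hand side, already of type float
    """
    # Evaluate the (possibly expensive) pointer expression only once.
    lines = ["unsigned short *__half_ptr = %s;" % ptrexpr]
    if op == "":
        lines.append("*__half_ptr = float_to_half(%s);" % rhs)
    else:
        lines.append(
            "*__half_ptr = float_to_half(half_to_float(*__half_ptr) %s %s);" % (op, rhs)
        )
    return lines

for line in emit_float16_setitem("buf + i", "+", "1.0f"):
    print(line)
```

A real implementation would also have to handle the read side (indexing as an rvalue) and allocate the temporary through the compiler's own machinery rather than using a fixed name.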
Would something like this ever be accepted for inclusion into cython?
> I'm not sure whether anything should be done, and if so, how to do it... > note that NumPy is not a dependency of Cython and we try to not make it too > NumPy-specific (in fact, after NumPy now supports PEP 3118, we don't need > any special casing for NumPy in Cython at all, and it feels wrong to > reintroduce it).
If PEP 3118 gets updated to include an 'e' type, I don't think we'd have to reintroduce a dependency. There would need to be some code to convert between float32 and float16 (which numpy has ATM), but since the float16 bit layout is an IEEE standard, I don't see that as something that would be numpy specific.
Cheers, Eli
PS - my application is generally CPU bound, and is having problems fitting into the 2GB memory limit on Windows XP, so compression isn't really an attractive option at this point. :(
<wickedg...@gmail.com> wrote:
> I've been looking through the code, and I see the > generate_buffer_setitem_code function. What I'm envisioning would be > something like:
> Where float_to_half, etc. have the same definition as the numpy > functions npy_float_to_half, etc. Obviously there are going to be > better ways to express the above (like a tmp var so we don't have to > get the pointer twice), and it's not the only point in the code where > changes would have to be made, but I think that it outlines the > general idea.
> Would something like this ever be accepted for inclusion into cython?
Possibly, but I think it'd be a lot of special casing (though arguably less so than supporting object and bool). How would we distinguish between ushort and float16 arrays at runtime?
>> I'm not sure whether anything should be done, and if so, how to do it... >> note that NumPy is not a dependency of Cython and we try to not make it too >> NumPy-specific (in fact, after NumPy now supports PEP 3118, we don't need >> any special casing for NumPy in Cython at all, and it feels wrong to >> reintroduce it).
> If PEP 3118 gets updated to include a 'e' type, I don't think we'd > have to reintroduce a dependency. There would need to be some code to > convert between float32 and float16 (which numpy has ATM), but since > the float16 bit layout is an IEEE standard, I don't see that as > something that would be numpy specific.
True.
Also, I've been thinking about support for numpy iterators, and can't see a way around making numpy a dependency there, but we'd like to minimize it.
> PS - my application is generally CPU bound, and is having problems > fitting into the 2GB memory limit on windows XP, so compression isn't > really an attractive option at this point. :(
That's unfortunate, especially given how cheap memory is these days. Can you break your computation up into chunks? Even using 16-bit floats would only buy you a 2x increase, and would probably make your application even more CPU bound (twiddling bits with every assignment, separate integer vs. floating point registers/pipelines, ...) which might not be entirely offset by the memory savings.
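One way to picture the chunking suggestion is to keep the bulk data packed as 16-bit values and only expand one small chunk at a time to ordinary floats for the arithmetic. The sketch below does this with the standard struct module (assuming Python 3.6 or later for the 'e' format); scale_in_chunks and its parameters are hypothetical names chosen for the example.

```python
import struct

def scale_in_chunks(packed, factor, chunk_items=4096):
    """Multiply every binary16 value stored in `packed` (bytes) by `factor`.

    Only `chunk_items` values are expanded to Python floats at any one time.
    """
    out = bytearray(len(packed))
    step = chunk_items * 2  # two bytes per half float
    for start in range(0, len(packed), step):
        chunk = packed[start:start + step]
        n = len(chunk) // 2
        fmt = "<%de" % n
        values = struct.unpack(fmt, chunk)                 # decode the chunk
        scaled = [v * factor for v in values]              # compute as floats
        out[start:start + len(chunk)] = struct.pack(fmt, *scaled)  # re-encode
    return bytes(out)

data = struct.pack("<5e", 1.0, 2.0, 3.0, 4.0, 5.0)
result = scale_in_chunks(data, 0.5, chunk_items=2)
print(struct.unpack("<5e", result))  # (0.5, 1.0, 1.5, 2.0, 2.5)
```

The decode and encode steps are exactly the per-element bit twiddling mentioned above, so this trades CPU time for memory; struct.pack also raises OverflowError if a result does not fit in binary16.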
<rober...@math.washington.edu> wrote: > Possibly, but I think it'd be a lot of special casing (though arguably > less so than supporting object and bool). how would we distinguish > between ushort and float16 arrays at runtime?
Well, there's been some positive feedback to adding float16 support to the struct module (which the buffer interface references and extends), so presumably there'd be an 'e' type for the buffer/array that could be used to signal the need for special case additional handling.
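For reference, Python's struct module did later gain an 'e' format code for IEEE 754 binary16 (available from Python 3.6). The sketch below uses it to show why a distinct format code matters: the same two bytes mean very different things depending on whether the consumer is told 'H' or 'e'. The read_item helper is a hypothetical name used only for this illustration.

```python
import struct

# Two bytes holding the half-float value 1.5 (bit pattern 0x3E00, little-endian).
raw = struct.pack("<e", 1.5)

def read_item(fmt, data):
    """Decode one item from `data` using a single struct format character."""
    (value,) = struct.unpack("<" + fmt, data)
    return value

as_half = read_item("e", raw)    # interpreted as float16
as_ushort = read_item("H", raw)  # interpreted as uint16

print(struct.calcsize("<e"))  # 2
print(as_half)                # 1.5
print(as_ushort)              # 15872
```

A consumer that only ever sees 'H' has no way to know it should decode the bits as a float, which is the runtime ambiguity raised in the question above.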
Where is the appropriate place in the source tree to add unit tests for something like this?
> Would something like this ever be accepted for inclusion into cython?
I've been thinking and am now in favour of something like this. Biggest problem is how "needs_float16_handling" would be flagged in a type -- one needs syntax for "half float" that could be used to define "np.float16_t" in numpy.pxd, and such typedefs could not be used anywhere else ("cdef np.float16_t x" would be disallowed).
Tests: Options:

a) Create a new file using a mock object (see /tests/run/bufaccess.pyx), e.g., /tests/run/halffloat.pyx. This would work everywhere.

b) Modify the docstring of /tests/run/numpy_test.pyx at runtime, conditional on the NumPy version being recent enough for the test.
Also, for a relatively obscure feature like this, docs for docs.cython.org in the same pull request would be greatly appreciated (it's in the same git repo now).
> On 03/28/2011 10:15 PM, Eli Stevens (Gmail) wrote:
>> On Fri, Mar 25, 2011 at 12:48 PM, Dag Sverre Seljebotn >> <d.s.seljeb...@astro.uio.no> wrote:
> I've been thinking and am now in favour of something like this. Biggest > problem is how "needs_float16_handling" would be flagged in a type -- one > needs syntax for "half float" that could be used to define "np.float16_t" in > numpy.pxd, and such typedefs could not be used anywhere else ("cdef > np.float16_t x" would be disallowed).
Why such restrictions? Cython could be extended with a "half float" C datatype, what's wrong with that?
-- Lisandro Dalcin --------------- CIMEC (INTEC/CONICET-UNL) Predio CONICET-Santa Fe Colectora RN 168 Km 472, Paraje El Pozo 3000 Santa Fe, Argentina Tel: +54-342-4511594 (ext 1011) Tel/Fax: +54-342-4511169
On Mon, Apr 4, 2011 at 5:41 PM, Lisandro Dalcin <dalc...@gmail.com> wrote: > Why such restrictions? Cython could be extended with a "half float" C > datatype, what's wrong with that?
Per my understanding, most compiler+platform combinations don't support a C float16 data type.
gcc+ARM does, but I suspect that's because there's hardware support for it.
On 5 April 2011 00:26, Eli Stevens (Gmail) <wickedg...@gmail.com> wrote:
> On Mon, Apr 4, 2011 at 5:41 PM, Lisandro Dalcin <dalc...@gmail.com> wrote: >> Why such restrictions? Cython could be extended with a "half float" C >> datatype, what's wrong with that?
> Per my understanding, most compiler+platform combinations don't > support a C float16 data type.
> gcc+ARM does, but I suspect that's because there's hardware support for it.
But I'm thinking about using an unsigned short behind the scenes, adding a couple of routines for coercing to and from "float", and perhaps implementing binary operators in single precision (not sure about this; after all, float16 is supposed to be a storage type).
I'm starting to read the Cython source to determine where the best place to put in hooks for this feature is. So far, the best approach that I have come up with is to subclass CoercionNode for use by the coerce_to function, etc.
On Tue, Apr 5, 2011 at 5:22 PM, Eli Stevens (Gmail)
<wickedg...@gmail.com> wrote: > I'm starting to read the Cython source to determine where the best > place to put in hooks for this feature is. So far, the best approach > that I have come up with is to subclass CoercionNode for use by the > coerce_to function, etc.
> Does that seem like a reasonable place to start?
I don't think subclassing is the way to go. Given that float16 is primarily a storage format, one option is to only support it for numpy access (i.e. encode/decode on buffer indexing, but otherwise let it be an alias for float in the source.) This would have strange semantics WRT sizeof(float16) and &float16. Alternatively "float16" wouldn't even need to be a valid data type outside of buffers, so
The other way is to let it alias int16 (is short always safe to use for that?) and add special cases to each of the arithmetic operators and coercion, though this would result in highly inefficient code for expressions like "a*x + b*y + c." Probably the best model is to look at how complex numbers are implemented, as they faced similar issues.
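The trade-off between the two models can be sketched in plain Python. The Half class below is a hypothetical stand-in for "16-bit storage plus special-cased operators": every operator decodes its operands, computes in higher precision, and re-encodes the result. It relies on the struct 'e' format (Python 3.6 or later) for the conversions and is not how Cython implements any type.

```python
import struct

class Half:
    """Value stored as a 16-bit pattern; each operator decodes, computes, re-encodes."""
    __slots__ = ("bits",)

    def __init__(self, value=0.0):
        # Encode: round to binary16 and keep only the bit pattern.
        self.bits = struct.unpack("<H", struct.pack("<e", value))[0]

    def __float__(self):
        # Decode the stored bit pattern back to a float.
        return struct.unpack("<e", struct.pack("<H", self.bits))[0]

    def __add__(self, other):
        return Half(float(self) + float(other))

    def __mul__(self, other):
        return Half(float(self) * float(other))

# Model 1: operators on the 16-bit type, rounding after every operation.
per_op = Half(2048.0) + Half(0.75) + Half(0.75) + Half(0.75)

# Model 2: decode on load, evaluate the expression as float, encode once on store.
once = Half(float(Half(2048.0)) + 3 * float(Half(0.75)))

print(float(per_op))  # 2048.0
print(float(once))    # 2050.0
```

Near 2048 the spacing between adjacent binary16 values is 2, so each intermediate sum of 2048.75 rounds back down to 2048.0. Besides paying for extra conversions at every operator, the per-operator model can therefore also give a different answer than decoding once at the buffer boundary.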
<rober...@math.washington.edu> wrote: > I don't think subclassing is the way to go. Given that float16 is > primarily a storage format, one option is to only support it for numpy > access (i.e. encode/decode on buffer indexing, but otherwise let it be > an alias for float in the source.) This would have strange semantics > WRT sizeof(float16) and &float16. Alternatively "float16" wouldn't > even need to be a valid data type outside of buffers, so
I started to see if I could just wrap buffer access, but my recollection (from a week or so ago) was that the buffer access code just returned a pointer to the appropriate place in the buffer, with no indication of how it was going to be used, making it hard to supply the appropriate packing or unpacking code. Is there an obvious way around that? I'll admit that I certainly don't grok all of cython yet. ;)
I'll also take a look at how complex numbers are handled.
I've traced through the code as best I can and added what seemed like it would be needed, but I don't really trust that I've done that well. I'm finding it difficult to desk-check the code; there are a lot of times when I run into code like (ExprNodes.py, line 2182):
    if buffer_access:
        self.indices = indices
        self.index = None
        self.type = self.base.type.dtype
And trying to track down what, exactly, self.base.type.dtype is turns out to be very difficult if (like me) you're not familiar with the code base. From the line where dtype gets set it goes through a few more function calls, then over into interpret_compiletime_options, then into the parser, etc. From there, it winds through a few more functions before ending up in p_name. From there you have to figure out what is in s.compile_time_env, etc. etc.
When I run the tests, I see a lot of things like:

=== Got errors: ===
137:40: Invalid operand types for '>>' (short; long)
185:48: Invalid operand types for '>>' (short; long)
233:13: Cannot convert 'short' to Python object
252:11: Cannot convert 'short' to Python object
263:29: Cannot convert Python object to 'short'
264:11: Cannot convert 'short' to Python object
...
i686-apple-darwin10-gcc-4.0.1: Python.framework/Versions/2.7/Python: No such file or directory
make: *** [embedded] Error 1
======================================================================
ERROR: runTest (__main__.CythonRunTestCase)
compiling (c) and running bufaccess
----------------------------------------------------------------------
Traceback (most recent call last):
  File "runtests.py", line 496, in run
    self.runCompileTest()
  File "runtests.py", line 327, in runCompileTest
    self.test_directory, self.expect_errors, self.annotate)
  File "runtests.py", line 473, in compile
    self.assertEquals(None, unexpected_error)
AssertionError: None != u"523:13: Cannot convert 'short' to Python object"
...
======================================================================
ERROR: runTest (__main__.CythonRunTestCase)
compiling (c) and running index
----------------------------------------------------------------------
Traceback (most recent call last):
  File "runtests.py", line 496, in run
    self.runCompileTest()
  File "runtests.py", line 327, in runCompileTest
    self.test_directory, self.expect_errors, self.annotate)
  File "runtests.py", line 473, in compile
    self.assertEquals(None, unexpected_error)
AssertionError: None != u'94:33: Compiler crash in AnalyseExpressionsTransform'
...
======================================================================
FAIL: numpy_test ()
Doctest: numpy_test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/doctest.py", line 2153, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for numpy_test
  File "/Users/elis/edit/work/cython/BUILD/run/c/numpy_test.so", line 190, in numpy_test

----------------------------------------------------------------------
File "/Users/elis/edit/work/cython/BUILD/run/c/numpy_test.so", line 295, in numpy_test
Failed example:
    test_dtype('H', inc1_ushort)
Exception raised:
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest numpy_test[35]>", line 1, in <module>
        test_dtype('H', inc1_ushort)
      File "numpy_test.pyx", line 342, in numpy_test.test_dtype (numpy_test.c:5451)
      File "numpy_test.pyx", line 280, in numpy_test.inc1_ushort (numpy_test.c:3110)
    ValueError: Buffer dtype mismatch, expected 'short' but got 'unsigned short'
----------------------------------------------------------------------
File "/Users/elis/edit/work/cython/BUILD/run/c/numpy_test.so", line 301, in numpy_test
Failed example:
    test_dtype('e', inc1_float16)
Expected nothing
Got:
    failed!
Obviously I've cut a fair bit of output, but I've tried to keep the unique errors.
One of the problems that's stumping me is that I only see build/run/c/unsigned.c etc. as files with an error message, so I can't figure out what the compilation problem is. Basically, I've just hit the wall, and don't know where to go from here. Anyone able to help?
First, is it possible to eliminate all those whitespace changes from the commits? They're a lot harder to follow on github when I need to manually separate the non-important whitespace changes from the real changes...
> I've traced through the code as best I can and added what seemed like > it would be needed, but I don't really trust that I've done that well. > I'm finding it difficult to desk-check the code; there are a lot of > times when I run into code like (ExprNodes.py, line 2182):
> if buffer_access: > self.indices = indices > self.index = None > self.type = self.base.type.dtype
> And trying to track down what, exactly, self.base.type.dtype is turns > out to be very difficult if (like me) you're not familiar with the > code base. From the line where dtype gets set it goes through a few
I'm not sure what you're asking...I tend to just insert code like this
It would be nice to have well-documented invariants for everything between each compilation stage, but that's just not the case. I agree that the code base is kind of hard to get into.
> more function calls, then over into interpret_compiletime_options, > then into the parser, etc. From there, it winds through a few more > functions before ending up in p_name. From there you have to figure > out what is in s.compile_time_env, etc. etc.
> Is there a better way to approach the code?
> When I run the tests, I see a lot of things like:
> Obviously I've cut a fair bit of output, but I've tried to keep the > unique errors.
> One of the problems that's stumping me is that I only see > build/run/c/unsigned.c etc. as files with an error message, so I can't > figure out what the compilation problem is. Basically, I've just hit > the wall, and don't know where to go from here. Anyone able to help?
Well, explain this to me. How is a variable supposed to be typed as half float? I see code such as
    ctypedef unsigned short npy_float16

    def inc1_float16(np.ndarray[np.float16_t] arr):
        arr[1] += 1
which wouldn't do anything to get c_halffloat_type involved. Only relevant line I see is
(0, -1, "int"): c_halffloat_type,
but I'm not sure what that line should achieve...are you trying to support
> First, is it possible to eliminate all those whitespace changes from the > commits? They're a lot harder to follow on github when I need to > manually seperate the non-important whitespace changes from the real > changes...
>> I've traced through the code as best I can and added what seemed like
>> it would be needed, but I don't really trust that I've done that well.
>> I'm finding it difficult to desk-check the code; there are a lot of
>> times when I run into code like (ExprNodes.py, line 2182):

>>     if buffer_access:
>>         self.indices = indices
>>         self.index = None
>>         self.type = self.base.type.dtype

>> And trying to track down what, exactly, self.base.type.dtype is turns
>> out to be very difficult if (like me) you're not familiar with the
>> code base. From the line where dtype gets set it goes through a few
> I'm not sure what you're asking...I tend to just insert code like this
> It would be nice with nicely documented invariants for everything
> between each compilation stage, but that's just not the case. I agree
> that the code base is kind of hard to get into.

>> more function calls, then over into interpret_compiletime_options,
>> then into the parser, etc. From there, it winds through a few more
>> functions before ending up in p_name. From there you have to figure
>> out what is in s.compile_time_env, etc. etc.

>> Is there a better way to approach the code?
> Well, explain this to me. How is a variable supposed to be typed as half > float? I see code such as
Put another way: If you get the above code to work with half-floats, you're bound to have broken our support for unsigned short, since there's nothing to separate the two in those lines of code.
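To make the point concrete, here is a minimal sketch in plain Python (not Cython, and not taken from the patch under discussion). It assumes Python 3.6 or later, where the standard `struct` module understands the 'e' (IEEE 754 half) format; the helper names `half_bits` and `bits_to_half` are hypothetical. It shows that the same 16 bits mean different things as 'H' and as 'e', so code that only sees an unsigned short will do integer arithmetic on the bit pattern:

```python
import struct

def half_bits(value):
    """Return the 16-bit pattern that stores `value` as a half float."""
    return struct.unpack("<H", struct.pack("<e", value))[0]

def bits_to_half(bits):
    """Interpret a 16-bit pattern as a half float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

one = half_bits(1.0)  # 0x3C00

# What `arr[1] += 1` does if the element is typed as unsigned short:
# the bit pattern is incremented, not the floating-point value.
wrong = bits_to_half(one + 1)

# What was intended: unpack to a float, add, then pack again.
right = bits_to_half(half_bits(bits_to_half(one) + 1.0))

print(hex(one), wrong, right)
```

The integer increment moves the value to the next representable half float instead of adding one, which is why the element type has to be distinguishable from unsigned short at compile time.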
Given the lack of a native C type for half float, and the steep learning curve of the codebase, I would make half_float an alias for float, with the only difference being an extra conversion when getting an element into/out of a buffer (and buffer unpacking of course). In-place arithmetic operations may have to be unavailable on the first pass.
- Robert
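As a rough illustration of the idea of treating half float as a float with a conversion at the buffer boundary, here is a sketch in plain Python rather than Cython. The class `HalfFloatBuffer` is hypothetical and is not part of Cython or numpy; it assumes Python 3.6+ for the 'e' format in `struct`. Storage stays 16 bits per element, while every value seen by the caller is an ordinary float:

```python
import struct

class HalfFloatBuffer:
    """Hypothetical wrapper: 16-bit half-float storage, float on access."""

    ITEM = struct.Struct("<e")  # one little-endian IEEE 754 half float

    def __init__(self, count):
        self._raw = bytearray(count * self.ITEM.size)

    def __len__(self):
        return len(self._raw) // self.ITEM.size

    def _offset(self, index):
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return index * self.ITEM.size

    def __getitem__(self, index):
        # Conversion on the way out of the buffer: half -> float.
        return self.ITEM.unpack_from(self._raw, self._offset(index))[0]

    def __setitem__(self, index, value):
        # Conversion on the way into the buffer: float -> half.
        self.ITEM.pack_into(self._raw, self._offset(index), value)

buf = HalfFloatBuffer(4)
buf[1] = 0.5
value = buf[1]        # load and unpack
buf[1] = value + 1.0  # compute as float, then pack and store
print(buf[1], len(buf))
```

All arithmetic happens on the unpacked float, so the only half-float-specific code is in the two accessors.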
On Thu, Apr 21, 2011 at 3:14 PM, Eli Stevens (Gmail)
> [...]
On Fri, Apr 22, 2011 at 12:16 AM, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> Put another way: If you get the above code to work with half-floats, you're
> bound to have broken our support for unsigned short, since there's nothing
> to separate the two in those lines of code.
The intent was to use the buffer type ('e' instead of 'h' or 'H') to flag the node(?) as needing packing / unpacking. It doesn't sound like I've accomplished that, though.
On Fri, Apr 22, 2011 at 2:19 AM, Robert Bradshaw <rober...@math.washington.edu> wrote:
> Given the lack of a native C type for half float, and the steep
> learning curve of the codebase, I would make half_float an alias for
> float, with the only difference being an extra conversion when getting
> an element into/out of a buffer (and buffer unpacking of course).
> Inplace arithmetic operations may have to be unavailable on the first
> pass.
That's what I was trying to do; however, the buffer get-item code just returns a pointer to the appropriate spot in memory, with no indication of how it's going to get used (i.e., does it need to be unpacked, or packed?). I tried to solve that with the LHS / RHS coercion stuff.
For my use case, having access to a numpy buffer full of half-floats without having to do the packing and unpacking to float32 manually on access is all I need. I agree that trying to do half-float-math isn't really worth it.
I'll see what I can do about the whitespace changes (coding standards here at work are to strip trailing whitespace, which I have my editor do automatically; I committed somewhat hastily).
On 04/22/2011 09:07 PM, Eli Stevens (Gmail) wrote:
> On Fri, Apr 22, 2011 at 12:16 AM, Dag Sverre Seljebotn
> <d.s.seljeb...@astro.uio.no> wrote:
>> Put another way: If you get the above code to work with half-floats, you're
>> bound to have broken our support for unsigned short, since there's nothing
>> to separate the two in those lines of code.

> The intent was to use the buffer type ('e' instead of 'h' or 'H') to
> flag the node(?) as needing packing / unpacking. It doesn't sound
> like I've accomplished that, though.
The buffer format string is only available at run-time -- the passed in object can return any format string in principle (and Cython raises an error if it doesn't match compile-time assumptions). However, you need to decide whether to insert the conversion code or not into C code at compile time!
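The run-time half of this can be sketched in plain Python. This is not Cython's actual implementation; the function `require_format` is hypothetical, and the error wording merely imitates the doctest failure quoted earlier in the thread. It uses the standard `array` module and `memoryview.format` to show that the format string is only known once an object is actually passed in:

```python
import array

def require_format(obj, expected):
    """Return a memoryview of obj, or raise if its run-time format differs."""
    view = memoryview(obj)
    if view.format != expected:
        raise ValueError(
            "Buffer dtype mismatch, expected %r but got %r"
            % (expected, view.format)
        )
    return view

signed = array.array("h", [1, 2, 3])
unsigned = array.array("H", [1, 2, 3])

# The element type was fixed as 'H' when this code was written ...
view = require_format(unsigned, "H")

# ... so an object with any other format can only be rejected at run time.
try:
    require_format(signed, "H")
except ValueError as exc:
    message = str(exc)
    print(message)
```

The check can accept or reject a buffer, but it cannot change what code was generated for element access, so the choice to pack and unpack has to come from the declared type.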
IOW, you need some kind of "half_float" or "short float" so that one can decide this at Cython compile time:
> On Fri, Apr 22, 2011 at 2:19 AM, Robert Bradshaw
> <rober...@math.washington.edu> wrote:
>> Given the lack of a native C type for half float, and the steep
>> learning curve of the codebase, I would make half_float an alias for
>> float, with the only difference being an extra conversion when getting
>> an element into/out of a buffer (and buffer unpacking of course).
>> Inplace arithmetic operations may have to be unavailable on the first
>> pass.

> That's what I was trying to do; however the buffer get item code just
> returns a pointer to the appropriate spot in memory, with no
> indication of how it's going to get used (ie. does it need to be
> unpacked, or packed?). I tried to solve that with the LHS / RHS
> coercion stuff.

> For my use case, having access to a numpy buffer full of half-floats
> without having to do the packing and unpacking to float32 manually on
> access is all I need. I agree that trying to do half-float-math isn't
> really worth it.

> I'll see what I can do about the whitespace changes (coding standards
> here at work are to strip trailing whitespace, which I have my editor
> do automatically; I committed somewhat hastily).
For instance, check out your master branch, make a commit stripping trailing whitespace, and then rebase your branch on top of it (git rebase master).
On Fri, Apr 22, 2011 at 12:14 PM, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> The buffer format string is only available at run-time -- the passed in
> object can return any format string in principle (and Cython raises an error
> if it doesn't match compile-time assumptions). However, you need to decide
> whether to insert the conversion code or not into C code at compile time!
Thinking about it, I see what you mean - one of those "oh, yeah, right, why didn't I see that?" kind of things. I had originally been wanting to use the information in:
numpy.ndarray[numpy.float16, ndim=3, mode="c"]
As the way to tell, but that just looks like a short to Cython, doesn't it? Would it be possible to add something like:
To add in the packing and unpacking code? That would make half-float buffers something of a second-class citizen, but I'd be able to work with that (we declare all of our numpy array types anyway).