Daniele Nicolodi, 17.04.2013 16:02:
> On 17/04/2013 15:46, Stefan Behnel wrote:
>> Daniele Nicolodi, 17.04.2013 15:32:
>>> On 17/04/2013 15:17, Stefan Behnel wrote:
>>>> True. Optimisations are generally not documented because they'd just
>>>> unnecessarily bloat the documentation with useless information. Seriously,
>>>> what does it add for users to know that Cython has its own way of
>>>> implementing a given syntax construct or a given builtin?
>>>>
>>>> range() is an exception here, because we were actively trying to get users
>>>> away from the for-from loop syntax that Pyrex introduced.
>>>
>>> The fact is that looking at the generated C code I observed that the
>>> enumerate() was not optimized as range() is. I went to the documentation
>>> looking for an explanation and I found mention of range() only. This
>>> drew me to think that enumerate() is not optimized.
>>>
>>> I agree that listing all optimizations in the documentation is of little
>>> use, but in the case where builtins are replaced with a different
>>> implementation this makes much more sense because it changes radically
>>> how the users write their code.
>>
>> It shouldn't.
>
> If it should not, why did you document range()?

Because people were using for-from instead of the normal Python way of
doing it, which is for-in-range. And they shouldn't. Same thing.

But it's hard to get bad habits out of a) the heads of users and b) the
Internet once they've escaped into the wild.

If you find a better way to express in the docs that for-in-range() is the
right way to do an integer for-loop in Python *and* Cython, pull requests
are always welcome. Note that the page you are referring to is
specifically there to describe differences between Cython and Pyrex. In
Pyrex, for-from was a necessary evil and people learned to use it. In
Cython, it's deprecated because it's a syntactic wart that's both ugly and
redundant. That's a major difference that is worth documenting, don't you
think?

enumerate() is neither ugly nor redundant nor deprecated and is documented
in the official Python docs. So why should we document it for Cython?
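For anyone who has only ever seen the for-from spelling, here is a minimal
sketch of the for-in-range loop recommended above. It is plain Python so
that it runs as-is; the function name scale_by_index is made up for this
example, and the cdef line mentioned in the comment is what I would expect
in a .pyx file, not something taken from the code in this thread.

```python
def scale_by_index(values):
    """Return a new list where each item is multiplied by its index."""
    n = len(values)
    result = [0.0] * n
    # In a .pyx file, one would typically add "cdef Py_ssize_t i" here
    # (an assumption for illustration) so the loop can become a C loop.
    # Deprecated Pyrex spelling of the same loop:
    #     for i from 0 <= i < n:
    for i in range(n):
        result[i] = values[i] * i
    return result


print(scale_by_index([1.0, 2.0, 3.0]))  # [0.0, 2.0, 6.0]
```

The loop reads the same in Python and in Cython, which is the whole
argument for preferring it over for-from.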
>>> def test():
>>>     cdef unsigned int k
>>>     cdef double element
>>>     cdef np.ndarray[np.double_t, ndim=1] v1 \
>>>         = np.empty([10, ], np.double)
>>>     cdef np.ndarray[np.double_t, ndim=1] v2 \
>>>         = np.empty([10, ], np.double)
>>>
>>>     for i, element in enumerate(v1):
>>>         v2[i] = element * i
>>
>> Works for me (after adding the missing imports). I don't get a call to
>> enumerate() in the C code. Instead, it uses a Python variable as counter
>> and adds 1 to it in each step.
>
> Indeed, there is no call to enumerate() but it does not make use of the
> optimizations for accessing the numpy arrays. I would expect the
> enumerate() loop to result in code very similar to the one produced by:
>
> for i in range(10):
>     element = v1[i]
>     v2[i] = element * i

Ah, that makes it clearer what you meant. You weren't actually talking
about enumerate() at all. What you meant was that when you iterate over a
NumPy array, the loop doesn't use efficient indexing. So the actual problem
is array iteration, not enumerate().

Removing the indirection through enumerate() makes this clearer:

    for element in v1:
        pass

Now the question is: how should Cython know that ndarray objects implement
their iteration by indexing into their buffer?
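
To illustrate why that is not something a compiler can simply assume, here
is a small plain-Python sketch. The class ReversedView and both helper
functions are hypothetical and exist only for this example; it says nothing
about how ndarray itself implements iteration. It only shows that the
iteration protocol and the indexing protocol are independent, so "iterate"
and "index from 0 to len-1" may give different results for some type.

```python
class ReversedView:
    """Hypothetical sequence: indexing runs forward, iteration backward."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        return self._data[i]

    def __iter__(self):
        # Deliberately different from indexing order.
        return reversed(self._data)


def by_iteration(seq):
    """Collect items through the iterator protocol (for ... in seq)."""
    return [x for x in seq]


def by_indexing(seq):
    """Collect items through indexing (seq[0], seq[1], ...)."""
    return [seq[i] for i in range(len(seq))]


plain = [1.0, 2.0, 3.0]
odd = ReversedView(plain)

print(by_iteration(plain) == by_indexing(plain))  # True
print(by_iteration(odd) == by_indexing(odd))      # False
```

Rewriting the first form into the second is only safe when the type is
known to iterate in index order, and that knowledge has to come from
somewhere.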
Stefan