Thanks for the report; it is indeed not currently as efficient as it
should be. Cython should perform many more optimizations here, such as
bounds-check elimination and other loop optimizations. At least this
particular problem can be fixed quite easily, so we'll try to fix it
for the release.
Generally speaking, the acquisition counting (reference counting for
these slices) should be smarter and more efficient: it should not rely
on atomics or locks, but use a more GC-like approach. In any case, the
copying overhead could be reduced by creating a new type for each
N-dimensional memoryview, which would support any N without adding
overhead for the other memoryviews. Both of these approaches are
somewhat more involved, so they will have to wait until someone is up
for the task.
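To make the "acquisition counting" idea above concrete, here is an
illustrative plain-Python sketch (the names Buffer and Slice are
hypothetical, not Cython's internal API): each slice takes one
acquisition on the underlying buffer, and the buffer can only be freed
once the last slice releases it.

```python
class Buffer:
    """Hypothetical buffer tracking how many slices hold it."""
    def __init__(self, data):
        self.data = data
        self.acquisitions = 0
        self.released = False

    def acquire(self):
        assert not self.released
        self.acquisitions += 1

    def release(self):
        assert self.acquisitions > 0, "release without matching acquire"
        self.acquisitions -= 1
        if self.acquisitions == 0:
            self.released = True  # buffer may now be freed

class Slice:
    """Hypothetical slice: holds one acquisition for its lifetime."""
    def __init__(self, buf):
        self.buf = buf
        buf.acquire()

    def close(self):
        self.buf.release()

buf = Buffer([1.0, 2.0, 3.0])
a = Slice(buf)
b = Slice(buf)           # two live slices -> two acquisitions
a.close()
assert not buf.released  # b still holds the buffer
b.close()
assert buf.released      # last release frees the buffer
```

The point of a GC-like scheme would be to avoid paying for an atomic
operation on every such acquire/release in the common single-threaded
case.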
I fixed it; I get the following results:
[1]: from 0.275564s to 0.021208s
This is even faster (0.012339s):
m.x[:] = 2.0
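The speedup pattern is the same one plain Python shows: assigning to a
whole slice replaces the per-element loop with a single bulk operation.
A rough analogue with ordinary Python lists (note that Cython
broadcasts the scalar across the memoryview slice, whereas the
plain-Python version needs an explicit sequence on the right-hand
side):

```python
n = 1000
x = [0.0] * n

# element-by-element, like `for i in range(n): m.x[i] = 1` above
for i in range(n):
    x[i] = 1.0

# bulk slice assignment, like `m.x[:] = 2.0` in the Cython snippet
x[:] = [2.0] * n

assert all(v == 2.0 for v in x)
```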
You can find the fixes in this branch:
https://github.com/markflorisson88/cython/tree/release
/* "test.pyx":17
*
* for i in range(10):
* m.func()[i]=1 # <<<<<<<<<<<<<<
*
* print a
*/
__pyx_t_4 = ((struct __pyx_vtabstruct_4test_Mem_slice *)__pyx_v_4test_m->__pyx_vtab)->func(__pyx_v_4test_m); if (unlikely(!__pyx_t_4.memview)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 17; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
__pyx_t_5 = __pyx_v_4test_i;
*((double *) ( /* dim=0 */ ((char *) (((double *) __pyx_t_4.data) + __pyx_t_5)) )) = 1.0;
__PYX_XDEC_MEMVIEW(&__pyx_t_4, 1); // here is the bug: only decreasing without increasing the references?

Best Regards,
Liu zhenhai
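For context on the suspected unbalanced decrement: under the usual
ownership convention, a function that returns a fresh memoryview hands
the caller one acquisition along with the return value, so a single
release on the temporary may in fact be balanced. A hypothetical
Python sketch of that convention (not Cython's actual implementation):

```python
class Handle:
    """Hypothetical refcounted handle, analogous to a memoryview."""
    def __init__(self):
        self.count = 0

    def incref(self):
        self.count += 1

    def decref(self):
        if self.count == 0:
            raise RuntimeError("refcount underflow: decref without incref")
        self.count -= 1

def make_slice(h):
    # Returning a new view transfers one acquisition to the caller,
    # like func() returning a memoryview in the generated C above.
    h.incref()
    return h

h = Handle()
view = make_slice(h)  # caller now owns one acquisition...
view.decref()         # ...so this single release is balanced
assert h.count == 0

underflowed = False
try:
    h.decref()        # a second release WOULD be the real bug
except RuntimeError:
    underflowed = True
assert underflowed
```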
Thanks for the report. I actually fixed that today; could you retry
from my branch?
2012/4/10 刘振海 <1989...@gmail.com>:
> Hi