On May 7, 2010, at 1:39 PM, neurino wrote:
> > All your loops are bright yellow, so I think you're trying to
> > optimize in the wrong places. For example, your very inner loops
> > consist of
> > n in grid[subslice]
>
> I know, that's why I'm using numpy: to take advantage of subslicing,
> otherwise I'd use directly standard arrays...
Even if NumPy slices were optimized, there's no getting around the
fact that you're allocating new objects here (though they do share
data memory). Then when you call .sum() on it you're allocating
another Python object. As it is, just to invoke the slice you have to
allocate a tuple and the indices as Python objects too. That is a fixed
cost per slice. When your arrays have hundreds, or millions, of items,
it is cheap compared to iterating over the array. When they contain
just a dozen or two, it is relatively expensive.
It's conceivable that the compound statement could be optimized, but
we're not at that point yet. As you mentioned, you could unroll these
manually (though it would impact the clarity of your code). This could
be a bit cleaner once we have closures and/or buffers in cdef functions.
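
A minimal sketch of what manual unrolling could look like. It uses plain nested lists as a stand-in for the NumPy grid, and the function names, the 3x3 box layout, and the sample grid are assumptions for illustration, not code from sudoku.pyx. The sliced version builds temporary containers on every call; the unrolled version only indexes individual cells, which is the form that typed indices in Cython could presumably turn into a plain C loop.

```python
# Sketch only: nested lists stand in for the NumPy array, and all names
# here are hypothetical.

# A filled 9x9 grid in which every row, column and 3x3 box holds 1..9.
GRID = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)]
        for r in range(9)]


def in_box_sliced(grid, n, r0, c0):
    """Membership test that allocates temporary lists on every call."""
    box = [row[c0:c0 + 3] for row in grid[r0:r0 + 3]]
    return any(n in row for row in box)


def in_box_unrolled(grid, n, r0, c0):
    """Same test written as explicit index loops, with no temporaries."""
    for r in range(r0, r0 + 3):
        for c in range(c0, c0 + 3):
            if grid[r][c] == n:
                return True
    return False


if __name__ == "__main__":
    # Both versions should agree for every digit and every box.
    for n in range(1, 10):
        for r0 in (0, 3, 6):
            for c0 in (0, 3, 6):
                assert in_box_sliced(GRID, n, r0, c0) == \
                    in_box_unrolled(GRID, n, r0, c0)
```

The unrolled form is longer and less readable, which is the clarity cost mentioned above; whether it pays off depends on how small the slices are and how often the test runs.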
> I always checked the html "cython -a" creates.
>
> I also tried to type grid and candidates as np.ndarray[np.int_t,
> ndim=2] and np.ndarray[np.int_t, ndim=3] but get this error:
>
> E:\pysu>cython -a sudoku.pyx
>
> Error converting Pyrex file to C:
> ------------------------------------------------------------
> ...
> self.tries = 0
> self.solved = self.__solve(self.grid)
>
> #@cython.cdivision(True)
> #@cython.boundscheck(False)
> cdef np.ndarray[DTYPE_t, ndim=2] __solve(self,
> np.ndarray[DTYPE_t, ndim=2] grid):
> ^
> ------------------------------------------------------------
>
> E:\pysu\sudoku.pyx:24:75: Expected ']'
You can't use buffer types in cdef functions (yet); just make it a def
function, since it's not called that often relative to everything else
that's going on. The same restriction applies to class members.
>
> and also later:
>
> 'ndarray' is not a type identifier
>
> I made a commit with the code that generates errors, if you can try
> it and help me to understand what's wrong.
>
> You wrote that I typed variables but did not take advantage of it;
> please explain how to do that, since I'm quite sure I did not
> understand that either.
Because the array type was undeclared, all your indexing operations
were still going through Python. BTW, just running your original .py
file through Cython, with no changes, gave me a 5% speedup on my
machine. I added a couple of types for a 25% speedup:
Python 0.20190050602
Cython 0.151217198372
http://sage.math.washington.edu/home/robertwb/cython/pysu.0.0.1/sudoku.diff
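
As a quick check of the arithmetic, the two timings above can be compared directly. This is only a sketch: the variable names are made up, and the units are assumed to be seconds since the message does not state them.

```python
# Timings copied from the message; units are assumed to be seconds.
python_time = 0.20190050602
cython_time = 0.151217198372

# Fraction of the original run time that the typed version removes.
time_saved = 1 - cython_time / python_time

# How many times faster the typed version runs.
speed_ratio = python_time / cython_time

print(f"run time reduced by {time_saved:.1%}")
print(f"speed ratio {speed_ratio:.2f}x")
```

The "25%" figure corresponds to the reduction in run time; a 10x speedup would need the speed ratio to reach 10.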
There's no getting a 10x speedup without avoiding all that slow
slicing though.
- Robert