I ran into this rather strange error. Thank you in advance.
Error compiling Cython file:
------------------------------------------------------------
...
cdef double sum = 0.0
for i in prange(N, nogil=1):
    if x[i] < 0.5:
        sum += x[i]
    else:
        sum -= x[i]
            ^
------------------------------------------------------------
m2.pyx:29:16: Reduction operator '-' is inconsistent with previous reduction operator '+'
best wishes,
Ting
Yes, this is expected, because mixing reduction operators is not allowed
(something we inherit from OpenMP). Please write
sum += -x[i].
Perhaps in the future we will do this transform automatically. (Same
with * vs. /. Mixing + and * or similar would obviously not be
parallelizable.)
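The rewrite works because subtracting a value is the same as adding its negation, so a single '+' reduction suffices. A quick plain-Python check of that identity (sequential, just to illustrate the algebra; the data is made up):

```python
# Mixing += and -= on one accumulator is equivalent to using
# += only, with negated terms -- which is what prange's single
# '+' reduction requires.
x = [0.1, 0.7, 0.3, 0.9, 0.45]

mixed = 0.0
for v in x:
    if v < 0.5:
        mixed += v
    else:
        mixed -= v

plus_only = 0.0
for v in x:
    if v < 0.5:
        plus_only += v
    else:
        plus_only += -v  # the form the '+' reduction accepts

assert mixed == plus_only
```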
Dag Sverre
Does Cython actually infer what should be an OpenMP reduction variable?
By the way, is there any way to make Cython compile OpenMP pragmas into
the code?
Things like this:
#pragma omp barrier

#pragma omp critical
{
    <suite>
}

#pragma omp atomic

#pragma omp single
{
    <suite>
}

#pragma omp master
{
    <suite>
}
For example (in Cython code):
with cython.parallel.atomic():
    a += 1
with cython.parallel.critical():
    pass
cython.parallel.barrier()
How does Cython decide what is to be used as shared, private,
firstprivate, lastprivate, copyin, and so on?
What about an OpenMP flush?
#pragma omp flush(varname)
cython.parallel.flush(varname)
Sturla
This place has some documentation on that:
http://docs.cython.org/src/userguide/parallelism.html
Basically the rules are as follows:
- an in-place operator makes the variable a reduction
- assignment makes it lastprivate
If these cases don't apply, then the variable will be shared. The
index variables are always lastprivate.
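As a sketch of those two rules in Cython (names are illustrative, not from the thread):

```cython
from cython.parallel import prange

def rules_demo(double[:] x):
    cdef int i, last = 0
    cdef double total = 0.0
    for i in prange(x.shape[0], nogil=True):
        total += x[i]   # in-place operator: 'total' becomes a '+' reduction
        last = i        # plain assignment: 'last' becomes lastprivate
    # 'i' itself is always lastprivate
    return total, last
```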
flush, single, critical, atomic and barrier are not supported.
However, the code to trap exits (break, continue, return, propagating
exceptions) and dispatch afterwards is there, so this wouldn't be hard
to support. We could also do sections. Feel free to implement it; the
code is well-documented (look for Nodes.ParallelRangeNode). Otherwise
if nobody does it I'll implement it when I find the time.
If you desperately want it right now you might do something like cdef
extern from *: int OMP_CRITICAL "#pragma omp critical\n{"; int
OMP_CRITICAL_END "}" . Haven't tried it but it looks like a bit of a
disaster to me :) For critical you could also just use a Python or
OpenMP lock manually.
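Spelled out, the (untested, as noted above) hack would look roughly like the following; the names OMP_CRITICAL and OMP_CRITICAL_END are just labels for the verbatim C text:

```cython
cdef extern from *:
    int OMP_CRITICAL "#pragma omp critical\n{"
    int OMP_CRITICAL_END "}"

# Referencing the dummy ints would splice the raw pragma text into the
# generated C in expression context -- fragile, hence "a bit of a disaster".
```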
Just some words of the logic behind this: When using prange, you declare
that all iterations can be run out of order. So *any* data-dependency
between iterations is disallowed. Assigning a variable within a prange
loop body is only for computing things within that iteration, not for
communication.
NOTE: Very unlike OpenMP, Cython complains if it can spot a
cross-iteration dependency, and just to be sure, it initializes all
variables that are assigned to in the loop body to invalid values at the
beginning of each iteration (NaN for floating point).
From that point of view, we don't have "shared or private" variables --
the semantics is to pretend to be a sequential loop, just to do it faster.
For "advanced users" there's the parallel block. E.g., if you want to
count how many iterations each thread does, you can do
cdef int itcount
cdef int *p_itcount
with parallel():
    itcount = 0  # thread-private, read-only in loop
    p_itcount = &itcount
    for i in prange(n):
        p_itcount[0] += 1
    with gil:
        print "Ran for ", itcount, "iterations"
Same with shared variables: use a pointer. Whether explicit syntax for
shared and proper thread-private variables should be introduced is an
open question, but I think pointers work well and are easier to spot.
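A sketch of the pointer approach for a shared variable (illustrative names; note the write through the pointer is a plain shared write, so it would be racy without an atomic or a lock):

```cython
from cython.parallel import prange

def count_shared(int n):
    # 'total' is declared outside the loop; writing through the pointer
    # avoids the reduction/lastprivate inference, making it truly shared.
    cdef int total = 0
    cdef int *p_total = &total
    cdef int i
    for i in prange(n, nogil=True):
        p_total[0] += 1   # racy without synchronization -- illustration only
    return total
```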
Dag Sverre
On this meta-note: we did think about going completely the other
direction and simply providing a way for users to emit (OpenMP)
pragmas directly. Of course there were the obvious issues, such as
variable-name mangling and block delimiters, but at a higher level
simply providing the tools to "make Cython emit the right C code" just
didn't feel like a clean solution. Instead we opted for a higher-level
construct that naturally covers the majority of use cases. The fact
that we use OpenMP is more of an implementation detail than how one
should think about using it.
- Robert
Definitely the right call if you ask me as a cython-user. With very
few changes to my original code I achieved a major speed-up. Plus, for
quite some time MATLAB users have been taunting me with MATLAB's parfor
loop, but not anymore ;).
-Thomas