Converting to Python objects with nogil (inside prange for loop)


Rishu Garg

Jul 15, 2022, 12:07:19 AM
to cython-users
Hey everyone,

I am trying to parallelize a for loop, and am using with nogil, parallel().

Snippet of my code -- 

cdef npfloat_t[:] buildSigma(
  npfloat_t[:,:] &DATA[0],
  npfloat_t[:] &BESTdistances[0],
  bint useAbsolute):

  cdef int N = DATA.shape[0]
  cdef npfloat_t[:] referencePoints

  # ... some code ...

  cdef float cost

  with nogil, parallel():
    for i in prange(N):
      for j in prange(km.BatchSize):
        cost = 3  # TODO: add the cached function here.
        if km.useAbsolute:
          sample[j] = cost
        else:
          sample[j] = cost if cost < BESTdistances[referencePoints[j]] else BESTdistances[referencePoints[j]]

      updated_sigma[i] = np.std(sample)  # ERROR HERE
  return updated_sigma

Error: updated_sigma[i] = np.std(sample)
                               ^
------------------------------------------------------------

Converting to Python object not allowed without gil

Can anyone help me find a workaround? I would appreciate it a lot!

Thanks,
Rishu

da-woods

Jul 15, 2022, 2:44:07 AM
to cython...@googlegroups.com

> with nogil, parallel():
>   for i in prange(N):
>     for j in prange(km.BatchSize):

You usually only want one loop in a set of nested loops to be prange.
Typically the outer loop, but in this case it might be easier to
parallelize the inner loop.

> Error: updated_sigma[i] = np.std(sample)
>                                ^
> ------------------------------------------------------------
>
> Converting to Python object not allowed without gil

You've potentially got 2 problems here:

* I can't see a definition of updated_sigma so it's possible that
Cython's just treating it as a regular Python object (which you can't
use in a nogil block)

* np.std is a Python function call. It can't go in a nogil block.

Your options are:

1. parallelize the inner loop only instead and keep np.std outside the
nogil block. I don't know if there is enough work in the inner loop for
that to be worthwhile.

2. write your own code for the standard deviation skipping the call to
np.std

3. Put the call to np.std inside a `with gil` block. It's fairly likely
that enough of the work is inside np.std that this will make the
parallelization pointless, but it's sometimes a useful approach when you
have a small section that needs the GIL.

4. Drop the parallelization and write the code using the GIL.
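
Option 2 could look roughly like the sketch below. It is plain Python rather than Cython, and `population_std` is a hypothetical helper name that does not appear in the thread. It computes the population standard deviation (what np.std returns by default, i.e. ddof=0) in two passes using only scalar arithmetic. The assumption is that the same loops, with C-typed variables and `sqrt` from libc.math, could be moved into a function that runs without the GIL.

```python
import math


def population_std(values):
    """Two-pass population standard deviation (same definition as np.std with ddof=0)."""
    n = len(values)
    if n == 0:
        return float("nan")  # no data: there is nothing to average

    # Pass 1: arithmetic mean
    total = 0.0
    for v in values:
        total += v
    mean = total / n

    # Pass 2: mean of the squared deviations from the mean
    sq = 0.0
    for v in values:
        d = v - mean
        sq += d * d
    return math.sqrt(sq / n)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(population_std(sample))  # 2.0
```

Note that in a real prange loop each thread would also need its own `sample` buffer; that part is not shown here.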

Stefan Behnel

Jul 15, 2022, 3:45:43 AM
to Cython-devel, Cython-users
Hi,

nested prange loops seem to be a common gotcha for users. I can't say if
there is ever a reason to do this, but at least I can't think of any. For
me, this sounds like we should turn it into a compile time error – unless
someone can think of a use case? Even in that case, I'd still emit a
warning since it seems so unlikely to be intended.

Please reply to the cython-users list to facilitate user feedback.

Stefan

da-woods

Jul 15, 2022, 4:07:00 AM
to cython...@googlegroups.com
Hi Stefan,

It is possible to do useful things with multiple loops, although
probably not the way that Cython writes it currently.

Modern versions of OpenMP support a "collapse" option, which runs the
parallelization over several loops. I think that's useful if you're
iterating element-by-element over a 2x100 array (for example):
parallelizing just the 2-iteration loop won't be very good, while
parallelizing all 200 iterations will work well.

I suspect Cython doesn't write the loops "cleanly" enough for it to
usefully use that though. And it isn't what happens when you do
"parallel-in-parallel" - you need to add the option explicitly on the
outer loop.
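
To illustrate what collapsing does to the iteration space, here is a plain-Python sketch (neither Cython nor OpenMP itself). The two nested loops over a 2x100 array become one flat loop of 200 iterations, and the (i, j) pair is recovered from the flat index. The name `collapsed_indices` is made up for this example.

```python
def collapsed_indices(n_rows, n_cols):
    """Yield (i, j) pairs from one flat loop of n_rows * n_cols iterations."""
    for flat in range(n_rows * n_cols):
        # divmod recovers the row and column from the flat index
        i, j = divmod(flat, n_cols)
        yield i, j


pairs = list(collapsed_indices(2, 100))
print(len(pairs))           # 200
print(pairs[0], pairs[-1])  # (0, 0) (1, 99)
```

A single flat loop of 200 iterations is something a runtime can divide evenly among threads, which the 2-iteration outer loop alone is not.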

So yes, I agree - it should at least be a warning. I don't think it does
much harm though so probably doesn't need to be an error.

David

David Menéndez Hurtado

Jul 15, 2022, 4:37:05 AM
to cython...@googlegroups.com


On Fri, 15 Jul 2022, 09:45 Stefan Behnel, <stef...@behnel.de> wrote:
> Hi,
>
> nested prange loops seem to be a common gotcha for users. I can't say if
> there is ever a reason to do this, but at least I can't think of any. For
> me, this sounds like we should turn it into a compile time error – unless
> someone can think of a use case? Even in that case, I'd still emit a
> warning since it seems so unlikely to be intended.

One use case is a nested loop with very differently sized jobs. I want to have as many jobs waiting in parallel as possible to limit dead time.

Another use case is arrays with many small-ish dimensions that aren't multiples of the number of cores. Parallelising along any one of them leaves idle cores.
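
A back-of-the-envelope calculation for the second use case might look like this. It assumes equally sized jobs that run in rounds of one job per core, and the numbers (a 5x3 array on 4 cores) are made up for illustration. `utilization` is a hypothetical helper name.

```python
import math


def utilization(n_jobs, n_cores):
    """Fraction of core time spent working when jobs run in rounds of n_cores."""
    rounds = math.ceil(n_jobs / n_cores)
    return n_jobs / (rounds * n_cores)


print(utilization(5, 4))      # outer loop only (5 jobs): 0.625
print(utilization(5 * 3, 4))  # both loops flattened (15 jobs): 0.9375
```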

I don't know if prange is the right tool for the job, though.

/David 


> Please reply to the cython-users list to facilitate user feedback.
>
> Stefan




Rishu Garg

Jul 18, 2022, 1:24:41 AM
to cython-users
Thanks, everyone, for the responses!

Jérôme Kieffer

Aug 20, 2022, 8:39:37 AM
to cython...@googlegroups.com
On Fri, 15 Jul 2022 09:45:37 +0200
Stefan Behnel <stef...@behnel.de> wrote:

> Hi,
>
> nested prange loops seem to be a common gotcha for users. I can't say if
> there is ever a reason to do this, but at least I can't think of any. For
> me, this sounds like we should turn it into a compile time error – unless
> someone can think of a use case? Even in that case, I'd still emit a
> warning since it seems so unlikely to be intended.
>
> Please reply to the cython-users list to facilitate user feedback.
>
> Stefan

Hi Stefan,

I always believed OpenMP was clever enough to parallelize only the outer loop.

I ran into problems when parallelizing calls to BLAS, which may or may
not be parallel itself (depending on the implementation).
I noticed that a nogil section parallelized with a (Python)
thread pool was, most of the time, just as efficient, and sometimes much faster!
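
A minimal sketch of the thread-pool structure, using only the standard library. All names are hypothetical. `row_std` is a pure-Python stand-in for a compiled function that releases the GIL internally (for example a Cython function with a nogil section). As written in pure Python on a standard GIL build, the threads would not actually run in parallel, so this shows only the shape of the approach.

```python
from concurrent.futures import ThreadPoolExecutor
import statistics


def row_std(row):
    # Stand-in for a compiled function that releases the GIL internally.
    return statistics.pstdev(row)


def std_per_row(rows, max_workers=4):
    """Compute one result per row, dispatching rows to a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the inputs
        return list(pool.map(row_std, rows))


rows = [[1.0, 1.0, 1.0], [0.0, 2.0], [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]]
print(std_per_row(rows))  # [0.0, 1.0, 2.0]
```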

Cheers,

Jerome