Can we remove the FastGIL implementation?

Stefan Behnel

Sep 19, 2023, 2:57:03 AM
to Cython-users, Cython-devel
Hi,

I've seen reports that Cython's "FastGIL" implementation (which basically
keeps the GIL state in a thread-local variable) is no longer faster than
CPython's plain GIL implementation in recent Python 3.x versions, and is
potentially even slower. See the report in

https://github.com/cython/cython/issues/5703

It would be helpful to get user feedback on this.

If you have GIL-heavy Cython code, especially with nested
with-nogil/with-gil sections across functions, and a benchmark that
exercises it, could you please run the benchmark with and without the
feature enabled and report the results?
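
For anyone without a ready-made benchmark, a minimal timing harness could look like the sketch below. It is only an illustration: `time_benchmark` and `placeholder_workload` are hypothetical names, and the placeholder stands in for a call into your own compiled Cython module.

```python
import time


def time_benchmark(func, repeats=5):
    """Call func() `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def placeholder_workload():
    # Stand-in for a call into your compiled Cython module (hypothetical);
    # replace it with the function that enters and leaves nogil/gil sections.
    return sum(range(100_000))


if __name__ == "__main__":
    print("Running the test...")
    print("took", time_benchmark(placeholder_workload))
```

Taking the fastest of several repeats reduces noise from other processes, which matters when the difference being measured is small.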

You can add "-DCYTHON_FAST_GIL=0" to your CFLAGS to disable it (and "=1"
to enable it explicitly). It's enabled by default in CPython 3.6-3.11 (but
disabled in Cython 0.29.x on Python 3.11).
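
One way to script the two builds is to prepare an environment with the right CFLAGS and hand it to whatever build command you normally use. The helper below is a sketch under that assumption; `env_with_fast_gil` is a hypothetical name, and only the `-DCYTHON_FAST_GIL` flag itself comes from this message.

```python
import os


def env_with_fast_gil(enabled, base_env=None):
    """Return a copy of the environment whose CFLAGS sets CYTHON_FAST_GIL to 0 or 1."""
    env = dict(os.environ if base_env is None else base_env)
    flag = "-DCYTHON_FAST_GIL=%d" % (1 if enabled else 0)
    # Drop any earlier setting of the macro so the new one is unambiguous.
    kept = [f for f in env.get("CFLAGS", "").split()
            if not f.startswith("-DCYTHON_FAST_GIL")]
    env["CFLAGS"] = " ".join(kept + [flag])
    return env


if __name__ == "__main__":
    for enabled in (False, True):
        print("CFLAGS=" + env_with_fast_gil(enabled)["CFLAGS"])
```

The returned dict can be passed as `env=` to `subprocess.run` together with your own build command, so both variants are built from the same sources.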

Thanks,
Stefan

da-woods

Sep 19, 2023, 3:38:19 PM
to cython...@python.org, cython...@googlegroups.com
I think the detail that was missing is that you need to add the `#cython: fast_gil = True` directive to enable it.

For me:
Python 3.9 and 3.10 are basically identical (on master)

**test_gil_already_held**
with fast_gil
Running the test...
took 0.175062894821167
without
Running the test...
took 0.10976791381835938

**test_gil_released**
with fast_gil
Running the test...
took 0.583066463470459
without
Running the test...
took 0.5824759006500244

test_gil_already_held is noticeably faster with fast_gil.

For Python 3.11:
I get the crash in 0.29.x if I try to run using fast_gil. No defines are needed to get that...
On master:

**test_gil_already_held**
with fast_gil
Running the test...
took 0.17254948616027832
without
Running the test...
took 0.10958600044250488

**test_gil_released**
with fast_gil
Running the test...
took 0.5791811943054199
without
Running the test...
took 0.5597968101501465

Note that "without fastgil" is now as fast as "fastgil" used to be, and fastgil is now slower. This is reproducible.

On Python 3.12 on master they're identical by default (which makes sense since I think we disable it). Defining -DCYTHON_FAST_GIL brings us back to roughly the same as 3.11 (i.e. now slower).

So my conclusion is that from 3.11 onwards Python sped up their own GIL handling to about the same as we used to have, and fastgil has turned into a pessimization.
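
To put a number on the Python 3.11 result, the timings reported above for master can be turned into with/without ratios. This is just arithmetic on the figures already posted; the `slowdown` helper is a hypothetical name.

```python
# Timings (seconds) copied from the Python 3.11 / master results above.
timings = {
    "test_gil_already_held": {"with": 0.17254948616027832, "without": 0.10958600044250488},
    "test_gil_released": {"with": 0.5791811943054199, "without": 0.5597968101501465},
}


def slowdown(with_fast_gil, without_fast_gil):
    """Ratio > 1 means the run with fast_gil took longer than the run without it."""
    return with_fast_gil / without_fast_gil


ratios = {name: slowdown(t["with"], t["without"]) for name, t in timings.items()}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: with/without = {ratio:.2f}")
```

On these figures the already-held case takes roughly 1.6 times as long with fast_gil, while the released case differs by only a few percent.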

David




On 19/09/2023 11:58, Lisandro Dalcin wrote:
Disclaimer: I may be doing something wrong, I did not put a lot of effort into it. 
With the microbenchmark that was offered in the GH issue, I see little difference.
Use the attached zip file to reproduce yourself. 
Change tox.ini to "cython<3" to try 0.29.x. 
BTW, in the 0.29.x case, I see no compilation error as claimed in the GH issue.

$ ./run.sh
CFLAGS=-g0 -Ofast -DCYTHON_FAST_GIL=0
Running test_gil_already_held ... took 0.08735537528991699
Running test_gil_released     ... took 0.6329536437988281
py37: OK ✔ in 3.57 seconds
Running test_gil_already_held ... took 0.09007453918457031
Running test_gil_released     ... took 0.4598276615142822
py38: OK ✔ in 3.19 seconds
Running test_gil_already_held ... took 0.10935306549072266
Running test_gil_released     ... took 0.4512367248535156
py39: OK ✔ in 3.25 seconds
Running test_gil_already_held ... took 0.09970474243164062
Running test_gil_released     ... took 0.46637773513793945
py310: OK ✔ in 3.21 seconds
Running test_gil_already_held ... took 0.08569073677062988
Running test_gil_released     ... took 0.46811795234680176
py311: OK ✔ in 3.22 seconds
Running test_gil_already_held ... took 0.15221118927001953
Running test_gil_released     ... took 0.2246694564819336
  py37: OK (3.57 seconds)
  py38: OK (3.19 seconds)
  py39: OK (3.25 seconds)
  py310: OK (3.21 seconds)
  py311: OK (3.22 seconds)
  pypy3.9: OK (5.24 seconds)
  congratulations :) (21.71 seconds)
CFLAGS=-g0 -Ofast -DCYTHON_FAST_GIL=1
Running test_gil_already_held ... took 0.08835673332214355
Running test_gil_released     ... took 0.6265637874603271
py37: OK ✔ in 1.42 seconds
Running test_gil_already_held ... took 0.09030938148498535
Running test_gil_released     ... took 0.456279993057251
py38: OK ✔ in 1.17 seconds
Running test_gil_already_held ... took 0.10986089706420898
Running test_gil_released     ... took 0.45894527435302734
py39: OK ✔ in 1.2 seconds
Running test_gil_already_held ... took 0.10107588768005371
Running test_gil_released     ... took 0.5052204132080078
py310: OK ✔ in 1.21 seconds
Running test_gil_already_held ... took 0.08566665649414062
Running test_gil_released     ... took 0.4581136703491211
py311: OK ✔ in 1.13 seconds
Running test_gil_already_held ... took 0.15286779403686523
Running test_gil_released     ... took 0.22533607482910156
  py37: OK (1.42 seconds)
  py38: OK (1.17 seconds)
  py39: OK (1.20 seconds)
  py310: OK (1.21 seconds)
  py311: OK (1.13 seconds)
  pypy3.9: OK (1.64 seconds)
  congratulations :) (7.81 seconds)


_______________________________________________
cython-devel mailing list
cython...@python.org
https://mail.python.org/mailman/listinfo/cython-devel


--
Lisandro Dalcin
============
Senior Research Scientist
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/


da-woods

Sep 19, 2023, 3:53:21 PM
to cython...@googlegroups.com, cython...@python.org
One more detail: on 0.29.x it becomes a pessimization in Python 3.10 rather than Python 3.11. So in conclusion:

        | Python <3.10       | Python 3.10        | Python 3.11       | Python 3.12b2
--------------------------------------------------------------------------------------------------------------
0.29.x  | fast_gil is better | fast_gil is worse  | fast_gil crashes  | fast_gil crashes
master  | fast_gil is better | fast_gil is better | fast_gil is worse | fast_gil is worse (but off by default)


Stefan Behnel

Sep 20, 2023, 5:27:43 AM
to cython...@python.org, Cython-users
da-woods wrote on 19.09.23 at 21:38:
> I think the detail that was missing is you need to add the `#cython:
> fast_gil = True` to enable it.
> [...]
> So my conclusion is that from 3.11 onwards Python sped up their own GIL
> handling to about the same as we used to have, and fastgil has turned into
> a pessimization.

I tried the benchmark with the master branch on my side again, this time
with correct configuration. :)

Turns out that enabling the FastGIL feature makes it much slower for me (on
Ubuntu Linux 20.04) in both Py3.8 and 3.10:

"""
* Python 3.10 (-DCYTHON_FAST_GIL=0)
Running the test (already held)...
took 1.2482502460479736
Running the test (released)...
took 6.444956541061401
Running the test (already held)...
took 1.2358744144439697
Running the test (released)...
took 6.4064109325408936

* Python 3.10 (-DCYTHON_FAST_GIL=1)
Running the test (already held)...
took 2.243091583251953
Running the test (released)...
took 7.32707667350769
Running the test (already held)...
took 2.4065449237823486
Running the test (released)...
took 7.50264573097229
"""

I also tried it with PGO enabled and got more or less the same result. The
Python installations that I tried it with were both PGO builds.

It's probably mixed across platforms, different configurations and C
compilers. I looked through the "What's new" document for Py3.10 and 3.11
but couldn't find mentions of GIL improvements. Just that some other things
have become faster.

So – disable the feature in Python 3.11 and later? (Currently it's disabled
in 3.12+.)

Disabling it only in Py3.11+ would suggest that we keep the code in Cython 3.1,
since that will still support older Python versions that seem to benefit from it.
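
Expressed as a rule, the proposal amounts to a simple version cutoff. The sketch below only illustrates that rule in Python; the real switch is the CYTHON_FAST_GIL C macro discussed in this thread, and `fast_gil_default` is a hypothetical name.

```python
import sys


def fast_gil_default(version_info=None):
    """Proposed default from this thread: on before Python 3.11, off from 3.11 onwards."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level and release stage are ignored.
    return tuple(version_info[:2]) < (3, 11)


if __name__ == "__main__":
    for version in [(3, 8), (3, 10), (3, 11), (3, 12)]:
        state = "on" if fast_gil_default(version) else "off"
        print("Python %d.%d: fast_gil %s by default" % (version + (state,)))
```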

Stefan

da-woods

Sep 20, 2023, 2:40:22 PM
to cython...@googlegroups.com, cython...@python.org
On 20/09/2023 10:27, Stefan Behnel wrote:
> So – disable the feature in Python 3.11 and later? (Currently it's
> disabled in 3.12+.)
>
That seems sensible.

I think the other question is 0.29.x. On Python 3.11+ it silently
produces code that crashes at runtime. We should probably disable it
there (at least if there is another 0.29.x release).

David

