After upgrading we encountered several errors when running the application
in gunicorn with the gevent worker class.
Summary of errors:
- gevent._socketcommon.cancel_wait_ex: [Errno 9] File descriptor was
closed in another greenlet
- gevent.exceptions.ConcurrentObjectUseError: This socket is already used
by another greenlet: <bound method Waiter.switch of
<gevent._gevent_c_waiter.Waiter object at 0x7f8e77b00a40>>
- OSError: [Errno 9] Bad file descriptor
These errors seem to be related to either the Django backend
implementation or Pymemcache not handling multi-threading/thread-safety
properly.
There is a related bug for the Pymemcache library where a member of that
team states that is up the application using Pymemcache to handle thread-
safety
(https://github.com/pinterest/pymemcache/issues/195#issuecomment-452523524).
In this case the Django framework. This comment in Django's
BaseMemcachedCache implementation indicates that that was the original
intent:
https://github.com/django/django/blob/main/django/core/cache/backends/memcached.py#L38
so i think PyMemcacheCache should handle this.
Example project that reproduces the error:
https://github.com/mvanderblom/django-memcached-bugreport
For us, this error prevents us from using the PyMemcacheCache backend and
thus from upgrading to Django 4.1 when it gets released.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* Attachment "example.log" added.
Log file that contains the log from a run of the example project. The log
file contains the full stacktraces for this error.
* cc: Andrew Godwin, Nick Pope (added)
Comment:
Thanks for the report. This can be also an issue with
`asgiref.local.Local`. Can you confirm that this issue still exists with
`asgiref==3.4.1` and on the Django's `main` branch?
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:1>
* type: Bug => Cleanup/optimization
* stage: Unreviewed => Accepted
Old description:
New description:
For thread-safety when using `pymemcache` the option `'use_pooling': True`
can be passed via `OPTIONS` which will make `pymemcache.HashClient` use
`pymemcache.PooledClient` instead of `pymemcache.Client` internally.
Some documentation should be added or improved.
----
--
Comment:
So, yes, `pymemcache.Client` is not thread-safe. We are using
`pymemcache.HashClient` so that we can support connections to multiple
servers.
I note that `pymemcache.PooledClient` is thread-safe according to the
[https://pymemcache.readthedocs.io/en/latest/getting_started.html?highlight=thread#using-a
-client-pool documentation]. We can pass the `use_pooling` flag to
`HashClient`. Unfortunately `pymemcache`'s documentation is a little
sparse!
If I add `'OPTIONS': {'use_pooling': True}` to your `CACHES` configuration
in your reproducer the problem goes away for me.
Would you be prepared to open a PR with a tweak to the documentation? I
already mentioned `use_pooling` in at the end of the
[https://docs.djangoproject.com/en/3.2/topics/cache/#cache-arguments cache
arguments] section, so maybe we just need to amend the sentence before?
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:2>
* Attachment "use_pooling_true.log" added.
Logfile of a run with use_pooling set to true
Comment (by Martijn van der Blom):
Thanks for the quick reply! I've tried your suggested solution, but
although les frequent (3 vs 31 times), the error still occurs.
I've verified that the config actually causes the HashClient to use a
PooledClient and it does.
I've updated the example in the git repo and attached a log to this issue.
Do you guys have any clue as to what's happening here or any hints about
what to do next?
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:3>
Comment (by Nick Pope):
That's interesting - I'll have to have another try later.
I don't know a great deal about this... Perhaps the issue is that
`PooledClient` is using `threading.Lock`. See
[https://github.com/pinterest/pymemcache/blob/b6d83f89fb93dfe7a006b470af487f113739cc56/pymemcache/pool.py#L32-L35
here].
As mentioned
[https://github.com/gevent/gevent/issues/1381#issuecomment-478707377 here]
some monkey patching needs to be done with `gevent.monkey.patch_all()`,
but it seems that is
[https://github.com/benoitc/gunicorn/blob/f145e90e322256f1c4e1eea8e175d2e188ceab43/gunicorn/workers/ggevent.py#L38
already handled] by gunicorn for the gevent worker.
Perhaps you can try passing `lock_generator` in `OPTIONS` with a different
lock type, e.g. see https://www.gevent.org/api/gevent.lock.html
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:4>
Comment (by Martijn van der Blom):
I've tried passing gevent.lock.BoundedSemaphore as a lock_generator, but
unfortunately that didn't make any difference. As you say, the gevent
worker already calls monkey.patch_all() that should cause all usages of
threading.[R]Lock to use the gevent variant. I'm still not sure how to
proceed. Any further help would be much appreciated.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:5>
Comment (by Martijn van der Blom):
Hi Nick, did you get a chance to look at this issue again? I'm still in
the dark as to how to resolve this issue.
For now this prevents me from using the suggested alternative to
MemcachedCache and when time comes, it will prevent me (and probably
others) to update to Django 4.1.
If you want me to investigate something, please let me know.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:6>
* type: Cleanup/optimization => Bug
Comment:
I really don't think this issue can be resolved with only changes to the
docs.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:7>
Comment (by Iuri de Silvio):
I'm not sure it is related, but I found CacheMiddleware is not thread-safe
and provided a solution in #33252.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:8>
Comment (by Martijn van der Blom):
I can still reproduce the issue in Django 3.2.10 so it doesn't seem to be
fixed by #33252
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:9>
Comment (by Nick Pope):
Replying to [comment:9 Martijn van der Blom]:
> I can still reproduce the issue in Django 3.2.10 so it doesn't seem to
be fixed by #33252
That fix will not be backported to the 3.2.x series according to
[https://docs.djangoproject.com/en/4.0/internals/release-process
/#supported-versions-policy our backport policy].
It would be handy if you can run your test against the master branch to
see if #33252 resolves the issue.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:10>
Comment (by GiacomoP):
Will this be fixed in future releases of Django 4? I'm encountering the
exact same issue with Django + mod_wsgi
Django==4.0
asgiref==3.4.1
pymemcache==3.5.0
Setting the "use_pooling" option helped, but didn't get rid of all the
errors.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:11>
Comment (by Aaron Mader):
I've set up a simple project to reproduce this error
[https://github.com/aaronmader/testing_33092 here].
From my testing: This issue still occurs on django==4.0.7, but appears to
be resolved on django==4.1.0. Perhaps this is the first version to include
the fix from #33252? (I'm not sure how to tell which release(s) contains
the fix made for that ticket).
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:12>
Comment (by aseem-md):
Another issue I recently ran into is that
`pymemcache.base.client.PooledClient.gets` does not have the same
signature as `pymemcache.base.client.Client.gets`. The latter accepts
`default` as a keyword argument whereas the former does not.
When combined with the recommend configuration ignoring exceptions by
setting `ignore_exc: True` you can run into a situation where calling
`gets` there is an error that gets swallowed despite the value being in
the cache and you get nothing back. This took a while to track down and
debug.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:13>
> From my testing: This issue still occurs on django==4.0.7, but appears
to be resolved on django==4.1.0. Perhaps this is the first version to
include the fix from #33252?
The #33252 merge commit is 3ff7b15bb79f2ee5b7af245c55ae14546243bb77. It is
tagged from 4.1a1.
--
Ticket URL: <https://code.djangoproject.com/ticket/33092#comment:14>