[Django] #9400: flock causes problems when writing to an NFS share

Django

Oct 20, 2008, 8:36:59 AM
to djang...@holovaty.com, django-...@googlegroups.com
#9400: flock causes problems when writing to an NFS share
---------------------------+------------------------------------------------
Reporter: mikeh | Owner: nobody
Status: new | Milestone:
Component: Uncategorized | Version: 1.0
Keywords: | Stage: Unreviewed
Has_patch: 0 |
---------------------------+------------------------------------------------
Hi,

This seems to be the same behaviour as reported in #8403, but since that
ticket has been closed with a request not to reopen it, here's a new
ticket.

We have a media directory mounted over NFS. Our system is RHEL 5.2, Python
2.4, Django 1.0. Saving a file through the standard FileField mechanism
(no custom storage backends, just the out-of-the-box Django setup) results
in the following:

{{{

  File "/usr/lib64/python2.4/site-packages/mod_python/apache.py", line 299, in HandlerDispatch
    result = object(req)
  File "/usr/lib/python2.4/site-packages/django/core/handlers/modpython.py", line 222, in handler
    return ModPythonHandler()(req)
  File "/usr/lib/python2.4/site-packages/django/core/handlers/modpython.py", line 195, in __call__
    response = self.get_response(request)
  File "/usr/lib/python2.4/site-packages/django/core/handlers/base.py", line 128, in get_response
    return self.handle_uncaught_exception(request, resolver, exc_info)
  File "./../apps/dave_common/__init__.py", line 20, in new
  File "/usr/lib/python2.4/site-packages/django/core/handlers/base.py", line 86, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python2.4/site-packages/django/contrib/admin/sites.py", line 158, in root
    return self.model_page(request, *url.split('/', 2))
  File "/usr/lib/python2.4/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
    response = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.4/site-packages/django/contrib/admin/sites.py", line 177, in model_page
    return admin_obj(request, rest_of_url)
  File "/usr/lib/python2.4/site-packages/django/contrib/admin/options.py", line 191, in __call__
    return self.add_view(request)
  File "/usr/lib/python2.4/site-packages/django/db/transaction.py", line 238, in _commit_on_success
    res = func(*args, **kw)
  File "/usr/lib/python2.4/site-packages/django/contrib/admin/options.py", line 492, in add_view
    new_object = self.save_form(request, form, change=False)
  File "/usr/lib/python2.4/site-packages/django/contrib/admin/options.py", line 370, in save_form
    return form.save(commit=False)
  File "/usr/lib/python2.4/site-packages/django/forms/models.py", line 302, in save
    return save_instance(self, self.instance, self._meta.fields, fail_message, commit)
  File "/usr/lib/python2.4/site-packages/django/forms/models.py", line 47, in save_instance
    f.save_form_data(instance, cleaned_data[f.name])
  File "/usr/lib/python2.4/site-packages/django/db/models/fields/files.py", line 192, in save_form_data
    getattr(instance, self.name).save(data.name, data, save=False)
  File "/usr/lib/python2.4/site-packages/django/db/models/fields/files.py", line 217, in save
    super(ImageFieldFile, self).save(name, content, save)
  File "/usr/lib/python2.4/site-packages/django/db/models/fields/files.py", line 74, in save
    self._name = self.storage.save(name, content)
  File "/usr/lib/python2.4/site-packages/django/core/files/storage.py", line 45, in save
    name = self._save(name, content)
  File "/usr/lib/python2.4/site-packages/django/core/files/storage.py", line 159, in _save
    locks.lock(fd, locks.LOCK_EX)
  File "/usr/lib/python2.4/site-packages/django/core/files/locks.py", line 57, in lock
    fcntl.lockf(fd(file), flags)
IOError: [Errno 37] No locks available
}}}

The default with RHEL5.2 is NFSv3, and that's what we're using.

Cheers,

Mike

--
Ticket URL: <http://code.djangoproject.com/ticket/9400>
Django <http://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Oct 23, 2008, 10:59:48 PM
to djang...@holovaty.com, django-...@googlegroups.com
#9400: flock causes problems when writing to an NFS share
------------------------------------+---------------------------------------
Reporter: mikeh | Owner: nobody
Status: new | Milestone:
Component: Uncategorized | Version: 1.0
Resolution: | Keywords:
Stage: Unreviewed | Has_patch: 0
Needs_docs: 0 | Needs_tests: 0
Needs_better_patch: 0 |
------------------------------------+---------------------------------------
Changes (by mtredinnick):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0

Comment:

So we're in an impossible situation here then. `lockf()` doesn't work
everywhere, `flock()` doesn't work everywhere. And there's no way to know
which one works.

Since `lockf()` -- the way Django currently does things -- is the
recommended approach to doing portable locking and it should work with NFS
(I made sure and read the Python source before making the change), I'm
inclined to leave the current behaviour in place until a more robust
solution emerges.

Thus, we'll need more information and investigation from you on this one.
For example, does changing the `lockf()` call to `flock()` also fail? Do
you have `statd` running on the server (so that locking is available --
since that was one of the problems in a Debian case, for example)? What
information can you track down about why one version works somewhere and
the other version works (if it does) on other NFS servers? What's the
differentiating feature?

Sorry to push the research back in your direction, but right now Django's
doing the best it can as far as following recommended practices and the
current code certainly avoided the problems that were reported earlier.
Yours is the first case that's been reported of it not working on a
reliable NFS setup with the current code, so you have the (only?) failing
test case and will need to work out what's going on. I'm far beyond being
able to guess.
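
One way to gather the information asked for above is to try both calls against the affected mount. The following is a minimal sketch, assuming Python 3 on a Unix system; `probe_locks` is a hypothetical helper and not part of Django. It takes and releases an exclusive lock on a scratch file with `fcntl.lockf()` and again with `fcntl.flock()`, and reports the errno name for whichever call fails.

```python
import errno
import fcntl
import os
import tempfile


def probe_locks(directory):
    """Try lockf() and flock() on scratch files created in `directory`.

    Returns a dict mapping the call name to None on success, or to the
    errno name (for example "ENOLCK") when the call fails.
    """
    results = {}
    for name, lock_call in (("lockf", fcntl.lockf), ("flock", fcntl.flock)):
        # A fresh file per call keeps the two lock types from interacting.
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            try:
                lock_call(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                lock_call(fd, fcntl.LOCK_UN)
                results[name] = None
            except OSError as exc:  # IOError is an alias of OSError on Python 3
                results[name] = errno.errorcode.get(exc.errno, str(exc.errno))
        finally:
            os.close(fd)
            os.unlink(path)
    return results
```

Calling `probe_locks("/path/to/nfs/media")` (the path is a placeholder for your own mount point) on each web server, and again on a local directory, should show whether the failure is specific to the mount and whether it affects one call or both.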

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:1>

Django

Nov 13, 2008, 4:54:21 AM
to djang...@holovaty.com, django-...@googlegroups.com
Comment (by rndblnch):

Replying to [comment:1 mtredinnick]:
> So we're in an impossible situation here then. `lockf()` doesn't work
> everywhere, `flock()` doesn't work everywhere. And there's no way to know
> which one works.
>
> Since `lockf()` -- the way Django currently does things -- is the
> recommended approach to doing portable locking and it should work with NFS
> (I made sure and read the Python source before making the change), I'm
> inclined to leave the current behaviour in place until a more robust
> solution emerges.
>
> Thus, we'll need more information and investigation from you on this
> one. For example, does changing the `lockf()` call to `flock()` also fail?
> Do you have `statd` running on the server (so that locking is available --
> since that was one of the problems in a Debian case, for example)? What
> information can you track down about why one version works somewhere and
> the other version works (if it does) on other NFS servers? What's the
> differentiating feature?
>
> Sorry to push the research back in your direction, but right now
> Django's doing the best it can as far as following recommended practices
> and the current code certainly avoided the problems that were reported
> earlier. Yours is the first case that's been reported of it not working on
> a reliable NFS setup with the current code, so you have the (only?)
> failing test case and will need to work out what's going on. I'm far
> beyond being able to guess.

#9433 points out a similar problem (although on AFP mounts).
The patch it provides
(<http://code.djangoproject.com/attachment/ticket/9433/not_supported_locks.diff>)
may be adapted to also handle the
"{{{IOError: [Errno 37] No locks available}}}" error.
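
The general idea of tolerating "locking not available" errors can be sketched as follows. This is an illustration only, assuming Python 3; it is not the #9433 patch, `lock_if_supported` is a hypothetical name, and the set of errno values treated as "unsupported" is an assumption.

```python
import errno
import fcntl

# Errno values treated as "this filesystem cannot lock". Which codes belong
# here is an assumption for illustration, not taken from the #9433 patch.
UNSUPPORTED_LOCK_ERRNOS = frozenset(
    code
    for code in (
        errno.ENOLCK,  # errno 37 on Linux, "No locks available"
        getattr(errno, "ENOTSUP", None),
        getattr(errno, "EOPNOTSUPP", None),
    )
    if code is not None
)


def lock_if_supported(fd, flags, lock_call=fcntl.lockf):
    """Take a lock; return False instead of raising when locking is unavailable.

    Any other error (for example EACCES on a contended non-blocking lock)
    is re-raised unchanged.
    """
    try:
        lock_call(fd, flags)
    except OSError as exc:
        if exc.errno in UNSUPPORTED_LOCK_ERRNOS:
            return False
        raise
    return True
```

The trade-off is that a `False` return means the write proceeds with no mutual exclusion at all, so this hides a misconfigured lock service rather than fixing it.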

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:2>

Django

Feb 26, 2009, 1:46:52 PM
to djang...@holovaty.com, django-...@googlegroups.com
#9400: flock causes problems when writing to an NFS share
---------------------------------------------+------------------------------
Reporter: mikeh | Owner: nobody
Status: new | Milestone:
Component: Uncategorized | Version: 1.0
Resolution: | Keywords:
Stage: Design decision needed | Has_patch: 0
Needs_docs: 0 | Needs_tests: 0
Needs_better_patch: 0 |
---------------------------------------------+------------------------------
Changes (by jacob):

* stage: Unreviewed => Design decision needed

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:3>

Django

May 6, 2009, 11:45:22 PM
to djang...@holovaty.com, django-...@googlegroups.com
#9400: flock causes problems when writing to an NFS share
---------------------------------------------+------------------------------
Reporter: mikeh | Owner: nobody
Status: new | Milestone:
Component: File uploads/storage | Version: 1.0
Resolution: | Keywords:
Stage: Design decision needed | Has_patch: 0
Needs_docs: 0 | Needs_tests: 0
Needs_better_patch: 0 |
---------------------------------------------+------------------------------
Changes (by thejaswi_puthraya):

* component: Uncategorized => File uploads/storage

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:4>

Django

Jan 10, 2010, 1:26:40 PM
to djang...@holovaty.com, django-...@googlegroups.com
Comment (by worksology):

We are experiencing this same issue in our production environment, which
uses NFS. I believe this started once we upgraded to Django 1.1, so we
will likely roll back to Django 1.0 to avoid these fatal errors. Is there a
possible stop-gap (patch) that could avoid this error without reverting to
1.0? We'll be happy to be a second test case to help design a proper
solution to this problem.

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:5>

Django

Jan 10, 2010, 3:08:20 PM
to djang...@holovaty.com, django-...@googlegroups.com
Comment (by kmtracey):

Replying to [comment:5 worksology]:
> We are experiencing this same issue in our production environment, which
> uses NFS. I believe this started once we upgraded to Django 1.1, so we
> will likely roll back to Django 1.0 to avoid these fatal errors. Is there a
> possible stop-gap (patch) that could avoid this error without reverting to
> 1.0? We'll be happy to be a second test case to help design a proper
> solution to this problem.

There is no stopgap patch because, so far as I can see, no one with a
failing system has answered Malcolm's questions in
http://code.djangoproject.com/ticket/9400#comment:1. That comment lays
out some things you could try and things you should check (i.e., that
locking is in fact available on this filesystem). Without further
information from people who actually experience this error, there is not
much that Django can do to fix it.

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:6>

Django

Jan 22, 2010, 12:42:17 PM
to djang...@holovaty.com, django-...@googlegroups.com
Comment (by worksology):

Some more information for debugging:

Our environment uses clustered NFS (nfs-utils-1.0.6-93.EL4), mounted as
NFS version 3 with these options:
rsize=32768,wsize=32768,tcp,nfsvers=3,hard,intr

I've patched our Django install to use flock() and it works again.

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:7>

Django

Sep 3, 2010, 11:34:07 AM
to djang...@holovaty.com, django-...@googlegroups.com
Comment (by dougvanhorn):

I was just bitten by this issue (Error 37). However, it was caused by the
__NFS 3 Client__ not having a running nfslock service
(`$ sudo /sbin/service nfslock start`).

My system and NFS:

Red Hat Enterprise Linux Server release 5.4 (Tikanga)
Linux 2.6.18-164.15.1.el5 #1 SMP Mon Mar 1 10:56:08 EST 2010 x86_64 x86_64 x86_64 GNU/Linux
nfs-utils.x86_64 1:1.0.9-42.el5


As an FYI, NFS 2 and NFS 3 require the third party locking service,
whereas NFS 4 has locking built into the protocol.
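
To see which case applies to a given machine, the mount table can be inspected. The sketch below assumes Python 3 and a Linux-style `/proc/mounts` layout; `nfs_mounts` is a hypothetical helper, and the exact spelling of the version option (`vers=` or `nfsvers=`) may differ between kernels, so treat a missing version as "unknown" rather than as proof of anything.

```python
def nfs_mounts(mounts_text):
    """Yield (mount_point, fstype, version) for NFS entries in mount-table text.

    `mounts_text` is expected in /proc/mounts format: device, mount point,
    filesystem type, comma-separated options, then two numeric fields.
    `version` is None when the options do not state one.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        version = None
        for option in fields[3].split(","):
            key, _, value = option.partition("=")
            if key in ("vers", "nfsvers") and value:
                version = value
        if version is None and fields[2] == "nfs4":
            version = "4"
        yield fields[1], fields[2], version
```

For example, `list(nfs_mounts(open("/proc/mounts").read()))` lists each NFS mount with its version. A version of 2 or 3 means the separate lock services have to be running on the client as well as the server.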


I'll attach a small script which tests the locking behavior directly, so
you can run the script while testing your NFS configuration. It's a cut
and paste of the locks behavior as of 1.2.1.

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:8>

Django

Nov 24, 2010, 11:56:39 AM
to djang...@holovaty.com, django-...@googlegroups.com
Comment (by worksology):

It appears the root of our problem with lockf() is that one of our
machines was not running rpc.statd. Just posting in case this helps
anyone else with the NFS file-locking problem.
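
A quick way to check for this on each machine is to look at what is registered with the portmapper. The sketch below assumes Python 3 and the tabular output of `rpcinfo -p`, where rpc.statd normally appears as the `status` service and the NFS lock manager as `nlockmgr`; the function names are hypothetical.

```python
def registered_rpc_services(rpcinfo_text):
    """Return the set of service names listed in `rpcinfo -p` style output."""
    names = set()
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Data rows look like: program, version, protocol, port, service name.
        if len(fields) == 5 and fields[0].isdigit():
            names.add(fields[4])
    return names


def missing_lock_services(rpcinfo_text, required=("status", "nlockmgr")):
    """Return the required service names that are absent from the output."""
    present = registered_rpc_services(rpcinfo_text)
    return [name for name in required if name not in present]
```

Feeding it the text from `subprocess.check_output(["rpcinfo", "-p"]).decode()` on the client and on the server gives a list of missing services; an empty list means both appear to be registered.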

--
Ticket URL: <http://code.djangoproject.com/ticket/9400#comment:9>