--
Ticket URL: <https://code.djangoproject.com/ticket/18855>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* owner: nobody => dlamotte
* needs_docs: => 0
* status: new => assigned
* needs_tests: => 0
* needs_better_patch: => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:1>
* has_patch: 0 => 1
Comment:
Sent pull request: https://github.com/django/django/pull/304
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:2>
* version: 1.4 => master
* stage: Unreviewed => Accepted
Comment:
It seems this could only be done on Unix platforms (at least socket.fromfd
is documented to be available only on Unix). The best option for platform
support is to write the code in a way where the code works when needed
socket abilities are available, but if they are not, then we still get a
working server, just without socket persistence capabilities.
If I understand the patch correctly, the socket is first created in the
main process. Then, it is passed to the server subprocess as os.environ
parameter. Finally the server subprocess will try to use the already
existing socket by using socket.fromfd().
So, in addition to the above platform problem, at least these possible
limitations come to mind:
- Could we leak the socket's fileno in to os.environ where it is
accidentally reused later on? Could we pass it to the server subprocess
some other way?
- It seems we will need to have socket creation logic in Django's code.
We don't have that now. This might cause us some headaches later on.
- Could the socket be left in a state where it contains cruft from the
old process (this might be a stupid question, I don't know socket
semantics too well)? What I mean here is that maybe the server process
reads half of an request, gets reloaded, and then the new process has a
broken half-request waiting for it? The same could happen for write-to-
client, too, if I am not mistaken.
I would like to get rid of the unavailability of the dev server during
reload, no question about that, but we should not commit this unless we
have pretty good guarantees that this will cause zero regressions on any
supported platforms. This is not worth risking breakages. I am marking
this accepted, but this is only on the grounds that we can make this
bulletproof.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:3>
Comment (by dlamotte):
Thanks for the review. I've updated the pull request to address the
platform issue.
I dont know the platforms supported by Django, but I added a check to the
code to test if "fromfd" exists in the socket module. If it does, it will
use the new persist socket code. If not, the code will not be used. I
believe its the case that if "fromfd" is not supported, it will not exist,
however, I don't have a host to test this on.
Yes, the socket needs to be first created in the main process that
survives the reload process. This way, the socket stays open and
accepting connections.
* Not sure what you mean by leak? The child process that gets reloaded
would be the only process that could get the FD. The `spawnve` called in
the autoreloader ensures that the child's FDs are identical to that of its
parent, but besides this, its an entirely new process (it calls exec or
something similar for the platform). This means that no other process
state is saved from the parent. We can either pass the FD number via the
environment OR via the command line. It appears the autoreload module
uses an environment variable "RUN_MAIN" to tell the child process that it
should run the HTTP server. I think using another environment variable is
suitable to pass the FD also.
* I had not realized this. The socket is only created now, its still
configured the way it was before. This should make it less risky IMO.
* I don't believe it introduces any ''new'' risks. Specifically, a "half
write" could happen with the current implementation if its not handled
(I'm not familiar enough to know if that's the case). This would not
change that AFAIK. The child process would (during cleanup due to exit)
will close the client's connected socket just as if there wasn't a
persisted socket. As for reading a partial response, I believe we're safe
from this simply because the `.accept()` on this main server socket
returns a client socket which handles the connection. If it only reads
part of the response and the connection is interrupted due to reload, the
child exiting will close the socket and the pending data will be discarded
by the kernel. It will not remain pending on the server socket (this is
guaranteed by TCP I believe, I don't believe this would be the case with
something like UDP for instance).
I'll do some looking into writing some tests along with this commit.
It'll be significantly interesting/complicated to test I think because of
the inherent race conditions involved. Any advice or ideas are welcomed.
Thanks again for your review and acceptance of this. I'll update this
request when I've looked into writing the tests and/or have written some.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:4>
Comment (by bhuztez):
I think this is duplicate of #16049
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:5>
Comment (by dlamotte):
Yes. I didn't see that one before opening this one. However, at the
moment, this ticket has a corresponding patch/pull request. Its missing
tests, but I plan on fixing that. Also, I have been using the patch in my
own development with success and zero issues.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:6>
* needs_docs: 0 => 1
* needs_tests: 0 => 1
Comment:
This is very interesting. To move forwards it just needs:
- docs — including an explanation that it doesn't work on Windows with
Python 2 (see discussion on #16049)
- tests — or an explanation of why it's impossible to test
In the meantime, I've closed the pull request, because our triage options
on GitHub are limited and we don't keep pull requests with partial
patches.
Please re-open it or make a new PR when you get a chance to make these
improvements!
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:7>
Comment (by dlamotte):
Sorry this has been inactive so long. Life has been pretty busy. The
testing part is a bit challenging simply because it's very "race-
condition-y". I have used the patch in development since I initially
wrote it (and I write code everyday with Django) so I'm very certain it
works without regressions, however, I can understand adding tests to the
code base.
I'll get going on the docs.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:8>
Comment (by dlamotte):
Created new pull request: https://github.com/django/django/pull/893
Added documentation and brought the branch up to date with the latest
master branch.
I explain in the pull request why I think it'll be ok to not add tests
(given the difficulty). But if there are suggestions, I can surely look
into them and update the pull request.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:9>
Comment (by dlamotte):
I think this is slipping through the cracks. This is my first
contribution to Django, so I'm probably making this harder than it needs
to be. Sorry about that.
Just wanted to check in and see what else I can do to get this into 1.6
(hopefully).
Thanks much for your help and patience.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:10>
* cc: unai@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:11>
* needs_docs: 1 => 0
* needs_tests: 1 => 0
* stage: Accepted => Ready for checkin
Comment:
LGTM
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:12>
* cc: apollo13 (added)
Comment:
What's the reason for adding an option to disable that feature? I can't
imagine why someone would like to disable it if it works.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:13>
Comment (by dlamotte):
Could have sworn I added it because I was asked to, but cannot seem to
find that at the moment. I guess it's only there so if for some reason
its causing more problems than it solves for that user, they can disable
it.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:14>
* needs_better_patch: 0 => 1
* stage: Ready for checkin => Accepted
Comment:
I wonder if our time would be better spent in efforts like #21978 -- "Add
optional gunicorn support to runserver", rather than trying to enhance our
own HTTP server?
In any case, the patch no longer merges cleanly according to GitHub so
bumping off RFC.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:15>
* stage: Accepted => Ready for checkin
Comment:
I've just updated the pull request to rebase off of the latest master.
Please re-consider this patch.
I think the ticket you reference is great, however, this patch is ready
now and it addresses a common problem that many would appreciate being
fixed.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:16>
* stage: Ready for checkin => Accepted
Comment:
Did you actually check that this works fine on Python3.4?
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:17>
Comment (by unaizalakain):
Once addressed the notes added to the PR patch, it works like a charm with
3.4
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:18>
Comment (by timo):
As noted on the pull request, this causes `Address already in use` errors
locally when running `./runtests.py admin_widgets --selenium
--liveserver=localhost:8090-8100` in parallel (as well as on Jenkins). It
seems that the alternate ports are never tried.
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:19>
* owner: Dan LaMotte => (none)
* status: assigned => new
--
Ticket URL: <https://code.djangoproject.com/ticket/18855#comment:20>